Message-ID:
<6fb1020f-4481-47a2-913e-fbdfbeaa1832@linux.ibm.com>
Date: Fri, 10 Nov 2023 20:01:33 +0530
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] tree-ssa-loop-ivopts : Add live analysis in regs used in decision making
Content-Language: en-US
To: Richard Biener
Cc: Jeff Law, Peter Bergner, gcc-patches
References: <908bdc21-ea98-436e-9566-01e4d8da9132@linux.ibm.com> <45fa6563-42a9-4442-8b36-f243417459c6@linux.ibm.com>
From: Ajit Agarwal
In-Reply-To:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hello Richard:

On 10/11/23 7:29 pm, Richard Biener wrote:
> On Fri, Nov 10, 2023 at 7:42 AM Ajit Agarwal wrote:
>>
>> Hello Richard:
>>
>>
>> On 09/11/23 6:21 pm, Richard Biener wrote:
>>> On Wed, Nov 8, 2023 at 4:00 PM Ajit Agarwal wrote:
>>>>
>>>> tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.
>>>>
>>>> Add live anaysis in regs used calculation in decision making of
>>>> selecting ivopts candidates.
>>>>
>>>> 2023-11-08 Ajit Kumar Agarwal
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>         * tree-ssa-loop-ivopts.cc (get_regs_used): New function.
>>>>         (determine_set_costs): Call to get_regs_used to use live
>>>>         analysis.
>>>> ---
>>>>  gcc/tree-ssa-loop-ivopts.cc | 73 +++++++++++++++++++++++++++++++++++--
>>>>  1 file changed, 70 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
>>>> index c3336603778..e02fe7d434b 100644
>>>> --- a/gcc/tree-ssa-loop-ivopts.cc
>>>> +++ b/gcc/tree-ssa-loop-ivopts.cc
>>>> @@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
>>>>    return cost + n_cands;
>>>>  }
>>>>
>>>> +/* Return regs used based on live-in and liveout of given ssa variables. */
>>>
>>> Please explain how the following code relates to anything like "live
>>> analysis" and
>>> where it uses live-in and live-out. And what "live-in/out of a given
>>> SSA variable"
>>> should be.
>>>
>>> Also explain why you are doing this at all. The patch doesn't come
>>> with a testcase
>>> or with any other hint that motivated you.
>>>
>>> Richard.
>>>
>>
>> The function get_regs_used increments the regs_used based on live-in
>> and live-out analysis of given ssa name. Instead of setting live-in and
>> live-out bitmap I increment the regs_used.
>>
>> Below is how I identify live-in and live-out and increments the regs_used
>> variable:
>>
>> a) For a given def_bb of gimple statement of ssa name there should be
>> live-out and increments the regs_used.
>>
>> b) Visit each use of SSA_NAME and if it isn't in the same block as the def,
>> we identify live on entry blocks and increments regs_used.
>>
>> The below function is the modification of set_var_live_on_entry of tree-ssa-live.cc
>> Where we set the bitmap of liveout and livein of basic block.
>> Instead of setting bitmap, regs_used is incremented.
>
> It clearly doesn't work that way, and the number doesn't in any way relate to
> the number of registers used or register pressure.
>

I agree with you that regs_used in the patch is not the actual number of
registers used; it is calculated from live-in and live-out. The decision
making above uses the variable regs_used, but that variable does not hold
the number of registers used or the register pressure. My decision making
is based on live-in and live-out instead of actual registers used. I tried
to keep the variable names the same as those used in
ivopts_estimate_reg_pressure.

My logic changes what ivopts_estimate_reg_pressure works from: it considers
live-in and live-out instead of actual registers used. The idea is to use
live-in and live-out over the affected regions to judge whether doing
ivopts increases or decreases register pressure.

In my view, register pressure should be calculated from live-in and
live-out across the region affected by ivopts, instead of from the number
of iv candidates. That is my notion of register pressure.

I can change the code to give the variables meaningful names that match the
decision making described above.

>> I identify regs_used as the number of live-in and liveout of given ssa name variable.
>>
>> For each iv candiate ssa variables I identify regs_used and take maximum of regs
>> used for all the iv candidates that will be used in ivopts_estimate_register_pressure
>> cost analysis.
>>
>> Motivation behind doing this opttks for FP and INT around 2% to 7%.
>
> An interesting GIGO effect.

Why do you think it is a GIGO effect? The gains come from the decision
making on register pressure described above. Please elaborate if you think
otherwise.
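To make the counting concrete, here is a minimal standalone sketch of what
the quoted get_regs_used adds up for one SSA name. It is not GCC code: the
Use struct, NO_BLOCK and count_def_and_remote_uses are hypothetical names
made up for illustration, basic blocks are modelled as plain integers, and
the early return for virtual or undefined names is left out.

```cpp
#include <vector>

// Hypothetical marker meaning "the definition is not inside any basic block".
const int NO_BLOCK = -1;

// Hypothetical stand-in for one immediate use of an SSA name.
struct Use {
  int block;      // block charged with the use; for a PHI argument this is
                  // the source block of the incoming edge (e->src)
  bool is_debug;  // debug statements are skipped, as in the patch
};

// Mirrors the counting in the quoted patch: one for a definition that sits
// in a basic block, plus one for every non-debug use whose block differs
// from the defining block.
unsigned count_def_and_remote_uses(int def_block, const std::vector<Use>& uses) {
  unsigned count = 0;
  if (def_block != NO_BLOCK)
    ++count;
  for (const Use& use : uses) {
    if (use.is_debug)
      continue;
    if (use.block != def_block)
      ++count;
  }
  return count;
}
```

For a name defined in block 2 with non-debug uses in blocks 2, 3, 3 and 5,
the sketch returns 4: one for the definition and one for each of the three
uses outside block 2. The count grows with the number of such uses of a
single name.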
Thanks & Regards
Ajit
>
>> Also setting regs_used as number of iv candiates, which is not
>> optimized and robust way of decision making for ivopts optimization I decide
>> on live-in and live-out analysis which is more correct and appropriate way of
>> identifying regs_used.
>>
>> And also there are no regressions in bootstrapped/regtested on powerpc64-linux-gnu.
>>
>> Thanks & Regards
>> Ajit
>>
>>>> +static unsigned
>>>> +get_regs_used (tree ssa_name)
>>>> +{
>>>> +  unsigned regs_used = 0;
>>>> +  gimple *stmt;
>>>> +  use_operand_p use;
>>>> +  basic_block def_bb = NULL;
>>>> +  imm_use_iterator imm_iter;
>>>> +
>>>> +  stmt = SSA_NAME_DEF_STMT (ssa_name);
>>>> +  if (stmt)
>>>> +    {
>>>> +      def_bb = gimple_bb (stmt);
>>>> +      /* Mark defs in liveout bitmap temporarily. */
>>>> +      if (def_bb)
>>>> +        regs_used++;
>>>> +    }
>>>> +  else
>>>> +    def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
>>>> +
>>>> +  /* An undefined local variable does not need to be very alive. */
>>>> +  if (virtual_operand_p (ssa_name)
>>>> +      || ssa_undefined_value_p (ssa_name, false))
>>>> +    return 0;
>>>> +
>>>> +  /* Visit each use of SSA_NAME and if it isn't in the same block as the def,
>>>> +     add it to the list of live on entry blocks. */
>>>> +  FOR_EACH_IMM_USE_FAST (use, imm_iter, ssa_name)
>>>> +    {
>>>> +      gimple *use_stmt = USE_STMT (use);
>>>> +      basic_block add_block = NULL;
>>>> +
>>>> +      if (gimple_code (use_stmt) == GIMPLE_PHI)
>>>> +        {
>>>> +          /* Uses in PHI's are considered to be live at exit of the SRC block
>>>> +             as this is where a copy would be inserted. Check to see if it is
>>>> +             defined in that block, or whether its live on entry. */
>>>> +          int index = PHI_ARG_INDEX_FROM_USE (use);
>>>> +          edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt), index);
>>>> +          if (e->src != def_bb)
>>>> +            add_block = e->src;
>>>> +        }
>>>> +      else if (is_gimple_debug (use_stmt))
>>>> +        continue;
>>>> +      else
>>>> +        {
>>>> +          /* If its not defined in this block, its live on entry. */
>>>> +          basic_block use_bb = gimple_bb (use_stmt);
>>>> +          if (use_bb != def_bb)
>>>> +            add_block = use_bb;
>>>> +        }
>>>> +
>>>> +      /* If there was a live on entry use, increment register used. */
>>>> +      if (add_block)
>>>> +        {
>>>> +          regs_used++;
>>>> +        }
>>>> +    }
>>>> +  return regs_used;
>>>> +}
>>>> +
>>>>  /* For each size of the induction variable set determine the penalty. */
>>>>
>>>>  static void
>>>> @@ -6200,15 +6262,20 @@ determine_set_costs (struct ivopts_data *data)
>>>>        n++;
>>>>      }
>>>>
>>>> +  unsigned max = 0;
>>>>    EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, j, bi)
>>>>      {
>>>>        struct version_info *info = ver_info (data, j);
>>>> -
>>>>        if (info->inv_id && info->has_nonlin_use)
>>>> -        n++;
>>>> +        {
>>>> +          tree ssa_name = ssa_name (j);
>>>> +          n = get_regs_used (ssa_name);
>>>> +          if (n >= max)
>>>> +            max = n;
>>>> +        }
>>>>      }
>>>>
>>>> -  data->regs_used = n;
>>>> +  data->regs_used = max;
>>>>    if (dump_file && (dump_flags & TDF_DETAILS))
>>>>      fprintf (dump_file, " regs_used %d\n", n);
>>>>
>>>> --
>>>> 2.39.3
>>>>
>>>>
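The selection step in the quoted determine_set_costs hunk can be sketched
as standalone C++. This is only an illustration under assumptions:
SetCostData, record_max_count and dump_line are hypothetical names, and
per_name_counts stands for the per-name counts that the loop obtains one at
a time from get_regs_used.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the single field of ivopts_data used here.
struct SetCostData {
  unsigned regs_used = 0;
};

// Keep the largest per-name count, as the quoted loop does with `max`.
// `per_name_counts` is an assumed input: one count for each relevant name
// that has an invariant id and a nonlinear use.
void record_max_count(SetCostData& data,
                      const std::vector<unsigned>& per_name_counts) {
  unsigned max = 0;
  for (unsigned n : per_name_counts)
    max = std::max(max, n);
  data.regs_used = max;
}

// Build the dump line from the stored value, so the dump always shows the
// same number that is handed to the cost model.
std::string dump_line(const SetCostData& data) {
  return "  regs_used " + std::to_string(data.regs_used) + "\n";
}
```

As far as I can tell, the fprintf in the quoted hunk still prints n, which
after the loop need not be equal to max. Printing data->regs_used, as the
sketch does, would keep the dump in step with the value actually stored.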