From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 056773857C69 for ; Tue, 11 May 2021 07:10:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 056773857C69 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 14B75EVc134527; Tue, 11 May 2021 03:10:40 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 38fm1t9yqq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 11 May 2021 03:10:39 -0400 Received: from m0098414.ppops.net (m0098414.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 14B75Chw134368; Tue, 11 May 2021 03:10:39 -0400 Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0b-001b2d01.pphosted.com with ESMTP id 38fm1t9yps-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 11 May 2021 03:10:39 -0400 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.0.43/8.16.0.43) with SMTP id 14B79GMC032323; Tue, 11 May 2021 07:10:37 GMT Received: from b06avi18626390.portsmouth.uk.ibm.com (b06avi18626390.portsmouth.uk.ibm.com [9.149.26.192]) by ppma05fra.de.ibm.com with ESMTP id 38dj98gtcg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 11 May 2021 07:10:37 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06avi18626390.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 14B7A8IZ36241798 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 11 May 2021 07:10:08 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9AE59A4055; Tue, 11 May 2021 07:10:34 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B610EA4040; Tue, 11 May 2021 07:10:32 +0000 (GMT) Received: from KewenLins-MacBook-Pro.local (unknown [9.197.242.40]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 11 May 2021 07:10:32 +0000 (GMT) Subject: Re: [PATCH 1/2] vect: Add costing_for_scalar parameter to init_cost hook To: Richard Biener Cc: GCC Patches , Bill Schmidt , David Edelsohn , Segher Boessenkool References: <990b2492-9198-b713-c79f-68563d488ba0@linux.ibm.com> <5169a943-4a45-040a-60c6-91af7718dc6f@linux.ibm.com> From: "Kewen.Lin" Message-ID: <16a85dfa-7d47-4919-84a2-fb75a68c1dfb@linux.ibm.com> Date: Tue, 11 May 2021 15:10:30 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 1auARoLz-B1hrhec6848Bxald5f6-ugk X-Proofpoint-GUID: dWE7GMu47z_Z1jmvD5X3OYNILgm9Bmn4 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.761 definitions=2021-05-11_02:2021-05-10, 2021-05-11 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 phishscore=0 mlxscore=0 suspectscore=0 clxscore=1015 lowpriorityscore=0 mlxlogscore=999 spamscore=0 impostorscore=0 adultscore=0 bulkscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2105110053 X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 May 2021 07:10:42 -0000 Hi Richi, on 2021/5/10 下午9:55, Richard Biener wrote: > On Sat, May 8, 2021 at 10:05 AM Kewen.Lin wrote: >> >> Hi Richi, >> >> Thanks for the comments! >> >> on 2021/5/7 下午5:43, Richard Biener wrote: >>> On Fri, May 7, 2021 at 5:30 AM Kewen.Lin via Gcc-patches >>> wrote: >>>> >>>> Hi, >>>> >>>> When I was investigating density_test heuristics, I noticed that >>>> the current rs6000_density_test could be used for single scalar >>>> iteration cost calculation, through the call trace: >>>> vect_compute_single_scalar_iteration_cost >>>> -> rs6000_finish_cost >>>> -> rs6000_density_test >>>> >>>> It looks unexpected as its desriptive function comments and Bill >>>> helped to confirm this needs to be fixed (thanks!). >>>> >>>> So this patch is to check the passed data, if it's the same as >>>> the one in loop_vinfo, it indicates it's working on vector version >>>> cost calculation, otherwise just early return. >>>> >>>> Bootstrapped/regtested on powerpc64le-linux-gnu P9. >>>> >>>> Nothing remarkable was observed with SPEC2017 Power9 full run. >>>> >>>> Is it ok for trunk? >>> >>> + /* Only care about cost of vector version, so exclude scalar >>> version here. */ >>> + if (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) != (void *) data) >>> + return; >>> >>> Hmm, looks like a quite "random" test to me. What about adding a >>> parameter to finish_cost () (or init_cost?) indicating the cost kind? >>> >> >> I originally wanted to change the hook interface, but noticed that >> the finish_cost in function vect_estimate_min_profitable_iters is >> the only invocation with LOOP_VINFO_TARGET_COST_DATA (loop_vinfo), >> it looks enough to differentiate the scalar version costing or >> vector version costing for loop. Do you mean this observation/ >> assumption easy to be broken sometime later? > > Yes, this field is likely to become stale. > >> >> The attached patch to add one new parameter to indicate the >> costing kind explicitly as you suggested. >> >> Does it look better? >> >> gcc/ChangeLog: >> >> * doc/tm.texi: Regenerated. >> * target.def (init_cost): Add new parameter costing_for_scalar. >> * targhooks.c (default_init_cost): Adjust for new parameter. >> * targhooks.h (default_init_cost): Likewise. >> * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise. >> (vect_compute_single_scalar_iteration_cost): Likewise. >> (vect_analyze_loop_2): Likewise. >> * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise. >> (vect_bb_vectorization_profitable_p): Likewise. >> * tree-vectorizer.h (init_cost): Likewise. >> * config/aarch64/aarch64.c (aarch64_init_cost): Likewise. >> * config/i386/i386.c (ix86_init_cost): Likewise. >> * config/rs6000/rs6000.c (rs6000_init_cost): Likewise. >> >>> OTOH we already pass scalar_stmt to individual add_stmt_cost, >>> so not sure whether the context really matters. That said, >>> the density test looks "interesting" ... the intent was that finish_cost >>> might look at gathered data from add_stmt, not that it looks at >>> the GIMPLE IL ... so why are you not counting vector_stmt vs. >>> scalar_stmt entries in vect_body and using that for this metric? >>> >> >> Good to know the intention behind finish_cost, thanks! >> >> I'm afraid that the check on vector_stmt and scalar_stmt entries >> from add_stmt_cost doesn't work for the density test here. The >> density test focuses on the vector version itself, there are some >> stmts whose relevants are marked as vect_unused_in_scope, IIUC >> they won't be passed down when costing for both versions. But the >> existing density check would like to know the cost for the >> non-vectorized part. The current implementation does: >> >> vec_cost = data->cost[vect_body] >> >> if (!STMT_VINFO_RELEVANT_P (stmt_info) >> && !STMT_VINFO_IN_PATTERN_P (stmt_info)) >> not_vec_cost++ >> >> density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost); >> >> it takes those unrelevant stmts into account, and then has >> both costs from the non-vectorized part (not_vec_cost) >> and vectorized part (cost[vect_body]), it can calculate the >> vectorization code density ratio. > > Yes, but then what "relevant" stmts are actually needed and what > not is missed by your heuristics. It's really some GIGO one > I fear - each vectorized data reference will add a pointer IV > (eventually commoned by IVOPTs later) and pointer value updates > that are not accounted for in costing (the IVs and updates in the > scalar code are marked as not relevant). Are those the stmts > this heuristic wants to look at? Yes, the IVs and updates (even the comparison for exit) are what the heuristics tries to count. In most cases, the non vectorized part in the loop are IV updates. And it's so true that the collected not_vec_cost could be not accurate, but it seems hard to predict the cost exactly here? Assuming this not_vect_cost cost is over priced, it could result in a lower density ratio than what it should be. Also assuming the density threshold is relatively conservative, in this case if the ratio still exceeds the density threshold, we can say the loop is really dense. It could miss to catch some "dense" loops, but I hope it won't take "non-dense" loops as "dense" unexpectedly. > > The patch looks OK btw. > Thanks for the review! The patch was bootstrapped/reg-tested on powerpc64le-linux-gnu P9, x86_64-redhat-linux and aarch64-linux-gnu. Committed in r12-704. BR, Kewen