From: Evandro Menezes
Subject: Re: [PATCH] aarch64: Add SVE instruction types
Date: Tue, 16 May 2023 15:12:37 -0500
To: Kyrylo Tkachov
Cc: Richard Sandiford, Evandro Menezes via Gcc-patches, evandro+gcc@gcc.gnu.org, Tamar Christina

Hi, Kyrill.

It makes sense. I could add the classification to a different attribute, as you did, and keep it in aarch64 as well.

I took the same approach, gleaning from several optimization guides for Arm processors supporting SVE and figuring out the smallest number of types that could cover most variations of resources used. Methinks that the classification in this patch is close to that goal, but feedback is appreciated.

I did observe a meaningful gain in performance. Of course, wide machines like the V1 can handle most instruction sequences thrown at them, but there’s still some efficiency left on the table without tailored scheduling, especially when recovering from cache or branch misses, when it’s important to quickly fill the pipeline back up to steady state, even though umpteen transistors are dedicated to making sure that misses do not happen often.

Thank you,

-- 
Evandro Menezes

> On May 16, 2023, at 03:36, Kyrylo Tkachov wrote:
>
> Hi Evandro,
>
> I created a new attribute so I didn’t have to extend the “type” attribute that lives in config/arm/types.md. As that attribute and file live in the arm backend but SVE is AArch64-only, I didn’t want to add logic to the arm backend as it’s not truly shared.
> The granularity has been somewhat subjective. I had looked at the Software Optimisation guides for various SVE and SVE2-capable cores from Arm on developer.arm.com and tried to glean commonalities between different instruction groups.
> I did try writing a model for Neoverse V1 using that classification but I couldn’t spend much time on it and the resulting model didn’t give me much improvement and gave some regressions instead.
> I think that was more down to my rushed model rather than anything else though.
>
> Thanks,
> Kyrill
>
> From: Evandro Menezes
> Sent: Monday, May 15, 2023 9:13 PM
> To: Kyrylo Tkachov
> Cc: Richard Sandiford; Evandro Menezes via Gcc-patches; evandro+gcc@gcc.gnu.org; Tamar Christina
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
>
> Hi, Kyrill.
>
> I wasn’t aware of your previous patch. Could you clarify why you considered creating an SVE-specific type attribute instead of reusing the common one? I really liked the iterators that you created; I’d like to use them.
>
> Do you have specific examples which you might want to mention with regard to granularity?
>
> Yes, my intent for this patch is to enable modeling the SVE instructions on N1. The patch that implements it brings some performance improvements, but it’s mostly flat, as expected.
>
> Thank you,
>
> -- 
> Evandro Menezes
>
> On May 15, 2023, at 04:49, Kyrylo Tkachov wrote:
>
> -----Original Message-----
> From: Richard Sandiford
> Sent: Monday, May 15, 2023 10:01 AM
> To: Evandro Menezes via Gcc-patches
> Cc: evandro+gcc@gcc.gnu.org; Evandro Menezes; Kyrylo Tkachov; Tamar Christina
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
>
> Evandro Menezes via Gcc-patches writes:
>
> This patch adds the attribute `type` to most SVE1 instructions, as in the other instructions.
>
> Thanks for doing this.
>
> Could you say what criteria you used for picking the granularity? Other
> maintainers might disagree, but personally I'd prefer to distinguish two
> instructions only if:
>
> (a) a scheduling description really needs to distinguish them or
> (b) grouping them together would be very artificial (because they're
>     logically unrelated)
>
> It's always possible to split types later if new scheduling descriptions
> require it. Because of that, I don't think we should try to predict ahead
> of time what future scheduling descriptions will need.
>
> Of course, this depends on having results that show that scheduling
> makes a significant difference on an SVE core. I think one of the
> problems here is that, when a different scheduling model changes the
> performance of a particular test, it's difficult to tell whether
> the gain/loss is caused by the model being more/less accurate than
> the previous one, or if it's due to important "secondary" effects
> on register live ranges. Instinctively, I'd have expected these
> secondary effects to dominate on OoO cores.
>
> I agree with Richard on these points. The key here is getting the granularity right without having to maintain too many types that aren't useful in the models.
> FWIW I had posted https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607101.html in November. It adds annotations to SVE2 patterns as well as for base SVE.
> Feel free to reuse it if you'd like.
> I see you had posted a Neoverse V1 scheduling model. Does that give an improvement on SVE code when combined with the scheduling attributes somehow?
> Thanks,
> Kyrill