From: Richard Sandiford
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com
Subject: Re: [PATCH]AArch64: only
 discount MLA for vector and scalar statements
Date: Thu, 16 Nov 2023 10:34:20 +0000
In-Reply-To: (Tamar Christina's message of "Wed, 15 Nov 2023 17:02:10 +0000")

Tamar Christina writes:
> Hi All,
>
> In testcases gcc.dg/tree-ssa/slsr-19.c and gcc.dg/tree-ssa/slsr-20.c we have a
> fairly simple computation.  On the current generic costing we generate:
>
> f:
>         add     w0, w0, 2
>         madd    w1, w0, w1, w1
>         lsl     w0, w1, 1
>         ret
>
> but on any other cost model but generic (including the new up coming generic)
> we generate:
>
> f:
>         adrp    x2, .LC0
>         dup     v31.2s, w0
>         fmov    s30, w1
>         ldr     d29, [x2, #:lo12:.LC0]
>         add     v31.2s, v31.2s, v29.2s
>         mul     v31.2s, v31.2s, v30.s[0]
>         addp    v31.2s, v31.2s, v31.2s
>         fmov    w0, s31
>         ret
> .LC0:
>         .word   2
>         .word   4
>
> This seems to be because the vectorizer thinks the vector transfers are free:
>
> x1_4 + x2_6 1 times vector_stmt costs 0 in body
> x1_4 + x2_6 1 times vec_to_scalar costs 0 in body
>
> This happens because the stmt it's using to get the cost of register transfers
> for the given type happens to be one feeding into a MUL.  we incorrectly
> discount the + for the register transfer.
>
> This is fixed by guarding the check for aarch64_multiply_add_p with a kind
> check and only do it for scalar_stmt and vector_stmt.
>
> I'm sending this separate to my patch series but it's required for it.
> It also seems to fix overvectorization cases in fotonik3d_r in SPECCPU 2017.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64.cc (aarch64_adjust_stmt_cost): Guard mla.
> 	(aarch64_vector_costs::count_ops): Likewise.
>
> --- inline copy of patch --
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 06ec22057e10fd591710aa4c795a78f34eeaa8e5..0f05877ead3dca6477ebc70f53c632e4eb48d439 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -14587,7 +14587,7 @@ aarch64_adjust_stmt_cost (vec_info *vinfo, vect_cost_for_stmt kind,
>      }
>
>    gassign *assign = dyn_cast<gassign *> (STMT_VINFO_STMT (stmt_info));
> -  if (assign)
> +  if ((kind == scalar_stmt || kind == vector_stmt) && assign)
>      {
>        /* For MLA we need to reduce the cost since MLA is 1 instruction.  */
>        if (!vect_is_reduction (stmt_info)

This properly protects both the MLA and aarch64_bool_compound_p tests (good!),
so...

> @@ -14669,7 +14669,9 @@ aarch64_vector_costs::count_ops (unsigned int count, vect_cost_for_stmt kind,
>      }
>
>    /* Assume that multiply-adds will become a single operation.  */
> -  if (stmt_info && aarch64_multiply_add_p (m_vinfo, stmt_info, m_vec_flags))
> +  if (stmt_info
> +      && (kind == scalar_stmt || kind == vector_stmt)
> +      && aarch64_multiply_add_p (m_vinfo, stmt_info, m_vec_flags))
>      return;
>
>    /* Assume that bool AND with compare operands will become a single

...I think we should do the same here, for the code that begins with the
comment line above.  It's probably worth sharing the:

  if (stmt_info && (kind == scalar_stmt || kind == vector_stmt))

condition to help avoid the same situation in future.

OK with that change, thanks.

Richard
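
For illustration only, here is a minimal standalone sketch of what sharing
that condition could look like.  It is a simplified model, not the code in
aarch64.cc: stmt_model, its two flags and ops_to_count are hypothetical
names, the enum lists only a few cost kinds, and the two flags merely stand
in for the results of aarch64_multiply_add_p and aarch64_bool_compound_p.

```cpp
#include <cassert>

// Simplified stand-in for GCC's vect_cost_for_stmt; only a few kinds are
// modelled here (assumption: the real enum has more entries).
enum vect_cost_for_stmt
{
  scalar_stmt,
  vector_stmt,
  vec_to_scalar,
  scalar_to_vec
};

// Hypothetical summary of a statement; not GCC's stmt_vec_info.
struct stmt_model
{
  bool is_multiply_add;   // stands in for aarch64_multiply_add_p
  bool is_bool_compound;  // stands in for aarch64_bool_compound_p
};

// Return how many operations to count for COUNT copies of STMT when it is
// costed as KIND.  The fused-operation discounts sit behind one shared
// guard, so they apply only to scalar_stmt and vector_stmt and never to
// transfer kinds such as vec_to_scalar.
unsigned int
ops_to_count (unsigned int count, vect_cost_for_stmt kind,
	      const stmt_model *stmt)
{
  if (stmt && (kind == scalar_stmt || kind == vector_stmt))
    {
      // Assume a multiply-add becomes a single operation.
      if (stmt->is_multiply_add)
	return 0;

      // Assume a bool AND with compare operands becomes a single operation.
      if (stmt->is_bool_compound)
	return 0;
    }

  return count;
}
```

With this shape, an add that feeds a multiply is discounted when costed as
vector_stmt but is still counted when the same statement is used to cost a
vec_to_scalar transfer, and any discount added later inherits the same guard.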