From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 6141D3858292; Fri,  4 Nov 2022 17:26:27 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6141D3858292
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1667582787;
	bh=iHG02W+oAWxnT3xN5nMORmPqZK1xHk6EMrEoPp7FQZM=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=B1kaI9R1wLarMPPj1jAi2RWNj+waYBESJwqTMIStoqWF0piiAISrhw2ref6llJiLV
	 TL0yJC5Rp6F4cEPznXK2vFvq4c16QvZl7i+5z+AVNQ34pqd9kMZuyP+X3dONeGwai8
	 w1rnqjyc6GU6k+HuTvjYKkhNYz7ypUu7yR6lo93c=
From: "wilco at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107413] Perf loss ~14% on 519.lbm_r SPEC
 cpu2017 benchmark with r8-7132-gb5b33e113434be
Date: Fri, 04 Nov 2022 17:26:26 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: wilco at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: wilco at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: everconfirmed cf_reconfirmed_on bug_status
 assigned_to
Message-ID: <bug-107413-4-AKBu8bjAS4@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-107413-4@http.gcc.gnu.org/bugzilla/>
References: <bug-107413-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-11-04
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |wilco at gcc dot gnu.org
--- Comment #10 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Rama Malladi from comment #9)
> (In reply to Rama Malladi from comment #8)
> > (In reply to Wilco from comment #7)
> > > The revert results in about 0.5% loss on Neoverse N1, so it looks like the
> > > reassociation pass is still splitting FMAs into separate MUL and ADD (which
> > > is bad for narrow cores).
> >
> > Thank you for checking on N1. Did you happen to check on V1 too to reproduce
> > the perf results I had? Any other experiments/ tests I can do to help on
> > this filing? Thanks again for the debug/ fix.
>
> I ran SPEC cpu2017 fprate 1-copy benchmark built with the patch reverted and
> using option 'neoverse-n1' on the Graviton 3 processor (which has support
> for SVE). The performance was up by 0.4%, primary contributor being
> 519.lbm_r which was up 13%.

I'm seeing about 1.5% gain on Neoverse V1 and 0.5% loss on Neoverse N1. I'll
post a patch that allows per-CPU settings for FMA reassociation, so you'll get
good performance with -mcpu=native. However, reassociation really needs to be
taught about the existence of FMAs.
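
To make the MUL/ADD splitting concrete, here is a minimal sketch in C. It is
not taken from lbm or from GCC; the function names dot4_chain and dot4_split
are hypothetical, and the second function only imitates by hand the shape
that a reassociation width greater than 1 could produce. It assumes the
compiler contracts each "a * b + c" into a single FMA (as GCC does by default
with -ffp-contract=fast on targets that have FMA).

```c
/* Serial form: each multiply can be fused into the following add,
   giving 4 FMAs on one dependency chain of depth 4. */
double dot4_chain(const double *x, const double *y, double acc)
{
    acc = acc + x[0] * y[0];   /* FMA */
    acc = acc + x[1] * y[1];   /* FMA */
    acc = acc + x[2] * y[2];   /* FMA */
    acc = acc + x[3] * y[3];   /* FMA */
    return acc;
}

/* Hand-reassociated form with two independent partial sums.
   The first product has nothing to fuse with, and the final
   combine is a plain add: 1 MUL + 3 FMA + 1 ADD (5 operations
   instead of 4), but the longest dependency chain drops to 3. */
double dot4_split(const double *x, const double *y, double acc)
{
    double p = x[0] * y[0];    /* standalone MUL */
    p = p + x[1] * y[1];       /* FMA */
    double q = acc + x[2] * y[2];  /* FMA, independent of p */
    q = q + x[3] * y[3];       /* FMA */
    return p + q;              /* standalone ADD */
}
```

Both functions compute the same sum when no rounding occurs (for example with
small integer inputs); in general the results can differ in the last bits,
which is why this kind of rewrite is only done under -ffast-math or
-fassociative-math. One plausible reading of the numbers above is that the
extra operations in the split form cost more on a narrow core such as
Neoverse N1, while the shorter chain can pay off on a wider core such as
Neoverse V1.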