From: "rguenther at suse dot de"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.
Date: Wed, 07 Feb 2024 10:24:21 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: rguenther at suse dot de
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583

--- Comment #17 from rguenther at suse dot de ---
On Wed, 7 Feb 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
>
> --- Comment #16 from JuzheZhong ---
> The FMA is generated in widening_mul PASS:
>
> Before widening_mul (fab1):
>
> _5 = 3.33333333333333314829616256247390992939472198486328125e-1 - _4;
> _6 = _5 * 1.229999999999999982236431605997495353221893310546875e-1;
> _8 = _4 + _6;

So
this is x + (CST1 - x) * CST2, which we might fold/associate to
x * (1. - CST2) + CST1 * CST2.

This looks like something for reassociation. It knows some rules, like
what it does in undistribute_ops_list, though I'm not sure whether that
comes into play here already, since this would be doing the reverse
first. A match.pd pattern would also work, but it wouldn't be general
enough to handle more complicated but similar cases.
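
To make the identity concrete, here is a minimal C sketch of the two forms.
It is not GCC code: the function names and the forms_agree helper are
hypothetical, and the constants are just the two values from the GIMPLE dump
quoted above. The expansion is x + CST1 * CST2 - x * CST2, which groups to
x * (1. - CST2) + CST1 * CST2.

```c
/* Constants as they appear in the quoted GIMPLE dump. */
static const double CST1 = 1.0 / 3.0; /* 3.33333333333333314829...e-1 */
static const double CST2 = 0.123;     /* 1.22999999999999998223...e-1 */

/* Original form: x + (CST1 - x) * CST2
   (one subtract, one multiply, one add on x). */
double original_form(double x)
{
    return x + (CST1 - x) * CST2;
}

/* Reassociated form: x * (1 - CST2) + CST1 * CST2.
   Both (1 - CST2) and CST1 * CST2 depend only on constants, so what
   remains on x is a single multiply-add. */
double reassociated_form(double x)
{
    const double scale = 1.0 - CST2;
    const double offset = CST1 * CST2;
    return x * scale + offset;
}

/* Hypothetical helper: returns 1 if both forms agree within tol for x. */
int forms_agree(double x, double tol)
{
    double d = original_form(x) - reassociated_form(x);
    if (d < 0)
        d = -d;
    return d <= tol;
}
```

The two forms are equal in real arithmetic but can round differently in
floating point, which is why the helper compares with a tolerance. For the
same reason a compiler can only do this rewrite when floating-point
reassociation is permitted, e.g. with -fassociative-math (which -ffast-math
enables).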