From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 77FDD3858294; Wed,  1 May 2024 13:21:41 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 77FDD3858294
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1714569701;
	bh=+RascdX5HnCx/VQ4Pw/+tIfgGD7SWikjwroohXhG/fs=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=CZcosFWRd0Oh2k1AdZtXVWdHv9Gtc8TxE4RJzTmvlbdruKuZfX7mWwxdmB5z8zUhQ
	 i5wDlYfW5tfYwrB3uxwn0gA2VMmiJ7Vl8w/P+BYkjcRdBkPB/wTAzHHAGM/34G0357
	 mp328XjI95Ur6r/AeI/Xq3YDHMQVpLLkR4XwIEe4=
From: "mjr19 at cam dot ac.uk" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114324] [13/14/15 Regression] AVX2
 vectorisation performance regression with gfortran 13/14
Date: Wed, 01 May 2024 13:21:38 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.1.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: mjr19 at cam dot ac.uk
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.3
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-114324-4-qhmCr5FvMO@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114324-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114324-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324

--- Comment #5 from mjr19 at cam dot ac.uk ---
Note that bug 114767 also turns out to be a case in which the inability to
alternate neg and nop along a vector leads to poor performance with some
operations on the complex type. That request for an optimisation improvement
also notes that the ability to alternate add and nop could be beneficial.

Ifort can alternate neg and nop, at least in the simple case of

  complex(kind(1d0)) :: c(*)
  do i=1,n
     c(i)=conjg(c(i))
  enddo

Helped by aggressive default unrolling, it ends up being almost four times
faster than gfortran-14 on the machine I tested it on. On asking gfortran-14 to
unroll, the difference is reduced to about a factor of two.
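
To make "alternate neg and nop" concrete: with complex values stored as
interleaved (re, im) doubles, conjugation amounts to flipping the sign bit of
every second element and leaving the others untouched. Below is a minimal
sketch of that idea in C, assuming IEEE 754 binary64 doubles. The function name
conj_inplace_mask is hypothetical, and this is only an illustration of the lane
pattern, not the code that either compiler actually emits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: conjugate n complex doubles stored as interleaved
   (re, im) pairs by XOR-ing each element with a per-lane mask.
   Assumes IEEE 754 binary64, where bit 63 is the sign bit. */
void conj_inplace_mask(double *v, size_t n)
{
    /* Even lanes (real parts) get 0, i.e. a nop;
       odd lanes (imaginary parts) get the sign bit, i.e. a neg. */
    static const uint64_t mask[2] = { 0, UINT64_C(0x8000000000000000) };

    for (size_t i = 0; i < 2 * n; i++) {
        uint64_t bits;
        memcpy(&bits, &v[i], sizeof bits);  /* read the bit pattern safely */
        bits ^= mask[i & 1];                /* nop on re, neg on im */
        memcpy(&v[i], &bits, sizeof bits);
    }
}
```

Written this way the whole operation is one XOR against a repeating constant
{0, signbit, 0, signbit}, which in principle maps onto a single vector XOR per
AVX2 register. Whether a given compiler recognises that form is a separate
question that this sketch does not answer.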