From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 770D23858C35; Tue, 14 May 2024 15:30:30 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 770D23858C35
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1715700630;
	bh=q0nAN1vznP+wQEWxKMrU5QZWYjJ1OubQ5PXCd2f/5T0=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=cC+6k2X3pdq2TV9vS/3QC5ut91/sQZeT4EVXdiULtlnB3brFBzx/VQPlyIYQ4kYB2
	 x+cynSUxAalJagMn8wVB8pl52vgO7/moOJmzo6g+4upXd1s43Mfej15kjTow0q5Wb0
	 W65WBB8T1kjNrR2H2zN3pnsiFEzuNBimKUupFsC8=
From: "mjr19 at cam dot ac.uk" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114767] gfortran AVX2 complex multiplication
 by (0d0,1d0) suboptimal
Date: Tue, 14 May 2024 15:30:30 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: mjr19 at cam dot ac.uk
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-114767-4-R8t6gid3pq@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114767-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114767-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767

--- Comment #7 from mjr19 at cam dot ac.uk ---
Another manifestation of this issue in GCC 13.1 and 14.1 is that the loop

  do i=1,n
     c(i)=a(i)*c(i)*(0d0,1d0)
  enddo

takes about twice as long to run as

  do i=1,n
     c(i)=a(i)*(0d0,1d0)*c(i)
  enddo

when compiled with -Ofast -mavx2. In the second case the compiler manages to
merge its unnecessary desire to form separate vectors of real and imaginary
components (to perform the sign flips on multiplying by i) with its much more
reasonable desire to form such vectors for the general complex-complex
multiplication.
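
A minimal sketch of how the two loops might be timed side by side. It is not
taken from the bug report: the program name, the array length n, the repeat
count nrep and the initial values are all assumptions chosen for illustration,
and a and c are assumed to be double precision complex arrays.

```fortran
! Illustrative timing harness, not taken from the bug report.
! n, nrep and the initial values are assumptions.
program cmul_by_i
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 1000, nrep = 100000
  complex(dp) :: a(n), c(n)
  integer :: i, rep
  integer(int64) :: t0, t1, rate

  call system_clock(count_rate=rate)

  ! |a(i)| = 1, so |c(i)| stays close to 1 however often the loop repeats
  a = (0.6_dp, 0.8_dp)

  ! Variant 1: factor of i applied last
  c = (1.0_dp, 0.0_dp)
  call system_clock(t0)
  do rep = 1, nrep
     do i = 1, n
        c(i) = a(i)*c(i)*(0d0,1d0)
     enddo
  enddo
  call system_clock(t1)
  ! Printing sum(c) keeps the compiler from discarding the loop
  print *, 'a*c*i: ', real(t1 - t0, dp)/real(rate, dp), ' s, sum =', sum(c)

  ! Variant 2: factor of i applied in the middle
  c = (1.0_dp, 0.0_dp)
  call system_clock(t0)
  do rep = 1, nrep
     do i = 1, n
        c(i) = a(i)*(0d0,1d0)*c(i)
     enddo
  enddo
  call system_clock(t1)
  print *, 'a*i*c: ', real(t1 - t0, dp)/real(rate, dp), ' s, sum =', sum(c)
end program cmul_by_i
```

Saved under a hypothetical name such as cmul_by_i.f90, this could be built with
gfortran -Ofast -mavx2 cmul_by_i.f90. With n fixed at compile time the
vectoriser may make different choices than it does for a loop in a subroutine
taking n as an argument, so the timings are only indicative.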

One might also argue that, as the above expressions are mathematically
identical, at -Ofast the compiler ought to choose the faster one anyway.
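
For reference, a short derivation of why the two loop bodies agree in exact
arithmetic, and why the factor of i needs only a swap and one sign flip per
element. Here x and y are illustrative names for the real and imaginary parts
of a complex number. In floating point the two orderings can round
differently, which -Ofast allows the compiler to ignore because it implies
-ffast-math.

```latex
\begin{align*}
(x + iy)\,i &= -y + ix, \\
(a\,c)\,i   &= a\,(i\,c) \quad \text{for all complex } a,\ c .
\end{align*}
```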