From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
Date: Tue, 21 Apr 2020 07:37:56 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #14 from Richard Biener ---
(In reply to Joel Yliluoma from comment #13)
> GCC 4.1.2 is indicated in the bug report headers.
> Luckily, Compiler Explorer has a copy of that exact version, and it indeed
> vectorizes the second function: https://godbolt.org/z/DC_SSb
>
> On my own system, the earliest I have is 4.6. The Compiler Explorer has 4.4,
> and it, or anything newer than that, no longer vectorizes either function.

Ah, OK - that's before GCC learned vectorization and is code-generated by
RTL expanding

  return {BIT_FIELD_REF + BIT_FIELD_REF };

so the only vector support was GCC's generic vectors (and intrinsics). The
generated code is far from perfect though.

I also think LLVM's code generation is bogus since it appears the ABI does not
guarantee zeroed upper elements of the xmm0 argument, which means they could
contain sNaNs:

typedef float ss2 __attribute__((vector_size(8)));
typedef float ss4 __attribute__((vector_size(16)));
ss2 add2(ss2 a, ss2 b);
void bar(ss4 a)
{
  volatile ss2 x;
  x = add2 ((ss2){a[0], a[1]}, (ss2){a[0], a[1]});
}

produces

bar:
.LFB1:
        .cfi_startproc
        subq    $56, %rsp
        .cfi_def_cfa_offset 64
        movdqa  %xmm0, %xmm1
        call    add2
        movq    %xmm0, 24(%rsp)
        addq    $56, %rsp

which means we pass through 'a' unchanged.
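
To make the sNaN concern concrete, here is a minimal sketch (not what GCC or
LLVM actually emit) of how a two-lane add could be done with a full-width add
without touching unknown upper lanes: the inputs are widened with the upper
elements explicitly zeroed first. The helper names widen_zero and add2_via_ss4
are hypothetical and only reuse the ss2/ss4 typedefs from the test case above.

```c
/* GNU C generic vector types, as in the test case above. */
typedef float ss2 __attribute__((vector_size(8)));
typedef float ss4 __attribute__((vector_size(16)));

/* Hypothetical helper: widen a 2-float vector to 4 floats, zeroing the
   upper two lanes so they cannot carry stale values such as sNaNs. */
ss4 widen_zero(ss2 v)
{
  return (ss4){ v[0], v[1], 0.0f, 0.0f };
}

/* Hypothetical helper: perform the 2-lane add as a 4-lane add on the
   sanitized inputs, then keep only the low two lanes of the result. */
ss2 add2_via_ss4(ss2 a, ss2 b)
{
  ss4 sum = widen_zero(a) + widen_zero(b);
  return (ss2){ sum[0], sum[1] };
}
```

Calling add2_via_ss4 with {1, 2} and {3, 4} gives {4, 6}; the point is only
that the upper lanes of the 4-wide add are known zeros rather than whatever
the caller left in the register.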