From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 806B23858D33; Mon, 6 May 2024 09:05:06 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 806B23858D33
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1714986306; bh=+laaStMj5WqqcmhClG2cJkauicjcBPqH3iGu23qhj1M=; h=From:To:Subject:Date:In-Reply-To:References:From; b=FUjbA3OrHFdc/skJIgsJzlyZ805hGoYXnpc0q4HwvAnZXaLS/gOtlyHn2ul9ig+Lg j1MDwqOdC5urJaS90w2XnwV2W9qoQ/d09xLHFJj1WNIAEiuvSkaLspCNgRWGqYvvUu SzGYoc+tUfiAJqvfTvG2DwmvwQz2V2rL4pjr5CmY=
From: "mkretz at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd
Date: Mon, 06 May 2024 09:05:05 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: mkretz at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #3 from Matthias Kretz (Vir) ---
The stdx::simd implementation in this area is old and mainly tuned to be correct. I can rewrite the split and concat implementation to use __builtin_shufflevector (which wasn't available in GCC at the time when I originally implemented it). Doing so I can resolve this issue. How do you want to handle this?
Because it would certainly be nice if the compiler could optimize this in the same way as Clang does. Should I try to come up with a testcase that doesn't need stdx::simd and then improve stdx::simd independently?