From: "pinskia at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/43147] SSE shuffle merge
Date: Tue, 22 Aug 2023 04:32:24 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.0
X-Bugzilla-Changed-Fields: bug_status target_milestone resolution

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
   Target Milestone|---                         |13.0
         Resolution|---                         |FIXED

--- Comment #21 from Andrew Pinski ---
Constant folding part was fixed in GCC 12 but combining shuffles was fixed in
GCC 13.
That is for:
```
__m128 m;
int main()
{
    m = _mm_shuffle_ps(m, m, 0xC9); // Those two shuffles together swap pairs
    m = _mm_shuffle_ps(m, m, 0x2D); // And could be optimized to 0x4E
    printv(m);
    return 0;
}
```

GCC 13+ Produces:
```
        movaps  m(%rip), %xmm0
        shufps  $78, %xmm0, %xmm0
        movaps  %xmm0, m(%rip)
        call    _Z6printvDv4_f
```

instead of what was there in GCC 12:
```
        movaps  m(%rip), %xmm0
        shufps  $201, %xmm0, %xmm0
        shufps  $45, %xmm0, %xmm0
        movaps  %xmm0, m(%rip)
```

So closing as fixed in GCC 13.
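
For reference, the arithmetic behind the merged immediate can be checked by hand. The sketch below is not GCC's implementation; it only composes the two selectors, assuming both operands of each `shufps` are the same register (as in the test case above). The helper names `lane_selector` and `compose_shufps_imm` are made up for illustration. It relies on each result lane `i` taking the source lane encoded in bits `2*i+1:2*i` of the immediate, and needs C++14 for the loop inside a `constexpr` function.

```cpp
// Illustrative only: hypothetical helpers, not part of GCC or any library.

// Extract the 2-bit source-lane selector that a shufps immediate
// assigns to result lane `lane` (0..3).
constexpr unsigned lane_selector(unsigned imm, unsigned lane) {
    return (imm >> (2 * lane)) & 3u;
}

// Immediate equivalent to applying `first` and then `second`, valid only
// when each shuffle uses the same register for both of its operands.
constexpr unsigned compose_shufps_imm(unsigned first, unsigned second) {
    unsigned combined = 0;
    for (unsigned lane = 0; lane < 4; ++lane) {
        // The second shuffle reads lane `via` of the intermediate value...
        unsigned via = lane_selector(second, lane);
        // ...which the first shuffle filled from lane `src` of the input.
        unsigned src = lane_selector(first, via);
        combined |= src << (2 * lane);
    }
    return combined;
}

// 0xC9 (201) followed by 0x2D (45) collapses to 0x4E (78), matching the
// single `shufps $78` in the GCC 13 output.
static_assert(compose_shufps_imm(0xC9, 0x2D) == 0x4E,
              "the two shuffles merge into a pair swap");
```

Worked through: 0xC9 selects lanes [1, 2, 0, 3] and 0x2D selects lanes [1, 3, 2, 0], so the result is [m2, m3, m0, m1], which is the selector 0x4E.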