From: "pobrn at protonmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/104151] [10/11/12/13 Regression] x86: excessive code generated for 128-bit byteswap
Date: Tue, 06 Sep 2022 22:04:42 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.3

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151

Barnabás Pőcze changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pobrn at protonmail dot com

--- Comment #15 from Barnabás Pőcze ---
Sorry, I haven't found a better issue. But I think the example below exhibits
the same or a very similar issue.

I would expect the following code

void f(unsigned char *p, std::uint32_t x, std::uint32_t y)
{
        p[0] = x >> 24;
        p[1] = x >> 16;
        p[2] = x >> 8;
        p[3] = x >> 0;

        p[4] = y >> 24;
        p[5] = y >> 16;
        p[6] = y >> 8;
        p[7] = y >> 0;
}

to be compiled to something along the lines of

f(unsigned char*, unsigned int, unsigned int):
        bswap   esi
        bswap   edx
        mov     DWORD PTR [rdi], esi
        mov     DWORD PTR [rdi+4], edx
        ret

however, I get scores of bitwise operations instead if `-fno-tree-vectorize`
is not specified.

https://gcc.godbolt.org/z/z51K6qorv
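
To make the expectation concrete, here is a minimal sketch of the two forms side by side. It assumes a little-endian target such as x86 and the GCC/Clang builtin `__builtin_bswap32`. The names `f_bytes`, `f_bswap`, `same_output` and `check_equivalence` are made up for illustration and are not part of the original test case. It only shows that the byte-wise stores and the swap-then-store form write the same eight bytes; it says nothing about what any particular GCC version emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Byte-wise big-endian stores, the same pattern as f() in the report
// (explicit casts added only to keep the narrowing visible).
void f_bytes(unsigned char *p, std::uint32_t x, std::uint32_t y)
{
        p[0] = static_cast<unsigned char>(x >> 24);
        p[1] = static_cast<unsigned char>(x >> 16);
        p[2] = static_cast<unsigned char>(x >> 8);
        p[3] = static_cast<unsigned char>(x >> 0);

        p[4] = static_cast<unsigned char>(y >> 24);
        p[5] = static_cast<unsigned char>(y >> 16);
        p[6] = static_cast<unsigned char>(y >> 8);
        p[7] = static_cast<unsigned char>(y >> 0);
}

// Hand-written equivalent of the expected output: swap each word, then
// store it with one 32-bit write. ASSUMPTION: little-endian host, so the
// swapped value lands in memory most significant byte first.
void f_bswap(unsigned char *p, std::uint32_t x, std::uint32_t y)
{
        x = __builtin_bswap32(x);
        y = __builtin_bswap32(y);
        std::memcpy(p, &x, sizeof x);
        std::memcpy(p + 4, &y, sizeof y);
}

// Returns true if both variants write identical bytes for the given inputs.
bool same_output(std::uint32_t x, std::uint32_t y)
{
        unsigned char a[8], b[8];
        f_bytes(a, x, y);
        f_bswap(b, x, y);
        return std::memcmp(a, b, sizeof a) == 0;
}

// Small self-check; call it from main() or a test harness.
void check_equivalence()
{
        assert(same_output(0x11223344u, 0xAABBCCDDu));
        assert(same_output(0u, 0xFFFFFFFFu));
}
```

Comparing the assembly of `f_bytes` and `f_bswap` at the same optimization level, with and without `-fno-tree-vectorize`, should show whether the byte-wise form is recognized as a byteswap.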