From: "lh_mouse at 126 dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110613] New: optimization about combined store of adjacent bitfields
Date: Mon, 10 Jul 2023 14:47:13 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110613

            Bug ID: 110613
           Summary: optimization about combined store of adjacent bitfields
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lh_mouse at 126 dot com
  Target Milestone: ---

This is a piece of code taken from a WebSocket frame parser:

```
#include <stdint.h>

struct Header
  {
    // 1st byte
    uint8_t opcode : 4;
    uint8_t rsv3 : 1;
    uint8_t rsv2 : 1;
    uint8_t rsv1 : 1;
    uint8_t fin : 1;

    // 2nd byte
    uint8_t reserved_1 : 7;
    uint8_t mask : 1;

    // 3rd and 4th bytes
    uint8_t reserved_2;
    uint8_t reserved_3;

    // 5th to 8th bytes
    union {
      char mask_key[4];
      uint32_t mask_key_u32;
    };

    // 9th to 16th bytes
    uint64_t payload_len;
  };

void
set_header(Header* ph, const uint8_t* bptr)
  {
    uint8_t f = bptr[0];
    uint8_t t = bptr[1];

    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;

    ph->mask = t >> 7;
    ph->reserved_1 = t;
  }
```

The structure is designed to match the x86_64 ABI (little endian), so

```
    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;
```

should be a simple move (https://gcc.godbolt.org/z/9vTqs7axj), and

```
    ph->mask = t >> 7;
    ph->reserved_1 = t;
```

should also be a simple move (https://gcc.godbolt.org/z/KdchWvEn1).

However, when these two pieces of code are put together, GCC emits a long
sequence of shifts, masks and ORs before a single 16-bit store
(godbolt: https://gcc.godbolt.org/z/hbEaeb3MT):

```
set_header(Header*, unsigned char const*):
        movzx   edx, BYTE PTR [rsi]
        movzx   ecx, BYTE PTR [rsi+1]
        mov     eax, edx
        mov     esi, edx
        shr     al, 4
        and     esi, 15
        and     eax, 1
        sal     eax, 4
        or      eax, esi
        mov     esi, edx
        shr     sil, 5
        and     esi, 1
        sal     esi, 5
        or      eax, esi
        mov     esi, edx
        shr     dl, 7
        shr     sil, 6
        movzx   edx, dl
        and     esi, 1
        sal     edx, 7
        sal     esi, 6
        or      eax, esi
        or      eax, edx
        mov     edx, ecx
        shr     cl, 7
        and     edx, 127
        sal     ecx, 15
        sal     edx, 8
        or      eax, edx
        or      eax, ecx
        mov     WORD PTR [rdi], ax
        ret
```
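
For comparison, the store that the report expects can be written by hand as a plain two-byte copy. This is only a sketch under the report's own assumption (GCC's bit-field layout on little-endian x86_64, where the first declared bit-field occupies the least significant bits). `set_header_bytes` and `check_all` are hypothetical names introduced here for illustration; they are not part of the original report or of GCC.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct Header
  {
    uint8_t opcode : 4;      // 1st byte, bits 0-3
    uint8_t rsv3 : 1;
    uint8_t rsv2 : 1;
    uint8_t rsv1 : 1;
    uint8_t fin : 1;         // 1st byte, bit 7
    uint8_t reserved_1 : 7;  // 2nd byte, bits 0-6
    uint8_t mask : 1;        // 2nd byte, bit 7
    uint8_t reserved_2;
    uint8_t reserved_3;
    union {
      char mask_key[4];
      uint32_t mask_key_u32;
    };
    uint64_t payload_len;
  };

// Field-by-field version, as in the report.
void
set_header(Header* ph, const uint8_t* bptr)
  {
    uint8_t f = bptr[0];
    uint8_t t = bptr[1];

    ph->fin = f >> 7;
    ph->rsv1 = f >> 6;
    ph->rsv2 = f >> 5;
    ph->rsv3 = f >> 4;
    ph->opcode = f;

    ph->mask = t >> 7;
    ph->reserved_1 = t;
  }

// Hypothetical hand-written equivalent: copy the first two bytes verbatim.
// Assumes the bit-field layout described above; not portable to other ABIs.
void
set_header_bytes(Header* ph, const uint8_t* bptr)
  {
    memcpy(ph, bptr, 2);
  }

// Compare both versions for every possible pair of input bytes.
void
check_all()
  {
    for(unsigned v = 0; v < 0x10000; ++v) {
      const uint8_t bytes[2] = { (uint8_t) v, (uint8_t) (v >> 8) };
      Header a, b;
      memset(&a, 0, sizeof(a));
      memset(&b, 0, sizeof(b));
      set_header(&a, bytes);
      set_header_bytes(&b, bytes);
      assert(memcmp(&a, &b, sizeof(a)) == 0);
    }
  }
```

If `check_all()` passes on a given target, the two functions store identical bytes there, which is the equivalence the optimizer would need to recognize in order to emit a single move.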