From: "adam at consulting dot net.nz"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/53623] New: [4.7 Regression] sign extension is effectively split into two x86-64 instructions
Date: Sun, 10 Jun 2012 02:50:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53623

             Bug #: 53623
           Summary: [4.7 Regression] sign extension is effectively split
                    into two x86-64 instructions
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: adam@consulting.net.nz


#include <stdint.h>

typedef void (*inst_t)(int64_t rdi, int64_t rsi, int64_t rdx);

int16_t code[256];
inst_t dispatch[256];

void an_inst(int64_t rdi, int64_t rsi, int64_t rdx) {
  rdx = code[rdx];
  uint8_t inst = (uint8_t) rdx;
  rdx >>= 8;
  dispatch[inst](rdi, rsi, rdx);
}

int main(void) {
  return 0;
}

$ gcc-4.6 -O3 sign_extension_regression.c && objdump -d -m i386:x86-64 a.out |less

00000000004004a0 <an_inst>:
  4004a0:   48 0f bf 94 12 20 1a    movswq 0x601a20(%rdx,%rdx,1),%rdx
  4004a7:   60 00
  4004a9:   0f b6 c2                movzbl %dl,%eax
  4004ac:   48 c1 fa 08             sar    $0x8,%rdx
  4004b0:   48 8b 04 c5 20 12 60    mov    0x601220(,%rax,8),%rax
  4004b7:   00
  4004b8:   ff e0                   jmpq   *%rax

The int16_t is loaded into RDX with sign extension. RDX is then arithmetically
shifted right by 8 (after DL has been extracted into EAX).
Result: RDX contains a sign-extended 8-bit value.

$ gcc-4.7 -O3 sign_extension_regression.c && objdump -d -m i386:x86-64 a.out |less

00000000004004b0 <an_inst>:
  4004b0:   0f b7 84 12 60 1a 60    movzwl 0x601a60(%rdx,%rdx,1),%eax
  4004b7:   00
  4004b8:   48 0f bf d0             movswq %ax,%rdx
  4004bc:   0f b6 c0                movzbl %al,%eax
  4004bf:   48 c1 fa 08             sar    $0x8,%rdx
  4004c3:   48 8b 04 c5 60 12 60    mov    0x601260(,%rax,8),%rax
  4004ca:   00
  4004cb:   ff e0                   jmpq   *%rax

The int16_t is loaded into EAX without sign extension (zero-extended). The low
16 bits of EAX are then copied into RDX with sign extension. RDX is
arithmetically shifted right by 8.
Result: RDX contains a sign-extended 8-bit value.

This is a regression. gcc-4.6 achieved the same result with one fewer
instruction.

Note: The quality of the generated code is affected by Bug 45434 and Bug 46219.

Suggested optimal approach with four instructions:
1. movzwl mem16 -> edx
2. movzbl dl -> eax
3. movsbq dh -> rdx
4. complex indirect jmp (combining mov mem64 -> rax; jmp rax)
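
The suggested four-instruction sequence relies on sign-extending the high byte directly (movsbq dh -> rdx) giving the same operand as the sign-extend-then-shift form that both compilers currently emit. A minimal sketch of that equivalence in C follows. The function names are illustrative and do not appear in the report, and the code assumes GCC's documented behaviour for two implementation-defined operations: arithmetic right shift of negative signed values, and modulo wrapping when converting to a narrower signed type.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode byte: what "movzbl %dl,%eax" extracts from the loaded word. */
static uint8_t opcode_of(int16_t word) {
    return (uint8_t) word;
}

/* Form used by the test case: sign-extend 16 bits, then shift right by 8.
 * Assumes >> on a negative int64_t is an arithmetic shift (true for GCC). */
static int64_t operand_via_shift(int16_t word) {
    int64_t rdx = word;   /* movswq */
    return rdx >> 8;      /* sar $0x8,%rdx */
}

/* Suggested form: zero-extend the word, then sign-extend its high byte.
 * Assumes conversion to int8_t wraps modulo 256 (true for GCC). */
static int64_t operand_via_high_byte(int16_t word) {
    uint16_t raw = (uint16_t) word;        /* movzwl mem16 -> edx */
    return (int64_t)(int8_t)(raw >> 8);    /* movsbq dh -> rdx */
}

/* Count the 16-bit encodings on which the two forms disagree. */
static int count_mismatches(void) {
    int mismatches = 0;
    for (int32_t w = INT16_MIN; w <= INT16_MAX; ++w) {
        int16_t word = (int16_t) w;
        /* The opcode is always the low byte of the raw encoding. */
        assert(opcode_of(word) == (uint8_t)((uint16_t) word & 0xff));
        if (operand_via_shift(word) != operand_via_high_byte(word))
            ++mismatches;
    }
    return mismatches;
}
```

Calling count_mismatches() from a small driver and checking that it returns 0 exercises all 65536 possible values of a code[] entry. The sketch only concerns the values computed; it says nothing about which instructions a given GCC version selects for either form.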