From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 11827 invoked by alias); 4 Dec 2012 00:06:25 -0000
Received: (qmail 11747 invoked by uid 48); 4 Dec 2012 00:06:01 -0000
From: "mtkilpailut at torni dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/55583] New: Extended shift instruction on x86-64 is not used, producing unoptimal code
Date: Tue, 04 Dec 2012 00:06:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: mtkilpailut at torni dot org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2012-12/txt/msg00310.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55583

             Bug #: 55583
           Summary: Extended shift instruction on x86-64 is not used,
                    producing unoptimal code
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: mtkilpailut@torni.org

Created attachment 28866
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28866
Source code demonstrating bad code generation

On x86-64, GCC does not generate the extended shift instruction (shld/shrd).
Combined with other problems, this produces very bad code.

The attached test functions cover signed and unsigned 16-, 32-, and 64-bit
types, both left and right shifts, and both a constant n and n passed as a
function parameter.

Code of this form:

    unsigned int a, b;
    const int n = 2;
    void test32l (void)
    {
      b = (b << n) | (a >> (32 - n));
    }

expected code:

    mov    a(%rip),%eax
    shld   $0x2,%eax,b(%rip)
    ret

produced code:

    mov    b(%rip), %edx   ; Size of register used here depends on gcc version
    mov    a(%rip), %eax   ; Size of register used here depends on gcc version
    sal    $2, %edx        ; Size of register used here depends on gcc version
    shr    $25, %eax
    or     %edx, %eax
    mov    %eax, b(%rip)
    ret

Tested with:

    COLLECT_GCC_OPTIONS='-v' '-c' '-save-temps' '-O2' '-Wall' '-W' '-o' 'gcc_shld_not_used' '-mtune=generic'

I tried gcc versions:

    GNU C (Debian 4.7.2-4) version 4.7.2 (x86_64-linux-gnu)
    GNU C (Debian 4.6.3-11) version 4.6.3 (x86_64-linux-gnu)
    GNU C (Debian 4.5.3-9) version 4.5.3 (x86_64-linux-gnu)
    GNU C (Debian 4.4.7-2) version 4.4.7 (x86_64-linux-gnu)
    GNU C (GCC) version 4.8.0 20121203 (experimental) [trunk revision 194106] (x86_64-unknown-linux-gnu)

All produce the same code modulo register size differences mentioned above.
gcc HEAD changes sal to leal (,%rcx,4),%eax
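
For reference, here is a minimal sketch in portable C of the value I would expect an extended shift to leave in its destination. The helper names shld32 and shrd32 are hypothetical and are not taken from the attachment; they only restate the expression in test32l as a function, plus the mirrored right-shift form. Both assume 0 < n < 32, since n == 0 would make the second shift a shift by 32, which is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the attachment): the result of a 32-bit
   double-precision left shift. dst is shifted left by n and the vacated
   low bits are filled from the high bits of src. */
static inline uint32_t shld32(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n > 0 && n < 32);   /* n == 0 would shift src by 32: undefined */
    return (dst << n) | (src >> (32 - n));
}

/* Hypothetical right-shift counterpart: dst is shifted right by n and the
   vacated high bits are filled from the low bits of src. */
static inline uint32_t shrd32(uint32_t dst, uint32_t src, unsigned n)
{
    assert(n > 0 && n < 32);
    return (dst >> n) | (src << (32 - n));
}
```

With these definitions, test32l above amounts to b = shld32(b, a, n). Passing the same value as both operands gives a rotate, for example shld32(x, x, 8) rotates x left by 8 bits.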