From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3284 invoked by alias); 26 Aug 2014 00:45:51 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 3259 invoked by uid 48); 26 Aug 2014 00:45:46 -0000 From: "oneill+gccbugs at cs dot hmc.edu" To: gcc-bugs@gcc.gnu.org Subject: [Bug middle-end/62263] New: Good codegen for bitwise rotate requires code that is technically undefined behavior Date: Tue, 26 Aug 2014 00:45:00 -0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: new X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: middle-end X-Bugzilla-Version: 4.9.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: oneill+gccbugs at cs dot hmc.edu X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter Message-ID: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-SW-Source: 2014-08/txt/msg01713.txt.bz2 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62263 Bug ID: 62263 Summary: Good codegen for bitwise rotate requires code that is technically undefined behavior Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: oneill+gccbugs at cs dot hmc.edu LLVM lacks an intrinsic for performing bitwise rotation, relying instead on spotting the classic C idioms for specifying rotation using two shifts. Unfortunately, when the rotation is defined by variable, its ability to spot rotation code is poor. Code that supports a variable rotation also needs to handle rotation-by-zero, which the underlying instruction has no problem with, but when translated into the classic C idiom, results in an undefined shift (because shifting a 32-bit integer by 32 bits isn't allowed). In the following code, only rotate32_undefined1, rotate32_undefined2 and _rotl32_doubleand1 compiles to a simple rotate instruction. rotl32_zerocheck also compiles to a rotate, but it contains a redundant test for zero -- a test that is necessary in the C code but not necessary for the rotate. Somewhat annoyingly, Clang 3.5 also has poor rotation detection, and only detects it for rotl_doubleand2 and rotl_doubleand3, as well as rotl32_undefined2. It is filed as http://llvm.org/bugs/show_bug.cgi?id=20750 Thus GCC and Clang differ as to which code they want for a rotate, and neither is good at recognizing variations of the rotate idiom. --- C code --- unsigned int rotl32_undefined1(unsigned int v, unsigned char r) { r = r & 31; return (v << r) | (v >> (32 - r)); } unsigned int rotl32_undefined2(unsigned int v, unsigned char r) { return (v << (r & 31)) | (v >> (32 - (r & 31))); } unsigned int rotl32_zerocheck(unsigned int v, unsigned char r) { r = r & 31; return r ? (v << r) | (v >> (32 - r)) : v; } unsigned int rotl32_doubleand1(unsigned int v, unsigned char r) { r = r & 31; return (v << r) | (v >> ((32 - r) & 31)); } unsigned int rotl32_doubleand2(unsigned int v, unsigned char r) { return (v << (r & 31)) | (v >> ((32 - (r & 31)) & 31)); } unsigned int rotl32_doubleand3(unsigned int v, unsigned char r) { return (v << (r & 31)) | (v >> ((32 - r) & 31)); } --- Assembly output, gcc 4.9.0 -O3 -fomit-frame-pointer --- _rotl32_undefined1: LFB0: movl %esi, %ecx movl %edi, %eax andl $31, %ecx roll %cl, %eax ret LFE0: .section __TEXT,__text_cold,regular,pure_instructions LCOLDE0: .text LHOTE0: .section __TEXT,__text_cold,regular,pure_instructions LCOLDB1: .text LHOTB1: .align 4,0x90 .globl _rotl32_undefined2 _rotl32_undefined2: LFB1: movl %esi, %ecx movl %edi, %eax andl $31, %ecx roll %cl, %eax ret LFE1: .section __TEXT,__text_cold,regular,pure_instructions LCOLDE1: .text LHOTE1: .section __TEXT,__text_cold,regular,pure_instructions LCOLDB2: .text LHOTB2: .align 4,0x90 .globl _rotl32_zerocheck _rotl32_zerocheck: LFB2: movl %esi, %ecx movl %edi, %eax andl $31, %ecx roll %cl, %eax testb %cl, %cl cmove %edi, %eax ret LFE2: .section __TEXT,__text_cold,regular,pure_instructions LCOLDE2: .text LHOTE2: .section __TEXT,__text_cold,regular,pure_instructions LCOLDB3: .text LHOTB3: .align 4,0x90 .globl _rotl32_doubleand1 _rotl32_doubleand1: LFB3: movl %esi, %ecx movl %edi, %eax andl $31, %ecx roll %cl, %eax ret LFE3: .section __TEXT,__text_cold,regular,pure_instructions LCOLDE3: .text LHOTE3: .section __TEXT,__text_cold,regular,pure_instructions LCOLDB4: .text LHOTB4: .align 4,0x90 .globl _rotl32_doubleand2 _rotl32_doubleand2: LFB4: movl %esi, %ecx movl %edi, %eax negl %ecx shrl %cl, %eax movl %esi, %ecx andl $31, %ecx sall %cl, %edi orl %edi, %eax ret