From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 22713 invoked by alias); 25 Apr 2010 05:06:59 -0000
Received: (qmail 21080 invoked by uid 48); 25 Apr 2010 05:06:39 -0000
Date: Sun, 25 Apr 2010 05:06:00 -0000
Subject: [Bug middle-end/43883] New: missed optimization of constant __int128_t modulus
X-Bugzilla-Reason: CC
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "svfuerst at gmail dot com"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2010-04/txt/msg02621.txt.bz2

The following function

long long tmod2(long long x)
{
	return x % 2;
}

gets optimized at -O3 to:

	mov    %rdi,%rdx
	shr    $0x3f,%rdx
	lea    (%rdi,%rdx,1),%rax
	and    $0x1,%eax
	sub    %rdx,%rax
	retq

This is very good code. Unfortunately, the 128-bit version doesn't get
optimized nearly so well.

__int128_t tmod2(__int128_t x)
{
	return x % 2;
}

	mov    %rsi,%rdx
	mov    %rdi,%r8
	xor    %ecx,%ecx
	shr    $0x3f,%rdx
	push   %rbx
	add    %rdx,%r8
	xor    %edi,%edi
	mov    %r8,%rsi
	mov    %rdi,%r9
	and    $0x1,%esi
	mov    %rsi,%r8
	sub    %rdx,%r8
	sbb    %rcx,%r9
	mov    %r8,%rax
	mov    %r9,%rdx
	pop    %rbx
	retq

It looks like a simple variation of the 64-bit algorithm will work for the
128-bit version. The sign is read from the high half, and the result is
sign-extended into rdx, because the shifted sign bit in rdx is 0 or 1 and
not the high half of the remainder:

	mov    %rsi,%rdx           <--- Just changed rdi into rsi
	shr    $0x3f,%rdx          <--- rdx = sign bit of the high half
	lea    (%rdi,%rdx,1),%rax
	and    $0x1,%eax
	sub    %rdx,%rax
	cqto                       <--- sign-extend the result into rdx
	retq


-- 
           Summary: missed optimization of constant __int128_t modulus
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: svfuerst at gmail dot com
 GCC build triplet: x86_64-linux
  GCC host triplet: x86_64-linux
GCC target triplet: x86_64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43883
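
The short 128-bit sequence discussed in the report can be modelled in C on the two 64-bit halves, which makes it easy to compare against the compiler's own `%`. This is only a sketch: the names `tmod2_sketch` and `check_tmod2_sketch` are hypothetical, it assumes GCC's `__int128` extension and GCC's wrap-around conversion from unsigned to signed, and it assumes the low half arrives in rdi and the high half in rsi as in the listings. Note that the sketch sign-extends the low result into the high half (what a trailing `cqto` would do) instead of reusing the shifted sign bit, since the high half of a remainder of -1 must be all ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the short sequence, working on the two halves. */
static __int128 tmod2_sketch(__int128 x)
{
    uint64_t lo = (uint64_t)x;                            /* low half  (%rdi) */
    uint64_t hi = (uint64_t)((unsigned __int128)x >> 64); /* high half (%rsi) */

    uint64_t sign = hi >> 63;                /* shr $0x3f: 1 if x is negative */
    uint64_t rem = ((lo + sign) & 1) - sign; /* lea, and, sub: 0, 1 or all ones */

    /* Sign-extend the 64-bit result to 128 bits (assumed cqto step).
       Relies on GCC's modular unsigned-to-signed conversion. */
    return (__int128)(int64_t)rem;
}

/* Compare the sketch with the built-in operator on a few edge cases. */
static void check_tmod2_sketch(void)
{
    const __int128 min128 = (__int128)((unsigned __int128)1 << 127);
    const __int128 max128 = ~min128;
    const __int128 samples[] = {
        0, 1, 2, 3, -1, -2, -3,
        min128, min128 + 1, max128, max128 - 1,
        (__int128)1 << 64, -((__int128)1 << 64) - 1
    };

    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(tmod2_sketch(samples[i]) == samples[i] % 2);
}
```

Calling `check_tmod2_sketch()` from a test program exercises both signs and the boundary between the two halves; it says nothing about what code GCC would actually emit for the sketch.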