public inbox for gcc-bugs@sourceware.org
* [Bug regression/44281]  New: Global Register variable pessimisation and regression
From: adam at consulting dot net dot nz @ 2010-05-26  5:12 UTC (permalink / raw)
  To: gcc-bugs

I am aware that the developers have resolved as WONTFIX at least one report of
GCC pessimising code that uses global register variables:
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596>

GCC is copying registers for no good reason whatsoever. Below is a very simple
example where gcc 3.3.6 does a better job of optimising the code. Unnecessary
copying of registers may also occur with local register variables.

#include <stdint.h>

register uint64_t global_flag_stack __asm__("rbx");

void push_flag_into_global_reg_var(uint64_t a, uint64_t b) {
  uint64_t flag = (a==b);
  global_flag_stack <<= 8;
  global_flag_stack  |= flag;
}

uint64_t push_flag_into_local_var(uint64_t a, uint64_t b,
                                  uint64_t local_flag_stack) {
  uint64_t flag = (a==b);
  local_flag_stack <<= 8;
  return local_flag_stack | flag;
}

int main() {
}
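
For readers who want to check what the test case computes independently of register allocation, here is a minimal portable sketch. It assumes an ordinary static variable in place of the rbx-pinned one, and the names flag_stack, push_flag and pop_flag are hypothetical (they are not part of the test case above). Each push shifts the 64-bit word left by one byte and ORs in a 0/1 flag, so the word holds the eight most recent flags.

```c
#include <stdint.h>

/* Hypothetical stand-in for the register variable: an ordinary static,
   so this sketch does not reserve rbx and builds on any target. */
static uint64_t flag_stack;

/* Same operations as push_flag_into_global_reg_var:
   make room for one byte, then store the comparison result (0 or 1). */
static void push_flag(uint64_t a, uint64_t b) {
  uint64_t flag = (a == b);
  flag_stack <<= 8;
  flag_stack |= flag;
}

/* Hypothetical helper: return the most recently pushed flag and drop it. */
static uint64_t pop_flag(void) {
  uint64_t flag = flag_stack & 0xff;
  flag_stack >>= 8;
  return flag;
}
```

Pushing a "true" and then a "false" comparison leaves 0x0100 in the word, and popping returns the flags in reverse order. None of this needs a temporary copy of the stack word, which is why the extra mov instructions in the listings below stand out.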


gcc-3.3 (GCC) 3.3.6 (Debian 1:3.3.6-15):
$ gcc-3.3 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400478 <push_flag_into_global_reg_var>:
  400478:       31 c0                   xor    eax,eax
  40047a:       48 39 f7                cmp    rdi,rsi
  40047d:       0f 94 c0                sete   al
  400480:       48 c1 e3 08             shl    rbx,0x8
  400484:       48 09 c3                or     rbx,rax
  400487:       c3                      ret    

0000000000400488 <push_flag_into_local_var>:
  400488:       31 c0                   xor    eax,eax
  40048a:       48 39 f7                cmp    rdi,rsi
  40048d:       0f 94 c0                sete   al
  400490:       48 c1 e2 08             shl    rdx,0x8
  400494:       48 09 d0                or     rax,rdx
  400497:       c3                      ret  
...

gcc-4.1 (GCC) 4.1.3 20080704 (prerelease) (Debian 4.1.2-29):
$ gcc-4.1 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400448 <push_flag_into_global_reg_var>:
  400448:       48 89 da                mov    rdx,rbx
  40044b:       31 c0                   xor    eax,eax
  40044d:       48 c1 e2 08             shl    rdx,0x8
  400451:       48 39 f7                cmp    rdi,rsi
  400454:       0f 94 c0                sete   al
  400457:       48 89 d3                mov    rbx,rdx
  40045a:       48 09 c3                or     rbx,rax
  40045d:       c3                      ret    

000000000040045e <push_flag_into_local_var>:
  40045e:       48 c1 e2 08             shl    rdx,0x8
  400462:       31 c0                   xor    eax,eax
  400464:       48 39 f7                cmp    rdi,rsi
  400467:       0f 94 c0                sete   al
  40046a:       48 09 d0                or     rax,rdx
  40046d:       c3                      ret 
...

gcc-4.5 (Debian 4.5.0-1) 4.5.0:
$ gcc-4.5 -Os flags.c && objdump -d -m i386:x86-64:intel a.out|less
...
0000000000400494 <push_flag_into_global_reg_var>:
  400494:       31 d2                   xor    edx,edx
  400496:       48 39 f7                cmp    rdi,rsi
  400499:       48 89 d8                mov    rax,rbx
  40049c:       0f 94 c2                sete   dl
  40049f:       48 c1 e0 08             shl    rax,0x8
  4004a3:       48 89 d3                mov    rbx,rdx
  4004a6:       48 09 c3                or     rbx,rax
  4004a9:       c3                      ret    

00000000004004aa <push_flag_into_local_var>:
  4004aa:       48 89 d0                mov    rax,rdx
  4004ad:       31 d2                   xor    edx,edx
  4004af:       48 c1 e0 08             shl    rax,0x8
  4004b3:       48 39 f7                cmp    rdi,rsi
  4004b6:       0f 94 c2                sete   dl
  4004b9:       48 09 d0                or     rax,rdx
  4004bc:       c3                      ret   
...

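The object code that current GCC is generating is embarrassing compared with
GCC 3.3.6: push_flag_into_global_reg_var grows from 16 bytes to 22 bytes
because of two unnecessary register copies. Is it also necessary to increase
the code footprint of push_flag_into_local_var from 16 bytes (gcc 3.3.6 and
4.1.3) to 19 bytes (gcc 4.5.0) when optimising for size (-Os)?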
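
The code footprint of each function can be read straight off the objdump listings above. The sketch below is only a worked calculation under one assumption: every function ends in a one-byte ret (opcode c3), so its size is the address of that ret plus one, minus the start address. The struct and function names are hypothetical; the addresses are copied from the listings.

```c
#include <stdint.h>

/* One disassembled function: where it starts and where its final
   one-byte ret instruction sits (addresses taken from the listings). */
struct listing {
  const char *compiler;
  uint64_t global_start, global_ret;  /* push_flag_into_global_reg_var */
  uint64_t local_start, local_ret;    /* push_flag_into_local_var */
};

static const struct listing listings[] = {
  { "gcc 3.3.6", 0x400478, 0x400487, 0x400488, 0x400497 },
  { "gcc 4.1.3", 0x400448, 0x40045d, 0x40045e, 0x40046d },
  { "gcc 4.5.0", 0x400494, 0x4004a9, 0x4004aa, 0x4004bc },
};

/* Size in bytes of a function whose last instruction is a 1-byte ret. */
static uint64_t code_size(uint64_t start, uint64_t ret_addr) {
  return ret_addr + 1 - start;
}
```

With these addresses the global register variable version is 16 bytes with gcc 3.3.6 and 22 bytes with both gcc 4.1.3 and gcc 4.5.0, while the local variable version is 16, 16 and 19 bytes respectively.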


-- 
           Summary: Global Register variable pessimisation and regression
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: adam at consulting dot net dot nz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44281



Thread overview: 14+ messages
2010-05-26  5:12 [Bug regression/44281] New: Global Register variable pessimisation and regression adam at consulting dot net dot nz
2010-06-07  5:36 ` [Bug regression/44281] " adam at consulting dot net dot nz
2010-07-20 22:53 ` [Bug rtl-optimization/44281] [4.3/4.4/4.5/4.6 Regression] Global Register variable pessimisation steven at gcc dot gnu dot org
2010-07-20 22:55 ` pinskia at gcc dot gnu dot org
2010-07-22  8:48 ` rguenth at gcc dot gnu dot org
2010-09-11 11:16 ` adam at consulting dot net dot nz
2010-09-11 13:50 ` hjl dot tools at gmail dot com
2010-09-12 14:12 ` pinskia at gcc dot gnu dot org
2010-09-13  0:24 ` adam at consulting dot net dot nz
     [not found] <bug-44281-4@http.gcc.gnu.org/bugzilla/>
2011-03-04  7:23 ` adam at consulting dot net.nz
2011-03-04  7:46 ` jakub at gcc dot gnu.org
2011-03-04 10:51 ` adam at consulting dot net.nz
2011-03-04 11:23 ` jakub at gcc dot gnu.org
2011-03-05  2:01 ` adam at consulting dot net.nz
