public inbox for gcc-bugs@sourceware.org
* [Bug other/46847] New: Missed optimization for variant of Duff's device
@ 2010-12-08  9:09 jjk at acm dot org
  2010-12-08  9:12 ` [Bug other/46847] " jjk at acm dot org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: jjk at acm dot org @ 2010-12-08  9:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46847

           Summary: Missed optimization for variant of Duff's device
           Product: gcc
           Version: 4.3.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: jjk@acm.org


The attached file contains a function that implements a variant of
"Duff's device" to zero out part of an array.  GCC 4.3.2 (Red Hat 4.3.2-7)
with -O3 generates rather bad code for x86-64; the code it produces for i386
might also be improved.

Generated code for -m32 and -m64 is in the attached file.  The -m64 version
contains multiple copies of a redundant load of %esi (and the corresponding
jumps).  Both versions move the computation of 'end' inside the loop, which
seems unnecessary to me.
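
For readers without access to the attachment, here is a minimal sketch of
what such a function might look like.  It is a reconstruction guessed from
the assembly quoted later in this thread, not the attached source: the name
zero_range_duff, the parameter order, the element type 'long' and the
inclusive upper bound are all assumptions.

```c
#include <stddef.h>

/* Hypothetical reconstruction, NOT the attached source.
   Zeroes array[start] .. array[end], both bounds inclusive.
   Assumes start <= end. */
void zero_range_duff(unsigned start, unsigned end, long *array)
{
    long *p = array + start;
    long *last = array + end;   /* the 'end' computation, done once */

    /* (end - start) & 7 is (count - 1) mod 8, so the first pass
       stores between 1 and 8 elements; every later pass stores 8. */
    switch ((end - start) & 7) {
    case 7: do { *p++ = 0;      /* each case falls through to the next */
    case 6:      *p++ = 0;
    case 5:      *p++ = 0;
    case 4:      *p++ = 0;
    case 3:      *p++ = 0;
    case 2:      *p++ = 0;
    case 1:      *p++ = 0;
    case 0:      *p++ = 0;
            } while (p <= last);
    }
}
```

The switch jumps into the middle of the do/while body, which is what makes
this a loop with more than one entry point.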



* [Bug other/46847] Missed optimization for variant of Duff's device
From: jjk at acm dot org @ 2010-12-08  9:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46847

--- Comment #1 from Jens Kilian <jjk at acm dot org> 2010-12-08 09:12:07 UTC ---
Created attachment 22681
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22681
Source code and generated assembly



* [Bug other/46847] Missed optimization for variant of Duff's device
From: rguenth at gcc dot gnu.org @ 2010-12-08 12:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46847

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-12-08 12:02:09 UTC ---
GCC 4.5 doesn't move end computation inside the loop.  What do you expect
to be "good" code?  I get at -O[23]:

foo:
.LFB0:
        mov     %esi, %ecx
        subl    %edi, %esi
        mov     %edi, %eax
        andl    $7, %esi
        leaq    (%rdx,%rax,8), %rax
        leaq    (%rdx,%rcx,8), %rdx
        jmp     *.L10(,%rsi,8)
        .section        .rodata
        .align 8
        .align 4
.L10:
        .quad   .L2
        .quad   .L3
        .quad   .L4
        .quad   .L5
        .quad   .L6
        .quad   .L7
        .quad   .L8
        .quad   .L9
        .text
        .p2align 4,,10
        .p2align 3
.L9:
        movq    $0, (%rax)
        addq    $8, %rax
.L8:
        movq    $0, (%rax)
        addq    $8, %rax
.L7:
        movq    $0, (%rax)
        addq    $8, %rax
.L6:
        movq    $0, (%rax)
        addq    $8, %rax
.L5:
        movq    $0, (%rax)
        addq    $8, %rax
.L4:
        movq    $0, (%rax)
        addq    $8, %rax
.L3:
        movq    $0, (%rax)
        addq    $8, %rax
.L2:
        movq    $0, (%rax)
        addq    $8, %rax
        cmpq    %rax, %rdx
        jae     .L9
        rep
        ret

Similar code is generated by ICC 11.1.

Duff's device may be a fun thing from a C language perspective, but it
is a bad idea in general: it is a loop with multiple entries, which means
GCC doesn't recognize it as a loop, so it defeats most loop optimizations.
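
For comparison, a minimal sketch of a single-entry loop doing the same job.
The function name, parameter types and inclusive bound are assumptions made
for illustration; they are not taken from the attachment.

```c
#include <stddef.h>

/* Hypothetical single-entry equivalent: zeroes array[start] ..
   array[end], both bounds inclusive.  Does nothing if start > end. */
void zero_range_loop(unsigned start, unsigned end, long *array)
{
    long *last;
    long *p;

    if (start > end)
        return;

    last = array + end;
    for (p = array + start; p <= last; p++)
        *p = 0;
}
```

Because control enters this loop only at the top, the compiler can treat it
as an ordinary loop and has a chance to unroll it, vectorize it, or replace
it with a block clear.  Whether it actually does so depends on the compiler
version and flags.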



* [Bug other/46847] Missed optimization for variant of Duff's device
From: jjk at acm dot org @ 2010-12-08 12:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46847

--- Comment #3 from Jens Kilian <jjk at acm dot org> 2010-12-08 12:10:02 UTC ---
(In reply to comment #2)
> GCC 4.5 doesn't move end computation inside the loop.  What do you expect
> to be "good" code?  I get at -O[23]:

[snip]

The code generated by 4.5 is what I would have expected.
Feel free to close this as "fixed in the latest version".

> Duff's device may be a fun thing from a C language perspective, but it
> is a bad idea in general: it is a loop with multiple entries, which means
> GCC doesn't recognize it as a loop, so it defeats most loop optimizations.

I usually wouldn't consider using it, I just noticed the problem while
trying to tune some heavily used parts of our code.

Thanks,
        Jens.



* [Bug tree-optimization/46847] Missed optimization for variant of Duff's device
From: rguenth at gcc dot gnu.org @ 2010-12-08 12:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46847

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |RESOLVED
          Component|other                       |tree-optimization
         Resolution|                            |FIXED
   Target Milestone|---                         |4.5.1

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2010-12-08 12:27:00 UTC ---
Thus, fixed in at least 4.5.1.

Thanks.

