public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/67886] New: Incomplete optimization for virtual function call into freshly constructed object
@ 2015-10-07 20:06 Simon.Richter at hogyros dot de
  2023-05-05  6:51 ` [Bug tree-optimization/67886] " pinskia at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: Simon.Richter at hogyros dot de @ 2015-10-07 20:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67886

            Bug ID: 67886
           Summary: Incomplete optimization for virtual function call into
                    freshly constructed object
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Simon.Richter at hogyros dot de
  Target Milestone: ---

This is a bit of a corner/academic case, but came up in a Stack Overflow
discussion:

    struct Base {
        virtual void func() = 0;
    };

    struct Derived : Base {
        virtual void func() { };
    };

    void test()
    {
        Base* base = new Derived;

        for (int i = 0; i < 1000; ++i)
        {
            base->func();
        }
    }

The assembler code generated on x86_64 with -O3 is:

Disassembly of section .text:

0000000000000000 <test()>:
   0:   55                      push   %rbp
   1:   53                      push   %rbx
   2:   bf 08 00 00 00          mov    $0x8,%edi
   7:   bb e8 03 00 00          mov    $0x3e8,%ebx
   c:   48 83 ec 08             sub    $0x8,%rsp
  10:   e8 00 00 00 00          callq  15 <test()+0x15>
                        11: R_X86_64_PC32       operator new(unsigned long)-0x4
  15:   ba 00 00 00 00          mov    $0x0,%edx
                        16: R_X86_64_32 vtable for Derived+0x10
  1a:   48 89 c5                mov    %rax,%rbp
  1d:   48 c7 00 00 00 00 00    movq   $0x0,(%rax)
                        20: R_X86_64_32S        vtable for Derived+0x10
  24:   eb 13                   jmp    39 <test()+0x39>
  26:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  2d:   00 00 00 
  30:   83 eb 01                sub    $0x1,%ebx
  33:   74 1a                   je     4f <test()+0x4f>
  35:   48 8b 55 00             mov    0x0(%rbp),%rdx
  39:   48 8b 12                mov    (%rdx),%rdx
  3c:   48 81 fa 00 00 00 00    cmp    $0x0,%rdx
                        3f: R_X86_64_32S        Derived::func()
  43:   74 eb                   je     30 <test()+0x30>
  45:   48 89 ef                mov    %rbp,%rdi
  48:   ff d2                   callq  *%rdx
  4a:   83 eb 01                sub    $0x1,%ebx
  4d:   75 e6                   jne    35 <test()+0x35>
  4f:   48 83 c4 08             add    $0x8,%rsp
  53:   5b                      pop    %rbx
  54:   5d                      pop    %rbp
  55:   c3                      retq   

Disassembly of section .text._ZN7Derived4funcEv:

0000000000000000 <Derived::func()>:
   0:   f3 c3                   repz retq 

This looks like an optimization left half-done. The optimizer correctly inlines
the call to Derived::func() into the loop and guards it with a check that the
function pointer found in the vtable is indeed the function that was inlined.
If it is not, the inlined body is skipped and the regular indirect call is
made instead.
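
The guarded loop in the disassembly can be approximated at source level. This
is only a sketch: it models the vtable slot as a plain function pointer, and
the names Object, derived_func, other_func and run_guarded are hypothetical,
not anything GCC emits or the report defines.

```cpp
#include <cassert>

// Hypothetical stand-in for an object with a one-slot vtable.
struct Object {
    void (*func)(Object*);  // models the vtable entry for func()
    int calls;              // counts calls that really went through the pointer
};

// Models Derived::func(): an empty body, as in the report.
void derived_func(Object*) {}

// Some other implementation the slot might hold instead.
void other_func(Object* self) { ++self->calls; }

// Same shape as the loop in the disassembly: reload the slot on every
// iteration, compare it with the expected target, and make the indirect
// call only when the comparison fails.
int run_guarded(Object* obj, int iterations) {
    assert(obj != nullptr);
    int indirect_calls = 0;
    for (int i = 0; i < iterations; ++i) {
        void (*target)(Object*) = obj->func;  // load of the vtable slot
        if (target == &derived_func) {        // cmp against Derived::func()
            // Inlined body of derived_func: nothing to do.
        } else {
            target(obj);                      // callq *%rdx
            ++indirect_calls;
        }
    }
    return indirect_calls;
}
```

With the slot set to derived_func, run_guarded performs all its iterations
without a single indirect call, yet still executes the load and compare each
time, which is the leftover work the report is about.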

I presume that the pointer is rechecked on every loop iteration because it is
possible that the function call can destroy the object and create a new one in
its place that still derives from Base, so that is correct.

If you set -fPIC, the actual values for the vtable pointer and the pointer to
Derived::func() are fetched outside of the loop, and rechecked on each loop
iteration, again, correctly.

However, without -fPIC there is no way to get a different definition of
Derived::func() without invoking UB, so the function pointer check is
tautological and can be optimized out. That unravels the entire fuzzy ball:
the inlined function does not destroy the object, so inlining it into the loop
should leave an empty loop that can be removed.

Also, wouldn't setting -fvisibility=hidden also take Derived's symbols out of
the dynamic symbol table, in which case I wouldn't be able to override them at
runtime with a preload library?

From an assembler programmer's perspective, the optimal solution would use the
knowledge that the inlined function does not touch the object's vtable pointer,
and create a path that handles the remaining loop iterations once the object
has been shown to be a Derived object. The RTL passes would probably reduce
this to a conditional jump to the ret instruction. I don't have enough
knowledge to tell whether that would be easily doable in this case.
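
A rough source-level picture of that idea, under the same modelling assumption
(a function pointer standing in for the vtable slot). The names Instance,
empty_body, rebind_to_empty, counting_body and run_versioned are hypothetical;
this is not what GCC 9 or any other release actually generates.

```cpp
#include <cassert>

// Hypothetical stand-in for an object with a one-slot vtable.
struct Instance {
    void (*func)(Instance*);  // models the vtable entry for func()
};

// Models Derived::func(): empty, so it cannot change the slot.
void empty_body(Instance*) {}

// Models a callee that replaces the object: after one call the slot
// points at empty_body.
void rebind_to_empty(Instance* self) { self->func = &empty_body; }

// Models a callee that leaves the slot alone.
void counting_body(Instance*) {}

// Once the slot is seen to hold empty_body, every remaining iteration
// would run the same empty inlined body, which cannot alter the slot.
// The rest of the loop is therefore dead and the function can return.
int run_versioned(Instance* obj, int iterations) {
    assert(obj != nullptr);
    int indirect_calls = 0;
    for (int i = 0; i < iterations; ++i) {
        void (*target)(Instance*) = obj->func;
        if (target == &empty_body) {
            break;  // fast path: nothing left to do for any later iteration
        }
        target(obj);  // unknown code may rebind the slot, so re-check next time
        ++indirect_calls;
    }
    return indirect_calls;
}
```

The early exit is valid only because the inlined body is known not to modify
the slot; after an indirect call to unknown code the check has to be repeated.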



* [Bug tree-optimization/67886] Incomplete optimization for virtual function call into freshly constructed object
  2015-10-07 20:06 [Bug tree-optimization/67886] New: Incomplete optimization for virtual function call into freshly constructed object Simon.Richter at hogyros dot de
@ 2023-05-05  6:51 ` pinskia at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-05  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67886

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |7.5.0, 8.5.0
      Known to work|                            |9.1.0

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Seems fixed in GCC 9.
