public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/46507] New: std::tuple missed optimization
@ 2010-11-16 17:59 miles at gnu dot org
  2010-11-16 19:06 ` [Bug tree-optimization/46507] " paolo.carlini at oracle dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: miles at gnu dot org @ 2010-11-16 17:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

           Summary: std::tuple missed optimization
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: miles@gnu.org


My understanding is that std::tuple should offer largely the same optimization
opportunities as std::pair (while being more flexible).

However gcc-4.6 seems to miss some optimizations with std::tuple that it
doesn't with pair.

E.g. the following code ("tp.cc"):

---- start ----
   #include <tuple>
   #include <utility>

   struct A
   {
     virtual void f () const;
   };

   void arg_tuple_test (const std::tuple<A> &t)
   {
     std::get<0>(t).f ();
   }

   void extern_tuple_test ()
   {
     extern const std::tuple<A> &t;
     std::get<0>(t).f ();
   }

   void arg_pair_test (const std::pair<A,A> &t)
   {
     t.first.f ();
   }
---- end ----

compiled with:

  g++-snapshot -std=c++0x -O2 -march=amdfam10 -S tp.cc

results in the following assembly:

   arg_tuple_test(std::tuple<A> const&):
       movq    (%rdi), %rax
       movq    (%rax), %rax
       jmp    *%rax

   extern_tuple_test():
       movq    t(%rip), %rdi
       movq    (%rdi), %rax
       movq    (%rax), %rax
       jmp    *%rax

   arg_pair_test(std::pair<A, A> const&):
       jmp    A::f() const

       .ident    "GCC: (Debian 20101114-1) 4.6.0 20101114 (experimental) [trunk
revision 166728]"


It seems like all of these functions should use a direct jump to A::f, but the
tuple versions do not.

Note that when I tried this same test yesterday, on a different machine (but
the same compiler version), "extern_tuple_test" (but not "arg_tuple_test")
_did_ result in a direct jump to A::f!  So maybe something funny is going on...


Thanks,

-Miles


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::tuple missed optimization
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
@ 2010-11-16 19:06 ` paolo.carlini at oracle dot com
  2010-11-16 19:11 ` miles at gnu dot org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: paolo.carlini at oracle dot com @ 2010-11-16 19:06 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paolo.carlini at oracle dot
                   |                            |com

--- Comment #1 from Paolo Carlini <paolo.carlini at oracle dot com> 2010-11-16 19:04:37 UTC ---
Interesting. Not that I think it would make a huge difference, but probably
this is a more "fair" comparison:

   #include <tuple>
   #include <utility>

   struct A
   {
     virtual void f () const;
   };

   void arg_tuple_test (const std::tuple<A,A> &t)
   {
     std::get<0>(t).f ();
   }

   void extern_tuple_test ()
   {
     extern const std::tuple<A, A> &t;
     std::get<0>(t).f ();
   }

   void arg_pair_test (const std::pair<A,A> &t)
   {
     std::get<0>(t).f ();
   }


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::tuple missed optimization
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
  2010-11-16 19:06 ` [Bug tree-optimization/46507] " paolo.carlini at oracle dot com
@ 2010-11-16 19:11 ` miles at gnu dot org
  2010-11-16 19:53 ` redi at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: miles at gnu dot org @ 2010-11-16 19:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

--- Comment #2 from miles at gnu dot org 2010-11-16 19:06:02 UTC ---
> Note that when I tried this same test yesterday, on a different machine (but
> the same compiler version), "extern_tuple_test" (but not "arg_tuple_test")
> _did_ result in a direct jump to A::f!  So maybe something funny is going on...

Actually this is wrong:  the version of "extern_tuple_test" which worked
yesterday was really this:

   void extern_tuple_test ()
   {
     extern const std::tuple<A> t;
     std::get<0>(t).f ();
   }

resulting in:

   extern_tuple_test():
    movl    $t, %edi
    jmp    A::f() const

Still, it seems like "arg_tuple_test" (and the previous version of
"extern_tuple_tst") should also received the same optimization.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::tuple missed optimization
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
  2010-11-16 19:06 ` [Bug tree-optimization/46507] " paolo.carlini at oracle dot com
  2010-11-16 19:11 ` miles at gnu dot org
@ 2010-11-16 19:53 ` redi at gcc dot gnu.org
  2013-09-14 21:35 ` glisse at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: redi at gcc dot gnu.org @ 2010-11-16 19:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> 2010-11-16 19:27:48 UTC ---
Paolo, your example *does* show a significant difference:  the missed
optimisation is in std::get() and not tuple itself, as shown by 

#include <tuple>
#include <utility>

struct A
{
    virtual void f () const;
};

void arg_tuple_test (const std::pair<A,A> &t)
{
    std::get<0>(t).f ();
}

void arg_pair_test (const std::pair<A,A> &t)
{
    t.first.f ();
}

The second function jumps directly to A::f


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::tuple missed optimization
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (2 preceding siblings ...)
  2010-11-16 19:53 ` redi at gcc dot gnu.org
@ 2013-09-14 21:35 ` glisse at gcc dot gnu.org
  2013-10-12 20:13 ` [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables glisse at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2013-09-14 21:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
                 CC|                            |mjambor at suse dot cz

--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
In t.first.f(), the front-end knows that t.first is really an object of type A,
not something derived, so it doesn't need to do a virtual call (I am just
guessing, I didn't check in the FE sources). On the other hand, std::get
returns an A&, which as far as the front-end knows could really be an object of
a derived type. The compiler can only know better after inlining. However, gcc
now knows how to devirtualize a number of calls, so there is hope.

Martin, do you have an opinion on the testcase in comment 3? We get:

  _2 = &t_1(D)->first;
  _4 = MEM[(const struct type *)t_1(D)]._vptr.A;
  _5 = *_4;
  OBJ_TYPE_REF(_5;_2->0) (_2); [tail call]

vs

  _2 = &t_1(D)->first;
  A::f (_2); [tail call]


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (3 preceding siblings ...)
  2013-09-14 21:35 ` glisse at gcc dot gnu.org
@ 2013-10-12 20:13 ` glisse at gcc dot gnu.org
  2013-11-08 14:41 ` jamborm at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: glisse at gcc dot gnu.org @ 2013-10-12 20:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

Marc Glisse <glisse at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|std::tuple missed           |std::get and
                   |optimization                |devirtualization on
                   |                            |non-automatic variables

--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
I just read this comment in ipa-prop.c: "only for automatically allocated
objects but at the moment we devirtualize only these". And indeed, if we define
an automatic variable of type A and call arg_tuple_test on it (and that call is
inlined), then the call to f is devirtualized. So this is a known limitation of
the current devirtualization code. I don't think there is anything we can do at
the library level.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (4 preceding siblings ...)
  2013-10-12 20:13 ` [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables glisse at gcc dot gnu.org
@ 2013-11-08 14:41 ` jamborm at gcc dot gnu.org
  2014-01-16  6:10 ` hubicka at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: jamborm at gcc dot gnu.org @ 2013-11-08 14:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

Martin Jambor <jamborm at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #6 from Martin Jambor <jamborm at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #4)
> Martin, do you have an opinion on the testcase in comment 3? We get:
> 
>   _2 = &t_1(D)->first;
>   _4 = MEM[(const struct type *)t_1(D)]._vptr.A;
>   _5 = *_4;
>   OBJ_TYPE_REF(_5;_2->0) (_2); [tail call]

Assuming that first is a non-artificial (as in DECL_ARTIFICIAL) field,
we should be able to devirtualize this but as you have already
noticed, our current type-based middle-end intraprocedural
devirtualization machinery only works on automatic variables, but this
example shows another interesting situation.  Thanks for the testcase,
I will try to make it work after I cleanup the the code to better
interoperate with Honza's recent devirtualization work.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (5 preceding siblings ...)
  2013-11-08 14:41 ` jamborm at gcc dot gnu.org
@ 2014-01-16  6:10 ` hubicka at gcc dot gnu.org
  2014-01-16  6:18 ` hubicka at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu.org @ 2014-01-16  6:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2014-01-16
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenther at suse dot de
           Assignee|unassigned at gcc dot gnu.org      |hubicka at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #7 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
OK, GIMPLE representation is pre-inlining as follows:
void arg_tuple_test(const std::pair<A, A>&) (const struct pair & t)
{
  const struct type & _2;
  const struct type & _3;
  int (*__vtbl_ptr_type) () * _5;
  int (*__vtbl_ptr_type) () _6;

  <bb 2>:
  _2 = std::get<0ul, A, A> (t_1(D));
  _3 = _2;
  _5 = _3->_vptr.A;
  _6 = *_5;
  OBJ_TYPE_REF(_6;(const struct A)_3->0) (_3);
  return;

}
Here I do not think we can devirtualize, because std::get can return a derived
type in general.

After inlining we get:
void arg_tuple_test(const std::pair<A, A>&) (const struct pair & t)
{
  int (*__vtbl_ptr_type) () * _3;
  int (*__vtbl_ptr_type) () _4;
  const struct A & _6;

  <bb 2>:
  _6 = &t_1(D)->first;
  _3 = MEM[(const struct type *)t_1(D)]._vptr.A;
  _4 = *_3;
  OBJ_TYPE_REF(_4;(const struct A)_6->0) (_6); [tail call]
  return;

}
Here we look all the way to figuring out that the object comes as parameter of
arg_tuple_test and give up on this point.
I think GIMPLE typing rules does not really allow us to take the type from
COMPONENT_REF in statement defining _6, so it seems we completely lose track of
the type info here. (and it indeed looks like a common case)

Richard?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (6 preceding siblings ...)
  2014-01-16  6:10 ` hubicka at gcc dot gnu.org
@ 2014-01-16  6:18 ` hubicka at gcc dot gnu.org
  2014-01-16  8:51 ` rguenther at suse dot de
  2014-01-16 18:02 ` hubicka at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu.org @ 2014-01-16  6:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

--- Comment #8 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Hmm, also if there is anything in standard that prevents user to pass object of
type A by reference to type B (where A is not derived from B) then we can get
the type info from type of parameter t.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (7 preceding siblings ...)
  2014-01-16  6:18 ` hubicka at gcc dot gnu.org
@ 2014-01-16  8:51 ` rguenther at suse dot de
  2014-01-16 18:02 ` hubicka at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenther at suse dot de @ 2014-01-16  8:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 16 Jan 2014, hubicka at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507
> 
> --- Comment #8 from Jan Hubicka <hubicka at gcc dot gnu.org> ---

void arg_tuple_test(const std::pair<A, A>&) (const struct pair & t)
{
  int (*__vtbl_ptr_type) () * _3;
  int (*__vtbl_ptr_type) () _4;
  const struct A & _6;

  <bb 2>:
  _6 = &t_1(D)->first;
  _3 = MEM[(const struct type *)t_1(D)]._vptr.A;
  _4 = *_3;
  OBJ_TYPE_REF(_4;(const struct A)_6->0) (_6); [tail call]
  return;

}
Here we look all the way to figuring out that the object comes as 
parameter
of
arg_tuple_test and give up on this point.
I think GIMPLE typing rules does not really allow us to take the type from
COMPONENT_REF in statement defining _6, so it seems we completely lose 
track
of
the type info here. (and it indeed looks like a common case)

Richard?

That is correct.  &t_1(D)->first is merely pointer offsetting and
we may not forward it to form t_1(D)->first._vptr.A for example
(there were wrong-code bugs because of such forwarding in the past).

In GIMPLE you don't have type information on pointed to types, you only
have type info for values.

> Hmm, also if there is anything in standard that prevents user to pass 
> object of type A by reference to type B (where A is not derived from B) 
> then we can get the type info from type of parameter t.

"Standards" do not apply to GIMPLE so you have to be very careful
in this area (one reason why I am not exactly comfortable about
all this devirt and ODR stuff done in the middle-end as opposed
to in a frontend).

Richard.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables
  2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
                   ` (8 preceding siblings ...)
  2014-01-16  8:51 ` rguenther at suse dot de
@ 2014-01-16 18:02 ` hubicka at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: hubicka at gcc dot gnu.org @ 2014-01-16 18:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46507

--- Comment #10 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I agree we want to do as much as possible in FE.

ODR decisions are basically done in frontend now - we basically use mangling of
virtual tables given to us by C++ FE.  Eventually we will want to get ODR
information on all types, not only polymorphic ones.  Then I plan to go with
Jason's suggested way of calling decl_assembler name on types to get mangling
from C++ FE rather than my initial attempt to actually compare types for
equivalence by their high level names that hits interesting issues with
templates. It is left in code mostly as backup if the other solution blows up
our memory footprint.

I do not think we can do type inheritance analysis + devirtualization purely by
frotnend, since it is naturally an inter-procuedural problem that needs to be
handled by LTO. We already have sane way to represent the type inheritance tree
by binfos.  What we  need to work on is a sane and safe way representing the
relevant type information that is given to you by C++ standard but thrown away
on way to GIMPLE so the propagation can work.

I was thinking of type assert expressions that can be inserted by FE on places
where dynamic type is known for non-trivial reasons. Those can be used as
starting points for the propagation of type information and removed after IPA.
This would allow us to preserve information while inlining and also make it
easier to deal with dynamic type changes. For simple operations, like in this
testcase, this approach may be however impractical producing too many new
gimple statements.

I am slowly getting to stage where current devirt infrastructure seems to work
as expected and seems to use most of information already there in gimple that
seems to be safe to assume.

My plan was to basically wait for this release cycle, see how many bugs shows
up so I am sure I do not miss something obvious (well, like the virtual
inheritance fun).

I think only parts we miss now is to
 - move ipa-cp to new infrastructure
 - move dynamic type change code to ipa-devirt and make it work with contextes
 - understand ctors/dtors calls, since we already ave flag marking them
specially.

I also collected testcases we need to handle. I will try to sit down and
prepare some proposal about what we need to preserve in gimple.
I think we also want central place (probably in ipa-devirt toplevel comment)
documenting all the information we take into account and assumptions we make.
So far they are mostly trivial (i.e. static vars/automatic vars/values passed
by invisible references are known to have their type and not derived type)
except for Martin's dynamic changing code.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-01-16 18:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-16 17:59 [Bug tree-optimization/46507] New: std::tuple missed optimization miles at gnu dot org
2010-11-16 19:06 ` [Bug tree-optimization/46507] " paolo.carlini at oracle dot com
2010-11-16 19:11 ` miles at gnu dot org
2010-11-16 19:53 ` redi at gcc dot gnu.org
2013-09-14 21:35 ` glisse at gcc dot gnu.org
2013-10-12 20:13 ` [Bug tree-optimization/46507] std::get and devirtualization on non-automatic variables glisse at gcc dot gnu.org
2013-11-08 14:41 ` jamborm at gcc dot gnu.org
2014-01-16  6:10 ` hubicka at gcc dot gnu.org
2014-01-16  6:18 ` hubicka at gcc dot gnu.org
2014-01-16  8:51 ` rguenther at suse dot de
2014-01-16 18:02 ` hubicka at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).