From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 17642 invoked by alias); 18 Apr 2011 15:57:47 -0000 Received: (qmail 17625 invoked by uid 22791); 18 Apr 2011 15:57:43 -0000 X-SWARE-Spam-Status: No, hits=-4.6 required=5.0 tests=AWL,BAYES_50,RCVD_IN_DNSWL_HI,TW_FN,TW_TM X-Spam-Check-By: sourceware.org Received: from cantor.suse.de (HELO mx1.suse.de) (195.135.220.2) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 18 Apr 2011 15:57:25 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.221.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 8884A8FEA2 for ; Mon, 18 Apr 2011 17:57:23 +0200 (CEST) Date: Mon, 18 Apr 2011 16:32:00 -0000 From: Martin Jambor To: Richard Guenther Cc: GCC Patches Subject: Re: [PATCH 4/4] Devirtualization based on global objects Message-ID: <20110418155722.GC3555@virgil.arch.suse.de> Mail-Followup-To: Richard Guenther , GCC Patches References: <20110415125619.325556455@virgil.suse.cz> <20110415125645.676652789@virgil.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-04/txt/msg01410.txt.bz2 Hi, On Fri, Apr 15, 2011 at 05:38:50PM +0200, Richard Guenther wrote: > On Fri, 15 Apr 2011, Martin Jambor wrote: > > > Hi, > > > > this is the patch that actually speeds up astar (together with the > > previous one). It implements devirtualization based on global > > objects. In the past we have refrained from doing this because in > > general it is difficult to say whether the object is currently being > > constructed and so it might have a dynamic type of one of its > > ancestors. However, if the object's class does not have any ancestors > > that obviously cannot happen. > > > > Devirtualizing in such conditions is enough to change a virtual call > > to regwayobj::isaddtobound in 473.astar to a direct one which can and > > should be inlined. That seemed a good justification to implement this > > and so the patch below does so and brings about 3.1% speedup for that > > benchmark with LTO. > > > > I acknowledge that instead of discarding all classes with ancestors it > > would be better to check that the called virtual method has the same > > implementation in all ancestors instead. That is perhaps something > > for later. > > > > It took me surprisingly long to realize that this technique can be > > used for folding virtual calls based on local automatically allocated > > objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that > > regressed in 4.6. > > > > Bootstrapped and tested on x86_64-linux. OK for trunk? > > > > Thanks, > > > > Martin > > > > > > 2011-04-15 Martin Jambor > > > > * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize > > also according to actual contants. > > * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function. > > (gimple_fold_obj_type_ref_call): New function. > > (gimple_fold_call): Call gimple_fold_obj_type_ref_call on > > OBJ_TYPE_REFs. > > * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare. > > > > * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL. > > * testsuite/g++.dg/opt/devirt2.C: New test. > > * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise. > > > > ... > > Index: src/gcc/gimple-fold.c > > =================================================================== > > --- src.orig/gcc/gimple-fold.c > > +++ src/gcc/gimple-fold.c > > @@ -1438,6 +1438,95 @@ gimple_adjust_this_by_delta (gimple_stmt > > gimple_call_set_arg (call_stmt, 0, tmp); > > } > > > > +/* Return a binfo to be used for devirtualization of calls based on an object > > + represented by a declaration (i.e. a global or automatically allocated one) > > + or NULL if it cannot be found or is not safe. CST is expected to be an > > + ADDR_EXPR of such object or the function will return NULL. Currently it is > > + safe to use such binfo only if it has no base binfo (i.e. no ancestors). */ > > + > > +tree > > +gimple_extract_devirt_binfo_from_cst (tree cst) > > +{ > > + HOST_WIDE_INT offset, size, max_size; > > + tree base, type, expected_type, binfo; > > + bool last_artificial = false; > > + > > + if (!flag_devirtualize > > + || TREE_CODE (cst) != ADDR_EXPR > > + || TREE_CODE (TREE_TYPE (TREE_TYPE (cst))) != RECORD_TYPE) > > + return NULL_TREE; > > + > > + cst = TREE_OPERAND (cst, 0); > > + expected_type = TREE_TYPE (cst); > > + base = get_ref_base_and_extent (cst, &offset, &size, &max_size); > > + type = TREE_TYPE (base); > > + if (!DECL_P (base) > > + || max_size == -1 > > + || max_size != size > > + || TREE_CODE (type) != RECORD_TYPE) > > + return NULL_TREE; > > + > > + while (true) > > + { > > + HOST_WIDE_INT pos, size; > > + tree fld; > > + > > + if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) > > + break; > > + if (offset < 0) > > + return NULL_TREE; > > + > > + for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) > > + { > > + if (TREE_CODE (fld) != FIELD_DECL) > > + continue; > > + > > + pos = int_bit_position (fld); > > + size = tree_low_cst (DECL_SIZE (fld), 1); > > + if (pos <= offset && (pos + size) > offset) > > + break; > > + } > > + if (!fld || TREE_CODE (TREE_TYPE (fld)) != RECORD_TYPE) > > + return NULL_TREE; > > + > > + last_artificial = DECL_ARTIFICIAL (fld); > > + type = TREE_TYPE (fld); > > + offset -= pos; > > + } > > Please add come comments on what this loop does. It looks like > it searches for the FIELD_DECL that corresponds to the incoming > constant address (but it doesn't have to exactly point to its > beginning?). Yes, an object can be within another. Like S within R in the added testcase g++/opt/devirt2.C. Then it checks it is not a representative of an ancestor but a user-defined object by looking at DECL_ARTIFICIAL. > Note that you miss to handle the case where the > declaration is view-converted to a different type and you > instead use the declared type. Which ISTR was correct as > placement new on decls is kind of invalid. Yes, I know I'm assuming that. > > > + if (last_artificial) > > + return NULL_TREE; > > + binfo = TYPE_BINFO (type); > > + if (!binfo || BINFO_N_BASE_BINFOS (binfo) > 0) > > + return NULL_TREE; > > + else > > + return binfo; > > +} > > + > > +/* Fold a call statement to OBJ_TYPE_REF to a direct call if possible. Return > > + true iff the statement was changed. GSI determines the statement. */ > > + > > +static bool > > +gimple_fold_obj_type_ref_call (gimple_stmt_iterator *gsi) > > +{ > > + gimple stmt = gsi_stmt (*gsi); > > + tree ref = gimple_call_fn (stmt); > > + tree binfo, fndecl, delta; > > + HOST_WIDE_INT token; > > + > > + binfo = gimple_extract_devirt_binfo_from_cst (OBJ_TYPE_REF_OBJECT (ref)); > > + if (!binfo) > > + return false; > > + > > + token = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1); > > Either use TREE_INT_CST_LOW or first check with host_integerp. > > Ok with that change. > OK. Unfortunately I have conflicted with your patch folding OBJ_TYPE_REFs and since you did not introduce a special function for OBJ_TYPE_REFs, I kept the functionality within gimple_fold_call too. So this is what I am testing and what I intend to commit if all goes well. Thanks a lot, Martin 2011-04-18 Martin Jambor * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize also according to actual contants. * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function. (gimple_fold_call): Use it. * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare. * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL. * testsuite/g++.dg/opt/devirt2.C: New test. * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise. Index: src/gcc/ipa-cp.c =================================================================== *** src.orig/gcc/ipa-cp.c --- src/gcc/ipa-cp.c *************** ipcp_process_devirtualization_opportunit *** 1246,1296 **** for (ie = node->indirect_calls; ie; ie = next_ie) { ! int param_index, types_count, j; HOST_WIDE_INT token, anc_offset; tree target, delta, otr_type; next_ie = ie->next_callee; if (!ie->indirect_info->polymorphic) continue; param_index = ie->indirect_info->param_index; ! if (param_index == -1 ! || ipa_param_cannot_devirtualize_p (info, param_index) ! || ipa_param_types_vec_empty (info, param_index)) continue; token = ie->indirect_info->otr_token; anc_offset = ie->indirect_info->anc_offset; otr_type = ie->indirect_info->otr_type; target = NULL_TREE; ! types_count = VEC_length (tree, info->params[param_index].types); ! for (j = 0; j < types_count; j++) { ! tree binfo = VEC_index (tree, info->params[param_index].types, j); ! tree d, t; ! binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); if (!binfo) ! { ! target = NULL_TREE; ! break; ! } ! t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true); ! if (!t) ! { ! target = NULL_TREE; ! break; ! } ! else if (!target) ! { ! target = t; ! delta = d; ! } ! else if (target != t || !tree_int_cst_equal (delta, d)) { ! target = NULL_TREE; ! break; } } --- 1246,1316 ---- for (ie = node->indirect_calls; ie; ie = next_ie) { ! int param_index; HOST_WIDE_INT token, anc_offset; tree target, delta, otr_type; + struct ipcp_lattice *lat; next_ie = ie->next_callee; if (!ie->indirect_info->polymorphic) continue; param_index = ie->indirect_info->param_index; ! if (param_index == -1) continue; + lat = ipcp_get_lattice (info, param_index); token = ie->indirect_info->otr_token; anc_offset = ie->indirect_info->anc_offset; otr_type = ie->indirect_info->otr_type; target = NULL_TREE; ! if (lat->type == IPA_CONST_VALUE) { ! tree binfo = gimple_extract_devirt_binfo_from_cst (lat->constant); ! if (!binfo) ! continue; binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); if (!binfo) ! continue; ! target = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta, ! false); ! } ! else ! { ! int types_count, j; ! if (ipa_param_cannot_devirtualize_p (info, param_index) ! || ipa_param_types_vec_empty (info, param_index)) ! continue; ! ! types_count = VEC_length (tree, info->params[param_index].types); ! for (j = 0; j < types_count; j++) { ! tree binfo = VEC_index (tree, info->params[param_index].types, j); ! tree d, t; ! ! binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); ! if (!binfo) ! { ! target = NULL_TREE; ! break; ! } ! ! t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true); ! if (!t) ! { ! target = NULL_TREE; ! break; ! } ! else if (!target) ! { ! target = t; ! delta = d; ! } ! else if (target != t || !tree_int_cst_equal (delta, d)) ! { ! target = NULL_TREE; ! break; ! } } } Index: src/gcc/gimple-fold.c =================================================================== *** src.orig/gcc/gimple-fold.c --- src/gcc/gimple-fold.c *************** gimple_adjust_this_by_delta (gimple_stmt *** 1441,1446 **** --- 1441,1514 ---- gimple_call_set_arg (call_stmt, 0, tmp); } + /* Return a binfo to be used for devirtualization of calls based on an object + represented by a declaration (i.e. a global or automatically allocated one) + or NULL if it cannot be found or is not safe. CST is expected to be an + ADDR_EXPR of such object or the function will return NULL. Currently it is + safe to use such binfo only if it has no base binfo (i.e. no ancestors). */ + + tree + gimple_extract_devirt_binfo_from_cst (tree cst) + { + HOST_WIDE_INT offset, size, max_size; + tree base, type, expected_type, binfo; + bool last_artificial = false; + + if (!flag_devirtualize + || TREE_CODE (cst) != ADDR_EXPR + || TREE_CODE (TREE_TYPE (TREE_TYPE (cst))) != RECORD_TYPE) + return NULL_TREE; + + cst = TREE_OPERAND (cst, 0); + expected_type = TREE_TYPE (cst); + base = get_ref_base_and_extent (cst, &offset, &size, &max_size); + type = TREE_TYPE (base); + if (!DECL_P (base) + || max_size == -1 + || max_size != size + || TREE_CODE (type) != RECORD_TYPE) + return NULL_TREE; + + /* Find the sub-object the constant actually refers to and mark whether it is + an artificial one (as opposed to a user-defined one). */ + while (true) + { + HOST_WIDE_INT pos, size; + tree fld; + + if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type)) + break; + if (offset < 0) + return NULL_TREE; + + for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) + { + if (TREE_CODE (fld) != FIELD_DECL) + continue; + + pos = int_bit_position (fld); + size = tree_low_cst (DECL_SIZE (fld), 1); + if (pos <= offset && (pos + size) > offset) + break; + } + if (!fld || TREE_CODE (TREE_TYPE (fld)) != RECORD_TYPE) + return NULL_TREE; + + last_artificial = DECL_ARTIFICIAL (fld); + type = TREE_TYPE (fld); + offset -= pos; + } + /* Artifical sub-objects are ancestors, we do not want to use them for + devirtualization, at least not here. */ + if (last_artificial) + return NULL_TREE; + binfo = TYPE_BINFO (type); + if (!binfo || BINFO_N_BASE_BINFOS (binfo) > 0) + return NULL_TREE; + else + return binfo; + } + /* Attempt to fold a call statement referenced by the statement iterator GSI. The statement may be replaced by another statement, e.g., if the call simplifies to a constant value. Return true if any changes were made. *************** gimple_fold_call (gimple_stmt_iterator * *** 1469,1478 **** /* Check for virtual calls that became direct calls. */ callee = gimple_call_fn (stmt); ! if (TREE_CODE (callee) == OBJ_TYPE_REF ! && gimple_call_addr_fndecl (OBJ_TYPE_REF_EXPR (callee)) != NULL_TREE) { ! gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); return true; } --- 1537,1563 ---- /* Check for virtual calls that became direct calls. */ callee = gimple_call_fn (stmt); ! if (TREE_CODE (callee) == OBJ_TYPE_REF) { ! tree binfo, fndecl, delta, obj; ! HOST_WIDE_INT token; ! ! if (gimple_call_addr_fndecl (OBJ_TYPE_REF_EXPR (callee)) != NULL_TREE) ! { ! gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee)); ! return true; ! } ! ! obj = OBJ_TYPE_REF_OBJECT (callee); ! binfo = gimple_extract_devirt_binfo_from_cst (obj); ! if (!binfo) ! return false; ! token = TREE_INT_CST_LOW (OBJ_TYPE_REF_TOKEN (callee)); ! fndecl = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta, false); ! if (!fndecl) ! return false; ! gcc_assert (integer_zerop (delta)); ! gimple_call_set_fndecl (stmt, fndecl); return true; } Index: src/gcc/gimple.h =================================================================== *** src.orig/gcc/gimple.h --- src/gcc/gimple.h *************** const char *gimple_decl_printable_name ( *** 898,903 **** --- 898,904 ---- bool gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace); tree gimple_get_virt_mehtod_for_binfo (HOST_WIDE_INT, tree, tree *, bool); void gimple_adjust_this_by_delta (gimple_stmt_iterator *, tree); + tree gimple_extract_devirt_binfo_from_cst (tree); /* Returns true iff T is a valid GIMPLE statement. */ extern bool is_gimple_stmt (tree); Index: src/gcc/testsuite/g++.dg/opt/devirt2.C =================================================================== *** /dev/null --- src/gcc/testsuite/g++.dg/opt/devirt2.C *************** *** 0 **** --- 1,11 ---- + // { dg-do compile } + // { dg-options "-O2" } + // { dg-final { scan-assembler-times "xyzzy" 2 } } + + struct S { S(); virtual void xyzzy(); }; + struct R { int a; S s; R(); }; + S s; + R r; + inline void foo(S *p) { p->xyzzy(); } + void bar() {foo(&s);} + void bah() {foo(&r.s);} Index: src/gcc/testsuite/g++.dg/ipa/devirt-g-1.C =================================================================== *** /dev/null --- src/gcc/testsuite/g++.dg/ipa/devirt-g-1.C *************** *** 0 **** --- 1,24 ---- + // { dg-do compile } + // { dg-options "-O2 -fdump-ipa-cp -fdump-tree-optimized" } + + struct S { S(); virtual void xyzzy(); void otherstuff(); }; + struct R { int a; S s; R(); }; + S s; + R r; + + void S::xyzzy () + { + otherstuff (); + otherstuff (); + } + + static void __attribute__ ((noinline)) foo(S *p) { p->xyzzy(); } + void bar() {foo(&s); } + + static void __attribute__ ((noinline)) foh(S *p) { p->xyzzy(); } + void bah() {foh(&r.s); } + + /* { dg-final { scan-ipa-dump "Discovered a virtual call to a known target.*S::xyzzy" "cp" } } */ + /* { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "optimized"} } */ + /* { dg-final { cleanup-ipa-dump "cp" } } */ + /* { dg-final { cleanup-tree-dump "optimized" } } */ Index: src/gcc/testsuite/g++.dg/opt/devirt1.C =================================================================== *** src.orig/gcc/testsuite/g++.dg/opt/devirt1.C --- src/gcc/testsuite/g++.dg/opt/devirt1.C *************** *** 1,6 **** // { dg-do compile } ! // { dg-options "-O" } ! // { dg-final { scan-assembler "xyzzy" { xfail *-*-* } } } struct S { S(); virtual void xyzzy(); }; inline void foo(S *s) { s->xyzzy(); } --- 1,6 ---- // { dg-do compile } ! // { dg-options "-O2" } ! // { dg-final { scan-assembler "xyzzy" } } struct S { S(); virtual void xyzzy(); }; inline void foo(S *s) { s->xyzzy(); }