From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 19090 invoked by alias); 30 Oct 2012 19:10:47 -0000 Received: (qmail 18979 invoked by uid 22791); 30 Oct 2012 19:10:45 -0000 X-SWARE-Spam-Status: No, hits=-6.0 required=5.0 tests=AWL,BAYES_00,KHOP_BIG_TO_CC,KHOP_RCVD_UNTRUST,KHOP_THREADED,RCVD_IN_DNSWL_HI,RCVD_IN_HOSTKARMA_W,RP_MATCHES_RCVD,SPF_HELO_PASS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 30 Oct 2012 19:10:35 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q9UJAVGN031769 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 30 Oct 2012 15:10:31 -0400 Received: from [10.3.113.101] (ovpn-113-101.phx2.redhat.com [10.3.113.101]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id q9UJASLm022160; Tue, 30 Oct 2012 15:10:29 -0400 Message-ID: <50902624.3020705@redhat.com> Date: Tue, 30 Oct 2012 19:18:00 -0000 From: Jason Merrill User-Agent: Mozilla/5.0 (X11; Linux i686; rv:16.0) Gecko/20121016 Thunderbird/16.0.1 MIME-Version: 1.0 To: Sriraman Tallam CC: Jan Hubicka , Diego Novillo , Jan Hubicka , Xinliang David Li , Mark Mitchell , Nathan Sidwell , "H.J. Lu" , Richard Guenther , Uros Bizjak , reply@codereview.appspotmail.com, GCC Patches Subject: Re: User directed Function Multiversioning via Function Overloading (issue5752064) References: <5008708E.1030109@redhat.com> <506F27AF.3070805@redhat.com> <50816D63.3020908@google.com> <20121026155447.GA4348@atrey.karlin.mff.cuni.cz> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2012-10/txt/msg02786.txt.bz2 On 10/27/2012 09:16 PM, Sriraman Tallam wrote: > + /* See if there's a match. For functions that are multi-versioned, > + all the versions match. */ > if (same_type_p (target_fn_type, static_fn_type (fn))) > - matches = tree_cons (fn, NULL_TREE, matches); > + { > + matches = tree_cons (fn, NULL_TREE, matches); > + /*If versioned, push all possible versions into a vector. */ > + if (DECL_FUNCTION_VERSIONED (fn)) > + { > + if (fn_ver_vec == NULL) > + fn_ver_vec = VEC_alloc (tree, heap, 2); > + VEC_safe_push (tree, heap, fn_ver_vec, fn); > + } > + } Why do we need to keep both a list and vector of the matches? > + Call decls_match to make sure they are different because they are > + versioned. */ > + if (DECL_FUNCTION_VERSIONED (fn)) > + { > + for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN (match)) > + if (decls_match (fn, TREE_PURPOSE (match))) > + break; > + } What if you have multiple matches that aren't all versions of the same function? Why would it be a problem to have two separate declarations of the same function? > + dispatcher_decl = targetm.get_function_versions_dispatcher (fn_ver_vec); Is the idea here that if you have some versions declared, then a call, then more versions declared, then another call, you will call two different dispatchers, where the first one will only dispatch to the versions declared before the first call? If not, why do we care about the set of declarations at this point? > + /* Mark this functio to be output. */ > + node->local.finalized = 1; Missing 'n' in "function". > @@ -14227,7 +14260,11 @@ cxx_comdat_group (tree decl) > else > break; > } > - name = DECL_ASSEMBLER_NAME (decl); > + if (TREE_CODE (decl) == FUNCTION_DECL > + && DECL_FUNCTION_VERSIONED (decl)) > + name = DECL_NAME (decl); This would mean that f in the global namespace and f in namespace foo would end up in the same comdat group. Why do we need special handling here at all? > dump_function_name (tree t, int flags) > { > - tree name = DECL_NAME (t); > + tree name; > > + /* For function versions, use the assembler name as the decl name is > + the same for all versions. */ > + if (TREE_CODE (t) == FUNCTION_DECL > + && DECL_FUNCTION_VERSIONED (t)) > + name = DECL_ASSEMBLER_NAME (t); This shouldn't be necessary; we should print the target attribute when printing the function declaration. > + Also, mark this function as needed if it is marked inline but > + is a multi-versioned function. */ > + if (((flag_keep_inline_functions > + || DECL_FUNCTION_VERSIONED (fn)) This should be marked as needed by the code that builds the dispatcher. > + /* For calls to a multi-versioned function, overload resolution > + returns the function with the highest target priority, that is, > + the version that will checked for dispatching first. If this > + version is inlinable, a direct call to this version can be made > + otherwise the call should go through the dispatcher. */ I'm a bit confused why people would want both dispatched calls and non-dispatched inlining; I would expect that if a function can be compiled differently enough on newer hardware to make versioning worthwhile, that would be a larger difference than the call overhead. > + if (DECL_FUNCTION_VERSIONED (fn) > + && !targetm.target_option.can_inline_p (current_function_decl, fn)) > + { > + struct cgraph_node *dispatcher_node = NULL; > + fn = get_function_version_dispatcher (fn); > + if (fn == NULL) > + return NULL; > + dispatcher_node = cgraph_get_create_node (fn); > + gcc_assert (dispatcher_node != NULL); > + /* Mark this function to be output. */ > + dispatcher_node->local.finalized = 1; > + } Why do you need to mark this here? If you generate a call to the dispatcher, cgraph should mark it to be output automatically. > + /* For candidates of a multi-versioned function, make the version with > + the highest priority win. This version will be checked for dispatching > + first. If this version can be inlined into the caller, the front-end > + will simply make a direct call to this function. */ This is still too high in joust. I believe I said before that this code should come just above /* If the two function declarations represent the same function (this can happen with declarations in multiple scopes and arg-dependent lookup), arbitrarily choose one. But first make sure the default args we're using match. */ > + /* For multiversioned functions, aggregate all the versions here for > + generating the dispatcher body later if necessary. Check to see if > + the dispatcher is already generated to avoid doing this more than > + once. */ This caching seems to assume that you'll always be considering the same group of declarations, which goes back to my earlier question. Jason