From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21480 invoked by alias); 31 Oct 2012 00:36:38 -0000 Received: (qmail 21244 invoked by uid 22791); 31 Oct 2012 00:36:36 -0000 X-SWARE-Spam-Status: No, hits=-4.2 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,KHOP_BIG_TO_CC,KHOP_RCVD_TRUST,KHOP_THREADED,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_YE,RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail-oa0-f47.google.com (HELO mail-oa0-f47.google.com) (209.85.219.47) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 31 Oct 2012 00:36:30 +0000 Received: by mail-oa0-f47.google.com with SMTP id h1so941636oag.20 for ; Tue, 30 Oct 2012 17:36:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-system-of-record:x-gm-message-state; bh=eufBK6zQ/PFKZSODMLbPKTTx65ThzwO33nHmODnUMic=; b=gDkcJzWQKjkH2ORB3R4YIQ1OYDsjtlw0CSI5arouHCRkzqzpfn8f6GzYkMtDFKrNyq S2i51oCcVKrR83flWf3PwuMHOd1tQJ0oaHqy8mJK/n/mkds4e42lNf+d7IYf6cy8sCTs IwPoMAmM4nMPYPlrhWIHrqoWtgFir00SMEPnKjDlnIunKKIBa7/XyYrMVVJpYgO6Lxai +85/JhYqOsFdsAiNgqPNDDTNFPQPIQkJitKnsjW4LlEBTjmolFfi3wE4DxYgN6aL8KlY HFjmfdcMp43Mo85VHR0MGn1JtZoPyK/wmBmg8NJsQcdjW38fhq2bhKQMzSPh3MAZhqjV j8UQ== MIME-Version: 1.0 Received: by 10.182.221.7 with SMTP id qa7mr29029214obc.5.1351643789662; Tue, 30 Oct 2012 17:36:29 -0700 (PDT) Received: by 10.182.176.106 with HTTP; Tue, 30 Oct 2012 17:36:29 -0700 (PDT) In-Reply-To: <50902624.3020705@redhat.com> References: <5008708E.1030109@redhat.com> <506F27AF.3070805@redhat.com> <50816D63.3020908@google.com> <20121026155447.GA4348@atrey.karlin.mff.cuni.cz> <50902624.3020705@redhat.com> Date: Wed, 31 Oct 2012 00:58:00 -0000 Message-ID: Subject: Re: User directed Function Multiversioning via Function Overloading (issue5752064) From: Sriraman Tallam To: Jason Merrill Cc: Jan Hubicka , Diego Novillo , Jan Hubicka , Xinliang David Li , Mark Mitchell , Nathan Sidwell , "H.J. Lu" , Richard Guenther , Uros Bizjak , reply@codereview.appspotmail.com, GCC Patches Content-Type: text/plain; charset=ISO-8859-1 X-System-Of-Record: true X-Gm-Message-State: ALoCoQk70jQPgkfyNTkyusReDDY+xJK1KbdJ7kCqdg4lkqzyVmM1hnxR5HFcCW8zhI9nSjNJdD2yFYqAxVeqxynw97sYhUD+aZDwSXN9BFaHmNZcCjGFjD3XKbNIpHwysWcaUC1GEnDvR77gJ8U2IjrPI+SUsh4HBw5UfB/1U2eUknnvmVzFIctSK4jGU2dUnSk0IPLks6C7 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2012-10/txt/msg02842.txt.bz2 On Tue, Oct 30, 2012 at 12:10 PM, Jason Merrill wrote: > On 10/27/2012 09:16 PM, Sriraman Tallam wrote: >> >> + /* See if there's a match. For functions that are >> multi-versioned, >> + all the versions match. */ >> if (same_type_p (target_fn_type, static_fn_type (fn))) >> - matches = tree_cons (fn, NULL_TREE, matches); >> + { >> + matches = tree_cons (fn, NULL_TREE, matches); >> + /*If versioned, push all possible versions into a vector. >> */ >> + if (DECL_FUNCTION_VERSIONED (fn)) >> + { >> + if (fn_ver_vec == NULL) >> + fn_ver_vec = VEC_alloc (tree, heap, 2); >> + VEC_safe_push (tree, heap, fn_ver_vec, fn); >> + } >> + } > > > Why do we need to keep both a list and vector of the matches? Right, but we later call the target hook get_function_versions_dispatcher which takes a vector. I could change that to accept a list instead if that is preferable? > >> + Call decls_match to make sure they are different because they are >> + versioned. */ >> + if (DECL_FUNCTION_VERSIONED (fn)) >> + { >> + for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN >> (match)) >> + if (decls_match (fn, TREE_PURPOSE (match))) >> + break; >> + } > > > What if you have multiple matches that aren't all versions of the same > function? Right, I should really check if there are versions by comparing params too. I fixed this in joust but missed out here. I will make the change so that any matches with functions that do not belong to the semantically identical group of function versions will be caught and the ambiguity will be flagged. > > Why would it be a problem to have two separate declarations of the same > function? AFAIU, this should not be a problem. For duplicate declarations, duplicate_decls should merge them and they should never be seen here. Did I miss something? > >> + dispatcher_decl = targetm.get_function_versions_dispatcher >> (fn_ver_vec); > > > Is the idea here that if you have some versions declared, then a call, then > more versions declared, then another call, you will call two different > dispatchers, No, I thought about this but I did not want to handle this case in this iteration. The dispatcher is created only once and if more functions are declared later, they will not be dispatched atleast in this iteration. > where the first one will only dispatch to the versions declared > before the first call? If not, why do we care about the set of declarations > at this point? I am taking the address of a multi-versioned function here. The front-end is returning the address of the dispatcher decl instead. Since, I am building the dispatcher here, why not construct the cgraph datastructures for these versions too? That is why I aggregate all the declarations here. > >> + /* Mark this functio to be output. */ >> + node->local.finalized = 1; > > > Missing 'n' in "function". > >> @@ -14227,7 +14260,11 @@ cxx_comdat_group (tree decl) >> else >> break; >> } >> - name = DECL_ASSEMBLER_NAME (decl); >> + if (TREE_CODE (decl) == FUNCTION_DECL >> + && DECL_FUNCTION_VERSIONED (decl)) >> + name = DECL_NAME (decl); > > > This would mean that f in the global namespace and f in namespace foo would > end up in the same comdat group. Why do we need special handling here at > all? Right, we do not need special handling. It is ok for each function version to be in its own comdat group, I will remove this. > >> dump_function_name (tree t, int flags) >> { >> - tree name = DECL_NAME (t); >> + tree name; >> >> + /* For function versions, use the assembler name as the decl name is >> + the same for all versions. */ >> + if (TREE_CODE (t) == FUNCTION_DECL >> + && DECL_FUNCTION_VERSIONED (t)) >> + name = DECL_ASSEMBLER_NAME (t); > > > This shouldn't be necessary; we should print the target attribute when > printing the function declaration. Ok. > >> + Also, mark this function as needed if it is marked inline but >> + is a multi-versioned function. */ >> + if (((flag_keep_inline_functions >> + || DECL_FUNCTION_VERSIONED (fn)) > > > This should be marked as needed by the code that builds the dispatcher. I had some trouble previously figuring out where to mark this as needed. I will fix it. > >> + /* For calls to a multi-versioned function, overload resolution >> + returns the function with the highest target priority, that is, >> + the version that will checked for dispatching first. If this >> + version is inlinable, a direct call to this version can be made >> + otherwise the call should go through the dispatcher. */ > > > I'm a bit confused why people would want both dispatched calls and > non-dispatched inlining; I would expect that if a function can be compiled > differently enough on newer hardware to make versioning worthwhile, that > would be a larger difference than the call overhead. Simple example: int foo () { return 1; } int __attribute__ ((target ("popcnt"))) foo () { return 0; } int __attribute__ ((target ("popcnt"))) bar () { return foo (); } Here, the call to foo () from bar () will be turned into a direct call to the popcnt version. Here, if bar is executed, then popcnt is supported and the call to foo from bar will be dispatched to the popcnt version even if it goes through the dispatcher and this is known at compile time. So, why not make a direct call? I am only making direct calls to versions when I am sure the dispatcher would do the same. > >> + if (DECL_FUNCTION_VERSIONED (fn) >> + && !targetm.target_option.can_inline_p (current_function_decl, fn)) >> + { >> + struct cgraph_node *dispatcher_node = NULL; >> + fn = get_function_version_dispatcher (fn); >> + if (fn == NULL) >> + return NULL; >> + dispatcher_node = cgraph_get_create_node (fn); >> + gcc_assert (dispatcher_node != NULL); >> + /* Mark this function to be output. */ >> + dispatcher_node->local.finalized = 1; >> + } > > > Why do you need to mark this here? If you generate a call to the > dispatcher, cgraph should mark it to be output automatically. dispatcher_node does not have a body until it is generated in cgraphunit.c, so cgraph does not mark this field before this is processed in cgraph_analyze_function. > >> + /* For candidates of a multi-versioned function, make the version with >> + the highest priority win. This version will be checked for >> dispatching >> + first. If this version can be inlined into the caller, the >> front-end >> + will simply make a direct call to this function. */ > > > This is still too high in joust. I believe I said before that this code > should come just above > > /* If the two function declarations represent the same function (this can > happen with declarations in multiple scopes and arg-dependent lookup), > arbitrarily choose one. But first make sure the default args we're > using match. */ Yes, I missed this the last time around. Will fix it this time. > >> + /* For multiversioned functions, aggregate all the versions here for >> + generating the dispatcher body later if necessary. Check to see if >> + the dispatcher is already generated to avoid doing this more than >> + once. */ > > > This caching seems to assume that you'll always be considering the same > group of declarations, which goes back to my earlier question. Yes, for now I want to be only considering the same group of declarations. I am assuming that all declarations/definitions of all versions of foo are seen before the first call to foo. I do not want multiple dispatcher support complexity in this iteration. Is it ok to delay this to the next patch iteration? Your earlier question on this was: "This seems to assume that all the functions in the list of candidates are versioned, but there might be unrelated functions from different namespaces. Also, doing this every time someone calls a versioned function seems like the wrong place; I would think it would be better to build up a list of versions as you seed declarations, and then use that list to define the dispatcher at EOF if it's needed." I have fixed the problem of unrelated functions by always checking the type (same_type_p) and params (comp_params) in get_function_version_dispatcher. You talked about doing the dispatcher building later, but I did it here since I am doing it only once. Thanks, -Sri. > > Jason >