References: <20110429025248.90D61B21AB@azwildcat.mtv.corp.google.com>
Date: Thu, 05 May 2011 17:08:00 -0000
Subject: Re: [google] Patch to support calling multi-versioned functions via new GCC builtin.
(issue4440078)
From: Xinliang David Li
To: Richard Guenther
Cc: Sriraman Tallam, reply@codereview.appspotmail.com, gcc-patches@gcc.gnu.org

On Thu, May 5, 2011 at 2:16 AM, Richard Guenther wrote:
> On Thu, May 5, 2011 at 12:19 AM, Xinliang David Li wrote:
>>>
>>> I can think of some more-or-less obvious high-level forms, one would
>>> for example simply stick a new DISPATCH tree into gimple_call_fn
>>> (similar to how we can have OBJ_TYPE_REF there), the DISPATCH
>>> tree would be of variable length, first operand the selector function
>>> and further operands function addresses.  That would keep the
>>> actual call visible (instead of a fake __builtin_dispatch call), something
>>> I'd really like to see.
>>
>> This sounds like a good long term solution.
>
> Thinking about it again maybe, similar to OBJ_TYPE_REF, have the
> selection itself lowered and only keep the set of functions as
> additional info.  Thus instead of having the selector function as
> first operand have a pointer to the selected function there (that also
> avoids too much knowledge about the return value of the selector).
> Thus,
>
>  sel = selector ();
>  switch (sel)
>   {
>   case A: fn = &bar;
>   case B: fn = &foo;
>   }
>  val = (*DISPATCH (fn, bar, foo)) (...);
>
> that way regular optimizations can apply to the selection, eventually
> discard the dispatch if fn becomes a known direct function (similar
> to devirtualization).  At expansion time the call address is simply
> taken from the first operand and an indirect call is assembled.
>
> Does the above still provide enough knowledge for the IPA path isolation?
>

I like your original proposal (extending call) better because related
information is tied together and is easier to hoist and clean up.

I want to propose a more general solution.

1) Generic annotation support for the gcc IR -- it is used to attach
application/optimization-specific annotations to gimple statements, and
the annotations can be passed around across passes. In gcc, I only see the
HISTOGRAM annotation for value profiling, which is not general enough.

2) Support of CallInfo for each callsite. This is an annotation, but
more standardized. The CallInfo can be used to record information such
as call attributes, call side effects, mod-ref information etc. ---
the current gimple_call_flags can be folded into this Info structure.

Similarly (not related to this discussion), a LoopInfo structure can be
introduced to annotate loop back edge jumps to allow the FE to pass useful
information at loop level. For floating point operations, things like
the precision constraint, sensitivity to the floating point environment etc.
can be recorded in FPInfo.

T

>>> Restricting ourselves to use the existing target attribute at the
>>> beginning (with a single, compiler-generated selector function)
>>> is probably good enough to get a prototype up and running.
>>> Extending it to arbitrary selector-function, value pairs using a
>>> new attribute is then probably easy (I don't see the exact use-case
>>> for that yet, but I suppose it exists if you say so).
>>
>> For the use cases, CPU model will be looked at instead of just the
>> core architecture -- this will give us more information about the
>> number of cores, size of caches etc. Intel's runtime library does
>> this checking at start up time so that the multi-versioned code can
>> look at those and make the appropriate decisions.
>>
>> It will be even more complicated for arm processors -- which can have
>> the same processor cores but configured differently w.r.t VFP, NEON
>> etc.
>
> Ah, indeed.
> I hadn't thought about the tuning for different variants
> as opposed to enabling HW features.  So the interface for overloading
> would be sth like
>
> enum X { Foo = 0, Bar = 5 };
>
> enum X select () { return Bar; }
>
> void foo (void) __attribute__((dispatch(select, Bar)));
>

Yes, for overloading -- something like this looks good.

Thanks,

David
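
For readers following the thread, the lowered selection quoted above can be approximated in plain C. This is only a sketch under stated assumptions: DISPATCH is a proposed tree node and the dispatch attribute is a proposed interface, so neither exists to compile against, and the code below emulates the idea with an ordinary function pointer. The names selector, bar and foo come from the quoted pseudocode; the enum values, the function bodies, current_version, select_impl and call_dispatched are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector values; the thread only calls them A and B. */
enum version { VERSION_A, VERSION_B };

typedef int (*impl_fn)(int);

/* Two stand-in versions of the same function (bodies are made up). */
static int bar(int x) { return x + 1; }
static int foo(int x) { return x * 2; }

/* Stand-in for the runtime selector; a real one would inspect the CPU. */
static enum version current_version = VERSION_A;
static enum version selector(void) { return current_version; }

/* The lowered selection: an ordinary switch yielding a function pointer,
   so regular optimizations can see and simplify it. */
static impl_fn select_impl(void)
{
    switch (selector()) {
    case VERSION_A: return bar;
    case VERSION_B: return foo;
    }
    assert(!"unknown selector value");
    return NULL;
}

/* The call site: an indirect call through the selected pointer. */
static int call_dispatched(int arg)
{
    impl_fn fn = select_impl();
    return fn(arg);
}
```

If the selector's result is known at compile time, the switch folds away and the indirect call can become a direct one, which is the devirtualization-like effect described in the quoted text. What this emulation cannot express is the extra information the proposed DISPATCH operand would carry, namely the full set of candidate functions attached to the call itself.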