From: "Hite, Christopher"
To: "gcc@gcc.gnu.org"
CC: Richard Guenther, 'Piotr Rak'
Date: Mon, 17 Oct 2011 16:35:00 -0000
Subject: RE: optimization question: mpl

I'm sorry I haven't gotten around to chasing this sooner. I needed to learn
how to use objdump and read the assembler. The code now lets you define the
depth.

It generates some pretty crappy code even with gcc-4.6.0 at -O3. With a
20-deep template the compiler only inlines the recursion about 6 levels at a
time; at depth 50 it only does 2 levels at a time.

Someone mentioned taking control of this myself and implementing array
lookups. Sure, I could also do binary searches and hash tables when the id
numbers aren't contiguous. The problem with doing these things is that it
takes control away from the compiler's optimizer. Theoretically an optimizer
should know best when to go from if/elses to binary lookups to hash tables. I
could do this kind of stuff at the metaprogramming level, but that could work
against a better optimizer in the future. It reminds me of all the "if x==0"
you see in old Fortran code, which made things run faster on old machines
where branching was cheap.

Theoretically I might know which ids are most probable, but profile-feedback
optimization could be used for this.

Would gcc join switch/case statements?

switch(id) { case 1: foo(); break; }
switch(id) { case 2: bar(); break; }

Another idea to revive the switch/case is to implement a utility with N
switch cases, one for each size. I could maybe use self-including headers or
generate such code.
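For illustration, here is roughly what such a generated utility would expand
to at depth 3, written by hand against the M1/M2/M3 and DecodeContext types
from the improved code further down (just a sketch, the name foo_switch3 is
made up, and I haven't benchmarked it):

// Hand-written sketch of what a generated 3-case dispatcher would look like.
// Relies on M1/M2/M3 and DecodeContext as defined in the improved code below.
inline void foo_switch3(int id, DecodeContext& dc) {
    switch (id) {
        case M1::id: M1::decode(dc); break;   // 13
        case M2::id: M2::decode(dc); break;   // 17
        case M3::id: M3::decode(dc); break;   // 19
        default: break;                       // unknown id: do nothing
    }
}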
By the way, this kind of code exists: Boost::variant::apply_visitor uses
repeated switch/case statements. Changing foo() below to a switch/case doesn't
seem to help.

Chris

------------------ improved code

#include <iostream>
#include <cstdlib>

struct DecodeContext;

struct M1{
        static const int id=13;
        static void decode(DecodeContext& ){ std::cout<<"1\n";}
};

struct M2{
        static const int id=17;
        static void decode(DecodeContext& ){ std::cout<<"2\n";}
};

struct M3{
        static const int id=19;
        static void decode(DecodeContext& ){ std::cout<<"3\n";}
};

template<int N> struct M{
        static const int id=N*3+1024;
        static void decode(DecodeContext& ){ std::cout<<N<<"\n";}
};

template<int N> struct ListMember;

template <> struct ListMember<1>{ typedef M1 type; };
template <> struct ListMember<2>{ typedef M2 type; };
template <> struct ListMember<3>{ typedef M3 type; };
template<int N> struct ListMember { typedef M<N> type; };

template<int N>
void foo(int id, DecodeContext& dc) {
        typedef typename ListMember<N>::type M;
        if(M::id==id)
                M::decode(dc);
        else
                foo<N-1>(id,dc);
}

template<>
void foo<0>(int id, DecodeContext& dc) {}

int main(){
        DecodeContext& dc= *(DecodeContext*)0; // junk for now
        int id=std::rand();                    // runtime dependent
        foo<50>(id,dc);
        return 0;
}
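One thing that might be worth trying on top of this is the flatten attribute
Richard suggests in the quoted thread below: put it on a top-level wrapper so
gcc inlines the whole foo<50> chain into one function instead of stopping
every few levels. Whether that beats the default heuristics, and whether
inlining all the decode() bodies is acceptable, is something I'd have to
measure; the wrapper name is made up:

// Untested: ask gcc to inline every call inside this function, which should
// flatten the recursive foo<50> chain (and the M*::decode bodies) into it.
void __attribute__((flatten)) dispatch_all(int id, DecodeContext& dc) {
    foo<50>(id, dc);
}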
-----Original Message-----
From: Piotr Rak [mailto:piotr.rak@gmail.com]
Sent: Wednesday, July 28, 2010 8:17 PM
To: Hite, Christopher
Cc: Richard Guenther; gcc@gcc.gnu.org
Subject: Re: optimization question: mpl

Hi,

2010/7/28 Hite, Christopher:
>> Generally without knowing the compiler version you are using it is
>> hard to tell.
> I'll use whatever's best.  Right now I'm still on 4.4.3.  I'll
> probably upgrade soon to 4.5.
>
>> The same is true without a complete compilable testcase.
> I didn't want to make a test case that depends on boost::mpl. How's
> this:
>
> struct DecodeContext;
>
> struct M1{
>         static const int id=1;
>         static void decode(DecodeContext& ){} };
>
> struct M2{
>         static const int id=2;
>         static void decode(DecodeContext& ){} };
>
> struct M3{
>         static const int id=3;
>         static void decode(DecodeContext& ){} };
>
>
> template<int N> struct ListMember;
>
> template <> struct ListMember<1>{ typedef M1 type; };
> template <> struct ListMember<2>{ typedef M2 type; };
> template <> struct ListMember<3>{ typedef M3 type; };
>
> template<int N>
> void foo(int id, DecodeContext& dc) {
>         typedef typename ListMember<N>::type M;
>         if(M::id==id)
>                 M::decode(dc);
>         else
>                 foo<N-1>(id,dc);
> }
>
> template<>
> void foo<0>(int id, DecodeContext& dc) {}
>
>
> int main(){
>         DecodeContext& dc= *(DecodeContext*)0; // junk for now
>         int id=2; // sometimes runtime dependent
>         foo<3>(id,dc);
>         return 0;
> }
>
>> You can use the flatten attribute to tell the compiler to inline all
>> calls in a given function, like
>>
>> void __attribute__((flatten)) foo(void)
>> {
>>   ...
>>   decode1();
>>   ...
>> }
>
> That would cause decode1() to be inlined, which might not be what you
> want.
>
> Hmm maybe I could rewrite things so the switch case returns a function
> pointer.  I'm guessing that would make things slower though.

Or you could just initialize a static array of pointers, like this (please
note that I never compiled the code below):

typedef void (*pfn)(DecodeContext&);

template<int I>
struct InitDispatchHelper
{
  static void init(std::tr1::array<pfn, X>& a)
  {
    a[I-1] = &ListMember<I>::type::decode;
    InitDispatchHelper<I-1>::init(a);
  }
};

// End recursion
template<>
struct InitDispatchHelper<0>
{
  static void init(std::tr1::array<pfn, X>& ) { /* does nothing */ }
};

// Dispatch table
template<int N>
struct DispatchFoo : std::tr1::array<pfn, N>
{
  DispatchFoo()
    : std::tr1::array<pfn, N>()
  {
    InitDispatchHelper<N>::init(*this);
  }
};

then your call would look like:

static const int X = /* your maximal I in the ListMember meta-class */;

void call_foo(int i, DecodeContext& c)
{
  static const DispatchFoo<X> foo_dispatch;
  foo_dispatch[i](c);
}

Bonus points for benchmarking both solutions and making a policy class that
chooses between those two solutions depending on X.

Good luck,
Piotr
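For reference, a policy class along the lines Piotr suggests could look
roughly like this (untested sketch; Dispatcher is a made-up name, the cutoff
of 8 is an arbitrary placeholder that would have to come from benchmarking,
and it assumes the foo<N> and DispatchFoo templates above with ids numbered
1..X):

// Untested sketch: pick a dispatch strategy at compile time based on X.
// Small lists fall back to the recursive foo<X>; large ones use the table.
template<int X, bool UseTable = (X > 8)>    // 8 is an arbitrary cutoff
struct Dispatcher {
    static void call(int id, DecodeContext& dc) { foo<X>(id, dc); }
};

template<int X>
struct Dispatcher<X, true> {
    static void call(int id, DecodeContext& dc) {
        static const DispatchFoo<X> table;  // built once, on first use
        table[id - 1](dc);                  // assumes ids are 1..X, as in Piotr's sketch
    }
};

Callers would then write Dispatcher<50>::call(id, dc) instead of
foo<50>(id, dc) and let the policy decide.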