From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3756 invoked by alias); 18 Apr 2006 20:52:25 -0000 Received: (qmail 3748 invoked by uid 22791); 18 Apr 2006 20:52:25 -0000 X-Spam-Check-By: sourceware.org Received: from bay113-f32.bay113.hotmail.com (HELO hotmail.com) (65.54.168.42) by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 18 Apr 2006 20:52:23 +0000 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Tue, 18 Apr 2006 13:52:21 -0700 Message-ID: Received: from 65.54.168.200 by by113fd.bay113.hotmail.msn.com with HTTP; Tue, 18 Apr 2006 20:52:18 GMT X-Sender: hillcino368@hotmail.com In-Reply-To: <17476.48141.553967.447112@zapata.pink> From: "cino hilliard" To: gcc-help@gcc.gnu.org Bcc: Subject: Re: Free Lunch Date: Tue, 18 Apr 2006 20:52:00 -0000 Mime-Version: 1.0 Content-Type: text/plain; format=flowed X-IsSubscribed: yes Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-help-owner@gcc.gnu.org X-SW-Source: 2006-04/txt/msg00176.txt.bz2 >From: Andrew Haley >To: "cino hilliard" >CC: gcc-help@gcc.gnu.org >Subject: Re: Free Lunch >Date: Tue, 18 Apr 2006 11:14:37 +0100 > > >There's a good reason for this. You can discover it by reading gcc's >documentation. See ``Deleting "empty" loops''. Ok here it is. Historically, GCC has not deleted “empty” loops under the assumption that the most likely reason you would put one in a program is to have a delay, so deleting them will not make real programs run any faster. However, the rationale here is that optimization of a nonempty loop cannot produce an empty one. This held for carefully written C compiled with less powerful optimizers but is not always the case for carefully written C++ or with more powerful optimizers. Thus GCC will remove operations from loops whenever it can determine those operations are not externally visible (apart from the time taken to execute them, of course). As GCC improves, it will remove the loop itself. Indeed, with ‘-funroll-loops’ small loops can already be removed, so leaving an empty non-unrolled loop is both suboptimal and inconsistent. Be aware of this when performing timing tests, for instance the following loop can be completely removed, provided some_expression can provably not change any global state. { int sum = 0; int ix; for (ix = 0; ix != 10000; ix++) sum += some_expression; } Now based on this, I modifed the code shown below to produce the following timing. 2000000000 Sec = 1.640625 2000000001 Sec = 1.000000 So count1 loop which has the assignment statement z=x takes 1.64 sec while count 2 which which has 3 assignment statements z=j; y+=1; s=y+1; Yet count 2 executes in .64 sec secs less time. That is the free lunch. The best I can do in just an empty loop in assembly language A386 is 2.47. Some of the time is probably the timer and output routines. On my p3 1 ghz system I get count1 4.171875 sec count2 4.187500 sec1 a small price for 2 billion loops of y+=1; s=y+1; The gcc assembly code appears to produce 2 bill loops for both functions Also if we change the type to unsigned long and use 4000000000UL, we get 4000000000 Sec = 3.234375 4000000001 Sec = 0.000000 If we relax the -O3 optimization switch, the free lunch goes BY By as we get, 4000000000 Sec = 8.125000 4000000001 Sec = 17.531250 Similarly for type long we get 2000000000 Sec = 4.031250 2000000001 Sec = 8.828125 And the free lunch goes BY By. To Optimize or not to Optimize, that is the question. Dog gone it TANSTAAFL! Have fun, Cino ***********************************code******************************** // Free Lunch //> Executing: c:\context\ConExec.exe "gc2.bat" COUNTgcc //C:\gcc\examples>f:\gcc\bin\g++ COUNTgcc.c -o COUNTgcc.exe -s -O3 -mtune=pentium4 //C:\gcc\examples>rem g++ -c -g -Wa,-a,-ad COUNTgcc.c > COUNTgcc.lst //C:\gcc\examples>g++ -S -c COUNTgcc.c //C:\gcc\examples>g++ -dumpversion //3.4.4 //> Execution finished. //System: p4 2.53 ghz xp pro #include #include #define timer GetTickCount()/1000.0 float t1,t2; long count1(long); long count2(long); int main() { long j1,j2; t1=timer; j1 = count1(2000000000); t2=timer; printf("%u\n",j1); printf("%s%f\n"," Sec = ",t2-t1); printf("\n"); t1=timer; j2 = count2(2000000000); t2=timer; printf("%u\n",j2); printf("%s%f\n"," Sec = ",t2-t1); getchar(); return 0; } long count1(long y) { long x,z; for(x=1;x<=y;x++) { z=x; } return z; } long count2(long i) { long j,y=0,s=0,z; for(j=1;j<=i;j++) { z=j; y+=1; s=y+1; } return s; }