From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-help-return-33936-listarch-gcc-help=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 4439 invoked by alias); 2 Sep 2008 21:21:08 -0000
Received: (qmail 4430 invoked by uid 22791); 2 Sep 2008 21:21:07 -0000
X-Spam-Check-By: sourceware.org
Received: from vms173003pub.verizon.net (HELO vms173003pub.verizon.net) (206.46.173.3)     by sourceware.org (qpsmtpd/0.31) with ESMTP; Tue, 02 Sep 2008 21:20:22 +0000
Received: from [10.10.1.168] ([209.190.166.162]) by vms173003.mailsrvcs.net  (Sun Java System Messaging Server 6.2-6.01 (built Apr  3 2006))  with ESMTPA id <0K6L00HQH791TOJ8@vms173003.mailsrvcs.net> for  gcc-help@gcc.gnu.org; Tue, 02 Sep 2008 16:19:50 -0500 (CDT)
Date: Tue, 02 Sep 2008 21:21:00 -0000
From: John Fine <johnsfine@verizon.net>
Subject: Re: "may or may not", that is the question
In-reply-to: <48BDA665.9010802@enseirb.fr>
To: David Bruant <bruant@enseirb.fr>
Cc: gcc-help@gcc.gnu.org
Message-id: <48BDADED.1060503@verizon.net>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7bit
References: <48BDA665.9010802@enseirb.fr>
User-Agent: Thunderbird 2.0.0.16 (Windows/20080708)
X-IsSubscribed: yes
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-help.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-help/>
List-Post: <mailto:gcc-help@gcc.gnu.org>
List-Help: <mailto:gcc-help-help@gcc.gnu.org>
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2008-09/txt/msg00012.txt.bz2

David Bruant wrote:
> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.
>
> And, maybe, that we (I) could write to the people that write the manual
> to add what will be said here to improve the manual, because I find the
> "may or may not" quite weak for a manual.
>
>
>   
Try a few experiments before jumping to your conclusions.  I have, and 
unrolling loops usually makes those loops slower.

It is a complicated situation and you may find that the option to unroll 
loops makes the total program faster despite making most loops slower (a 
few inner loops that took a lot of time might get faster while loops 
that took less time get slower).  But even that much is far from 
certain.  The total program might get slower.
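
One way to run such an experiment is sketched below.  This is only a
minimal sketch under stated assumptions: the array-summing workload and
the names sum_array and time_sum are hypothetical stand-ins, and a hot
loop taken from your own program would be a much better subject.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: sum an array.  Stands in for whatever loop
   you actually care about. */
long sum_array(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Run sum_array `reps` times and return the elapsed CPU time in
   seconds.  data[0] is overwritten before every call and the result
   is stored through a volatile pointer; both are there to discourage
   the compiler from computing the sum once and reusing it. */
double time_sum(int *data, size_t n, int reps, volatile long *sink)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        data[0] = r;
        *sink = sum_array(data, n);
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To use it, call time_sum from a small main that fills a large array and
prints the result, then build the same file twice, once with
"gcc -O2" and once with "gcc -O2 -funroll-loops", and compare the
timings over several runs.  A single run on a busy machine says very
little, and a result for this toy loop need not carry over to a real
program.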

Depending on details of compiler behavior that I don't know for gcc, 
there might be much stronger reasons than the following for loops to get 
slower when unrolled, but the following two are sometimes enough:

1) Modern CPUs overlap a lot of work, so all the counting and jumping 
involved in a loop might happen to be fully overlapped and free, so 
there is nothing to be saved by unrolling.

2) By unrolling, you are always giving the L1 instruction cache more 
work to do.  Depending on complex issues of the instruction mix and 
decode overheads etc. the cost of fetching all those extra instructions 
might outweigh everything else.  So after saving nothing because of 
factor (1), you then pay a lot because of factor (2).
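
To make the two factors concrete, here is a hand-written illustration
of what unrolling does to a loop.  It is a sketch only: the function
names and the unroll factor of 4 are arbitrary choices for the example,
and the code gcc actually generates with -funroll-loops will differ.

```c
#include <stddef.h>

/* Rolled: each element costs one add plus one counter increment,
   one compare and one branch.  On a CPU that overlaps work well,
   that loop overhead may already be close to free (factor 1). */
long sum_rolled(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Unrolled by 4: the loop overhead is paid once per four elements,
   but the body is larger and a second loop is needed for the
   leftover elements, so there are more instructions to fetch and
   decode (factor 2). */
long sum_unrolled4(const int *data, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; i++)      /* remainder when n is not a multiple of 4 */
        total += data[i];
    return total;
}
```

Both functions return the same sum.  Which one is faster depends on the
CPU and on the surrounding code, which is why it has to be measured.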