Date: Tue, 02 Sep 2008 21:21:00 -0000
From: John Fine
Subject: Re: "may or may not", that is the question
To: David Bruant
Cc: gcc-help@gcc.gnu.org

David Bruant wrote:
> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.
>
> And, maybe, that we (I) could write to the people that write the manual
> to add what will be said here to improve the manual, because I find the
> "may or may not" quite weak for a manual.

Try a few experiments before jumping to conclusions. I have, and unrolling loops usually makes those loops slower. It is a complicated situation, and you may find that the option to unroll loops makes the total program faster despite making most loops slower (a few inner loops that took a lot of time might get faster while loops that took less time get slower).
But even that much is far from certain. The total program might get slower.

Depending on details of compiler behavior that I don't know for gcc, there might be much stronger reasons than the following for loops to get slower when unrolled, but the following is sometimes enough:

1) Modern CPUs overlap a lot of work, so all the counting and jumping involved in a loop might happen to be fully overlapped and free, so there is nothing to be saved by unrolling.

2) By unrolling, you are always giving the L1 instruction cache more work to do. Depending on complex issues of the instruction mix, decode overheads, etc., the cost of fetching all those extra instructions might outweigh everything else.

So after saving nothing because of factor (1), you then pay a lot for it because of factor (2).
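
For anyone who wants to try the kind of experiment suggested above, here is a minimal sketch in C. It is not taken from any real program: the names `sum_rolled`, `sum_unrolled4`, `time_loop` and `report_timings` are made up for illustration, and the hand-unrolled loop only approximates what a compiler's unroller might produce. It assumes a C99 compiler and uses `clock()` for coarse CPU timing, so small differences are noise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper type: any function that sums n unsigned values. */
typedef unsigned long (*sum_fn)(const unsigned *, size_t);

/* Plain loop: one add plus the count-and-branch work per element. */
unsigned long sum_rolled(const unsigned *a, size_t n)
{
    unsigned long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4: fewer branches, but more instructions to fetch. */
unsigned long sum_unrolled4(const unsigned *a, size_t n)
{
    unsigned long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements when n % 4 != 0 */
        s += a[i];
    return s;
}

/* CPU seconds spent calling f(a, n) reps times.
 * a[0] is overwritten on every repetition so the compiler cannot
 * treat the call as loop-invariant and compute it only once. */
double time_loop(sum_fn f, unsigned *a, size_t n, int reps,
                 unsigned long *sink)
{
    assert(a != NULL && n > 0 && sink != NULL);
    unsigned long acc = 0;
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        a[0] = (unsigned)r;
        acc += f(a, n);
    }
    clock_t t1 = clock();
    *sink = acc;                /* hand the result back so it stays live */
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Time both variants on the same data and print the results. */
void report_timings(unsigned *a, size_t n, int reps)
{
    unsigned long sink = 0;
    double rolled = time_loop(sum_rolled, a, n, reps, &sink);
    double unrolled = time_loop(sum_unrolled4, a, n, reps, &sink);
    printf("rolled:      %.3f s\n", rolled);
    printf("unrolled x4: %.3f s (checksum %lu)\n", unrolled, sink);
}
```

To use it, fill an array and call `report_timings` from your own `main`, then build the same file twice, for example with `gcc -std=c99 -O2` and with `gcc -std=c99 -O2 -funroll-loops`, and compare. The compiler is free to transform either loop on its own, so the numbers only mean something for your compiler version, flags and CPU, and it is worth looking at the generated assembly (`gcc -S`) before drawing conclusions.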