public inbox for gcc-help@gcc.gnu.org
* "may or may not", that is the question
From: David Bruant @ 2008-09-02 20:48 UTC
  To: gcc-help

Hi!

In the gcc manual, p.276 (p.288 of the pdf version), the entry for
-funroll-loops reads: "This option makes code larger, and may or may not
make it run faster". My first thought is that if you unroll loops that
have a known number of iterations, you avoid a lot of jumps, and you can
replace the loop variable with several constants and consequently
optimize more. Another point is that loop control variables are often
kept in registers, so when they disappear a register is freed for
another variable, which can only improve the speed, I think.
So I would like to know some reasons why unrolling loops could make the
code slower.
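
To make the transformation concrete, here is a minimal hand-written
sketch in C. The function names are made up for illustration, and this
is not necessarily what GCC emits for -funroll-loops; it only shows the
idea of fewer branches in exchange for a larger loop body.

```c
#include <stddef.h>

/* Rolled form: one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4: one compare-and-branch per four elements,
   plus a remainder loop for lengths that are not a multiple of 4. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)   /* leftover 0..3 elements */
        s += a[i];
    return s;
}
```

Both functions return the same result; the second one simply occupies
more instruction bytes.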

And maybe we (I) could write to the people who maintain the manual and
suggest adding what is said here, because I find "may or may not" quite
weak for a manual.

Thanks

David Bruant


* Re: "may or may not", that is the question
From: Brian Dessent @ 2008-09-02 21:16 UTC
  To: David Bruant; +Cc: gcc-help

David Bruant wrote:

> In the gcc manual, p.276 (p.288 of the pdf version), we can read for

This is a meaningless number without specifying which version of the
documentation you're referring to.  (And the PDF is autogenerated from
the texinfo anyway, which is where any change would have to be made.)

> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.

Unrolling significantly increases code size, which means less space in
the cache for other things.  On most modern architectures main memory is
at least an order of magnitude slower than core execution speed, which
means cache misses result in large stalls.  Depending on the degree of
unrolling this can completely overshadow any gains from the things you
listed.  The documentation can't really be any more specific because it
depends enormously on the specific circumstances involved: the code
being compiled, the compiler options in effect, the target architecture,
the specific hardware on which it's executing, etc.

Brian


* Re: "may or may not", that is the question
From: John Fine @ 2008-09-02 21:21 UTC
  To: David Bruant; +Cc: gcc-help

David Bruant wrote:
> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.
>
> And, maybe, that we (I) could write to the people that write the manual
> to add what will be said here to improve the manual, because I find the
> "may or may not" quite weak for a manual.

Try a few experiments before jumping to your conclusions.  I have, and 
unrolling loops usually makes those loops slower.

It is a complicated situation and you may find that the option to unroll 
loops makes the total program faster despite making most loops slower (a 
few inner loops that took a lot of time might get faster while loops 
that took less time get slower).  But even that much is far from 
certain.  The total program might get slower.

Depending on details of compiler behavior that I don't know for gcc, 
there might be much stronger reasons than the following for loops to get 
slower when unrolled, but the following is sometimes enough:

1) Modern CPUs overlap a lot of work, so all the counting and jumping 
involved in a loop might happen to be fully overlapped and free, so 
there is nothing to be saved by unrolling.

2) By unrolling, you are always giving the L1 instruction cache more 
work to do.  Depending on complex issues of the instruction mix and 
decode overheads etc. the cost of fetching all those extra instructions 
might outweigh everything else.  So after saving nothing because of 
factor (1) you then pay a lot for it by factor (2).
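
One way to run such an experiment is sketched below. This is only a
minimal timing helper under assumptions: the names kernel_fn, sum_kernel
and time_kernel are hypothetical, and clock() is a coarse CPU-time
measure. The idea is to build the same file twice, once with
"gcc -O2" and once with "gcc -O2 -funroll-loops", and compare the
reported times on the hardware you care about.

```c
#include <stddef.h>
#include <time.h>

/* Signature of the loop under test (hypothetical name). */
typedef long (*kernel_fn)(const int *, size_t);

/* Example kernel to time; any function with this signature works. */
long sum_kernel(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Calls f(a, n) reps times and returns the CPU seconds consumed.
   Each result is added to *sink, a volatile object, so the stores
   cannot be dropped; an optimizer may still simplify the calls, so
   treat the numbers as indicative only. */
double time_kernel(kernel_fn f, const int *a, size_t n, int reps,
                   volatile long *sink)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        *sink += f(a, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Use an array and repetition count large enough that the measured time
is well above the clock() resolution, and repeat the runs several
times, since a single measurement is noisy.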



* Re: "may or may not", that is the question
From: Jeroen Demeyer @ 2008-09-03  7:06 UTC
  To: John Fine; +Cc: David Bruant, gcc-help

John Fine wrote:
> It is a complicated situation and you may find that the option to unroll 
> loops makes the total program faster despite making most loops slower (a 
> few inner loops that took a lot of time might get faster while loops 
> that took less time get slower).  But even that much is far from 
> certain.  The total program might get slower.

You could also consider compiling your program with 
-fprofile-generate/-fprofile-use.  That way the compiler (hopefully) 
knows which loops to unroll and which not.  I usually get a speed gain 
of several percent that way.
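
A rough sketch of that workflow, with the build steps given as comments.
The file name prog.c and the two functions are hypothetical, and a
main() that drives them with typical input is assumed but not shown.

```c
/*
 * Assumed workflow (prog.c is a made-up file name):
 *
 *   gcc -O2 -fprofile-generate prog.c -o prog   instrumented build
 *   ./prog                                      run on representative
 *                                               input; records a profile
 *   gcc -O2 -fprofile-use prog.c -o prog        rebuild using the profile
 */
#include <stddef.h>

/* Hot loop: in a typical run this would dominate the profile, so the
   compiler may judge it worth unrolling once the profile is available. */
unsigned long sum_bytes(const unsigned char *p, size_t n)
{
    unsigned long s = 0;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Cold loop: runs rarely and over few elements, so unrolling it would
   mostly just add code size. */
int count_set_flags(const int *flags, size_t n)
{
    int c = 0;
    for (size_t i = 0; i < n; i++)
        if (flags[i])
            c++;
    return c;
}
```

The profile is only as good as the training run, so the input used in
the second step should resemble real usage.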

Jeroen.


* Re: "may or may not", that is the question
From: Tom St Denis @ 2008-09-03 11:22 UTC
  To: David Bruant; +Cc: gcc-help

David Bruant wrote:
> Hi !
>
> In the gcc manual, p.276 (p.288 of the pdf version), we can read for
> -funroll-loops : "This option makes code larger, and may or may not make
> it run faster". My first idea is that if you unroll the loops that have
> a determined number of iteration, you don't have to jump a lot of time,
> you can replace a variable by several constants and consequently
> optimize more. Another thing is that variables that control loops are
> often on registers and the fact that they disappear provides another
> register for another variable what can only improve the speed, I think.
> Finally, I would like to know some reason that could make the code
> slower by unrolling loops.
>
> And, maybe, that we (I) could write to the people that write the manual
> to add what will be said here to improve the manual, because I find the
> "may or may not" quite weak for a manual.

It may be slower because you may evict things from the instruction cache 
as a result of unrolling the code. 

Tom
