public inbox for gcc@gcc.gnu.org
* Does gcc automatically lower optimization level for very large routines?
@ 2019-12-19 16:37 Qing Zhao
  2019-12-19 22:51 ` Dmitry Mikushin
  2020-01-01  5:25 ` Andi Kleen
  0 siblings, 2 replies; 13+ messages in thread
From: Qing Zhao @ 2019-12-19 16:37 UTC (permalink / raw)
  To: gcc

Hi,

When using GCC to compile a very large routine at -O2, the compilation ran out of memory and failed.  (-O1 is okay.)

As I checked in gdb, when “cc1” was consuming around 95% of memory, it was at:

(gdb) where 
#0  0x0000000000ddbcb3 in df_chain_create (src=0x631006480f08, 
    dst=0x63100f306288) at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2267 
#1  0x0000000000dddd1a in df_chain_create_bb_process_use ( 
    local_rd=0x7ffc109bfaf0, use=0x63100f306288, top_flag=0) 
    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2441 
#2  0x0000000000dde5a7 in df_chain_create_bb (bb_index=16413) 
    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2490 
#3  0x0000000000ddeaa9 in df_chain_finalize (all_blocks=0x63100097ac28) 
    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2519 
#4  0x0000000000dbe95e in df_analyze_problem (dflow=0x60600027f740, 
    blocks_to_consider=0x63100097ac28, postorder=0x7f23761f1800, 
    n_blocks=40768) at ../../gcc-8.2.1-20180905/gcc/df-core.c:1179 
#5  0x0000000000dbedac in df_analyze_1 () 
….

The routine being compiled is very big: about 119258 lines of code.
I suspect that GCC’s data flow analysis does not handle very large routines well; it consumes too much memory, and therefore runs out of memory on very big routines.

As a workaround, I found one of GCC’s source-level pragmas,

#pragma GCC optimize ("O1")

and added it before the large routine (plus a matching #pragma GCC reset_options after the routine); this works around the out-of-memory issue for now.

However, manually locating large routines is time consuming. I am wondering: can GCC automatically detect large routines and lower the optimization level for those routines? Is there any internal parameter in GCC’s data flow analysis that computes the complexity of a routine and automatically turns off the aggressive analysis when the routine is very big? Or is there any option that lets the end user control the aggressive data flow analysis manually?


Thanks a lot for any help.

Qing

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-19 16:37 Does gcc automatically lower optimization level for very large routines? Qing Zhao
@ 2019-12-19 22:51 ` Dmitry Mikushin
  2019-12-19 23:07   ` Qing Zhao
  2020-01-01  5:25 ` Andi Kleen
  1 sibling, 1 reply; 13+ messages in thread
From: Dmitry Mikushin @ 2019-12-19 22:51 UTC (permalink / raw)
  To: Qing Zhao; +Cc: GCC

This issue is well known in research/scientific software. The problem of
compiler hangs or RAM over-consumption is actually not about routine
size, but about overly complicated control flow. When optimizing, the
compiler traverses the control flow graph, which may have the misfortune
of exploding in complexity. So you may want to check whether your routine
heavily uses nested cascades of "if ... else" or gotos. That is, routine
size is not a good metric for catching this behavior. GCC could instead
attempt a "reversible" optimization strategy: stop and undo the
optimizations that go beyond a certain threshold.

Kind regards,
- Dmitry.


Thu, 19 Dec 2019 at 17:38, Qing Zhao <QING.ZHAO@oracle.com>:

> Hi,
>
> When using GCC to compile a very large routine with -O2, it failed with
> out of memory during run time.  (O1 is Okay)
>
> As I checked within gdb,  when “cc1” was consuming around 95% of the
> memory,  it’s at :
>
> (gdb) where
> #0  0x0000000000ddbcb3 in df_chain_create (src=0x631006480f08,
>     dst=0x63100f306288) at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2267
> #1  0x0000000000dddd1a in df_chain_create_bb_process_use (
>     local_rd=0x7ffc109bfaf0, use=0x63100f306288, top_flag=0)
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2441
> #2  0x0000000000dde5a7 in df_chain_create_bb (bb_index=16413)
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2490
> #3  0x0000000000ddeaa9 in df_chain_finalize (all_blocks=0x63100097ac28)
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2519
> #4  0x0000000000dbe95e in df_analyze_problem (dflow=0x60600027f740,
>     blocks_to_consider=0x63100097ac28, postorder=0x7f23761f1800,
>     n_blocks=40768) at ../../gcc-8.2.1-20180905/gcc/df-core.c:1179
> #5  0x0000000000dbedac in df_analyze_1 ()
> ….
>
> The routine that was compiled is very big, has about 119258 lines of code.
> I suspected that GCC’s data flow analysis might not handle very large
> routine very well, consume too much memory, therefore out of memory for
> very big routines.
>
> Currently, I found one GCC’s source level pragma,
>
> #pragma GCC optimize ("O1”)
>
> And added it before the large routine (also added another one #pragma GCC
> reset_options after the routine), this workaround the out of memory issue
> for now.
>
> However, manually locating large routines is time consuming, I am
> wondering whether GCC can automatically detect large routines and lower the
> optimization for those
> Routines automatically? Or is there any internal parameters inside GCC’s
> data flow analysis that compute the complexity of the routine, if it’s very
> big, then will turn off
> The aggressive analysis automatically?  Or any option provided to end user
> to control the aggressive data flow manually ?
>
>
> Thanks a lot for any help.
>
> Qing


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-19 22:51 ` Dmitry Mikushin
@ 2019-12-19 23:07   ` Qing Zhao
  2019-12-20  0:41     ` Jeff Law
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2019-12-19 23:07 UTC (permalink / raw)
  To: Dmitry Mikushin; +Cc: GCC

Hi, Dmitry,

Thanks for the response. 

Yes, routine size alone cannot determine the complexity of a routine. Different compiler analyses might use different formulas, with multiple parameters, to compute its complexity. 

However, the common issue is: when the complexity of a specific routine for a specific compiler analysis exceeds a threshold, the compiler might consume all the available memory and abort the compilation. 

Therefore, to avoid failed compilations due to out of memory, some compilers set a threshold on the complexity of a specific analysis (for example, the more aggressive data flow analysis); when the threshold is exceeded, the aggressive analysis is turned off for that routine, or the optimization level is lowered for that routine (with a warning issued at compile time about the adjustment).  

I am wondering whether GCC has such a capability, or any option to increase or decrease the threshold for some of the common analyses (for example, data flow)?

Thanks.

Qing

> On Dec 19, 2019, at 4:50 PM, Dmitry Mikushin <dmitry@kernelgen.org> wrote:
> 
> This issue is well-known in research/scientific software. The problem of
> compiler hang or RAM overconsumption is actually not about the routine
> size, but about too complicated control flow. When optimizing, the compiler
> traverses the control flow graph, which may have the misfortune to explode
> in terms of complexity. So you may want to check whether your routine
> heavily deploys nested cascades of "if ... else" or goto-s. That is, the
> routine size is not a good metric to catch this behavior. GCC may rather
> attempt "reversible" strategy of optimizations to stop and undo those that
> get beyond a certain threshold.
> 
> Kind regards,
> - Dmitry.
> 
> 
> Thu, 19 Dec 2019 at 17:38, Qing Zhao <QING.ZHAO@oracle.com>:
> 
>> Hi,
>> 
>> When using GCC to compile a very large routine with -O2, it failed with
>> out of memory during run time.  (O1 is Okay)
>> 
>> As I checked within gdb,  when “cc1” was consuming around 95% of the
>> memory,  it’s at :
>> 
>> (gdb) where
>> #0  0x0000000000ddbcb3 in df_chain_create (src=0x631006480f08,
>>    dst=0x63100f306288) at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2267
>> #1  0x0000000000dddd1a in df_chain_create_bb_process_use (
>>    local_rd=0x7ffc109bfaf0, use=0x63100f306288, top_flag=0)
>>    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2441
>> #2  0x0000000000dde5a7 in df_chain_create_bb (bb_index=16413)
>>    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2490
>> #3  0x0000000000ddeaa9 in df_chain_finalize (all_blocks=0x63100097ac28)
>>    at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2519
>> #4  0x0000000000dbe95e in df_analyze_problem (dflow=0x60600027f740,
>>    blocks_to_consider=0x63100097ac28, postorder=0x7f23761f1800,
>>    n_blocks=40768) at ../../gcc-8.2.1-20180905/gcc/df-core.c:1179
>> #5  0x0000000000dbedac in df_analyze_1 ()
>> ….
>> 
>> The routine that was compiled is very big, has about 119258 lines of code.
>> I suspected that GCC’s data flow analysis might not handle very large
>> routine very well, consume too much memory, therefore out of memory for
>> very big routines.
>> 
>> Currently, I found one GCC’s source level pragma,
>> 
>> #pragma GCC optimize ("O1”)
>> 
>> And added it before the large routine (also added another one #pragma GCC
>> reset_options after the routine), this workaround the out of memory issue
>> for now.
>> 
>> However, manually locating large routines is time consuming, I am
>> wondering whether GCC can automatically detect large routines and lower the
>> optimization for those
>> Routines automatically? Or is there any internal parameters inside GCC’s
>> data flow analysis that compute the complexity of the routine, if it’s very
>> big, then will turn off
>> The aggressive analysis automatically?  Or any option provided to end user
>> to control the aggressive data flow manually ?
>> 
>> 
>> Thanks a lot for any help.
>> 
>> Qing


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-19 23:07   ` Qing Zhao
@ 2019-12-20  0:41     ` Jeff Law
  2019-12-20  1:32       ` David Edelsohn
  2019-12-20 11:13       ` Richard Biener
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Law @ 2019-12-20  0:41 UTC (permalink / raw)
  To: Qing Zhao, Dmitry Mikushin; +Cc: GCC

On Thu, 2019-12-19 at 17:06 -0600, Qing Zhao wrote:
> Hi, Dmitry,
> 
> Thanks for the responds. 
> 
> Yes, routine size only cannot determine the complexity of the routine. Different compiler analysis might have different formula with multiple parameters to compute its complexity. 
> 
> However, the common issue is: when the complexity of a specific routine for a specific compiler analysis exceeds a threshold, the compiler might consume all the available memory and abort the compilation. 
> 
> Therefore,  in order to avoid the failed compilation due to out of memory, some compilers might set a threshold for the complexity of a specific compiler analysis (for example, the more aggressive data flow analysis), when the threshold is met, the specific aggressive analysis will be turned off for this specific routine. Or the optimization level will be lowered for the specific routine (and given a warning during compilation time for such adjustment).  
> 
> I am wondering whether GCC has such capability? Or any option provided to increase or decrease the threshold for some of the common analysis (for example, data flow)?
> 
There are various places where if we hit a limit, then we throttle
optimization.  But it's not done consistently or pervasively.

Those limits are typically around things like CFG complexity.

We do _not_ try to recover after an out of memory error, or anything
like that.

jeff
> 


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20  0:41     ` Jeff Law
@ 2019-12-20  1:32       ` David Edelsohn
  2019-12-20  1:58         ` Dmitry Mikushin
  2019-12-20 11:13       ` Richard Biener
  1 sibling, 1 reply; 13+ messages in thread
From: David Edelsohn @ 2019-12-20  1:32 UTC (permalink / raw)
  To: Jeffrey Law; +Cc: Qing Zhao, Dmitry Mikushin, GCC

On Thu, Dec 19, 2019 at 7:41 PM Jeff Law <law@redhat.com> wrote:
>
> On Thu, 2019-12-19 at 17:06 -0600, Qing Zhao wrote:
> > Hi, Dmitry,
> >
> > Thanks for the responds.
> >
> > Yes, routine size only cannot determine the complexity of the routine. Different compiler analysis might have different formula with multiple parameters to compute its complexity.
> >
> > However, the common issue is: when the complexity of a specific routine for a specific compiler analysis exceeds a threshold, the compiler might consume all the available memory and abort the compilation.
> >
> > Therefore,  in order to avoid the failed compilation due to out of memory, some compilers might set a threshold for the complexity of a specific compiler analysis (for example, the more aggressive data flow analysis), when the threshold is met, the specific aggressive analysis will be turned off for this specific routine. Or the optimization level will be lowered for the specific routine (and given a warning during compilation time for such adjustment).
> >
> > I am wondering whether GCC has such capability? Or any option provided to increase or decrease the threshold for some of the common analysis (for example, data flow)?
> >
> There are various places where if we hit a limit, then we throttle
> optimization.  But it's not done consistently or pervasively.
>
> Those limits are typically around things like CFG complexity.
>
> We do _not_ try to recover after an out of memory error, or anything
> like that.

I have mentioned a few times before that the IBM XL Compiler allows the
user to specify the maximum memory utilization for the compiler
(including "unlimited").  The compiler optimization passes estimate
the memory usage of the data structures of each optimization pass.
If the estimated memory usage is too high, the pass attempts to sub-divide
the region and calculates the estimated memory usage again, recursing
until it either can apply the optimization within the memory limit or
determines that the optimization would not be effective.  The IBM XL
Compiler does not try to recover from an out-of-memory error, but it
explicitly considers the memory use of optimization passes.  It does not
adjust the complexity of the optimization, but it does adjust the size
of the region or other parameters to reduce the memory usage of the data
structures for an optimization.

Thanks, David


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20  1:32       ` David Edelsohn
@ 2019-12-20  1:58         ` Dmitry Mikushin
  2019-12-20 22:52           ` Segher Boessenkool
  0 siblings, 1 reply; 13+ messages in thread
From: Dmitry Mikushin @ 2019-12-20  1:58 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Jeffrey Law, Qing Zhao, GCC

Trying to plan memory consumption ahead of the work contradicts the nature
of graph traversal. Estimation may work very well for something simple
like linear or log-linear behavior, but many compiler algorithms are known
to be polynomial or exponential (or even worse, in the case of bugs). So
estimation is a nice step forward, but only fallback-and-recover can
ultimately solve the problem.

Fri, 20 Dec 2019 at 02:32, David Edelsohn <dje.gcc@gmail.com>:

> On Thu, Dec 19, 2019 at 7:41 PM Jeff Law <law@redhat.com> wrote:
> >
> > On Thu, 2019-12-19 at 17:06 -0600, Qing Zhao wrote:
> > > Hi, Dmitry,
> > >
> > > Thanks for the responds.
> > >
> > > Yes, routine size only cannot determine the complexity of the routine.
> Different compiler analysis might have different formula with multiple
> parameters to compute its complexity.
> > >
> > > However, the common issue is: when the complexity of a specific
> routine for a specific compiler analysis exceeds a threshold, the compiler
> might consume all the available memory and abort the compilation.
> > >
> > > Therefore,  in order to avoid the failed compilation due to out of
> memory, some compilers might set a threshold for the complexity of a
> specific compiler analysis (for example, the more aggressive data flow
> analysis), when the threshold is met, the specific aggressive analysis will
> be turned off for this specific routine. Or the optimization level will be
> lowered for the specific routine (and given a warning during compilation
> time for such adjustment).
> > >
> > > I am wondering whether GCC has such capability? Or any option provided
> to increase or decrease the threshold for some of the common analysis (for
> example, data flow)?
> > >
> > There are various places where if we hit a limit, then we throttle
> > optimization.  But it's not done consistently or pervasively.
> >
> > Those limits are typically around things like CFG complexity.
> >
> > We do _not_ try to recover after an out of memory error, or anything
> > like that.
>
> I have mentioned a few times before that IBM XL Compiler allows the
> user to specify the maximum memory utilization for the compiler
> (including "unlimited").  The compiler optimization passes estimate
> the memory usage for the data structures of each optimization pass.
> If the memory usage is too high, the pass attempts to sub-divide the
> region and calculates the estimated memory usage again, recursing
> until it can apply the optimization within the memory limit or the
> optimization would not be effective.  IBM XL Compiler does not try to
> recover from an out of memory error, but it explicitly considers
> memory use of optimization passes.  It does not adjust the complexity
> of the optimization, but it does adjust the size of the region or
> other parameters to reduce the memory usage of the data structures for
> an optimization.
>
> Thanks, David
>


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20  0:41     ` Jeff Law
  2019-12-20  1:32       ` David Edelsohn
@ 2019-12-20 11:13       ` Richard Biener
  2019-12-20 16:05         ` Qing Zhao
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Biener @ 2019-12-20 11:13 UTC (permalink / raw)
  To: law, Jeff Law, Qing Zhao, Dmitry Mikushin; +Cc: GCC

On December 20, 2019 1:41:19 AM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On Thu, 2019-12-19 at 17:06 -0600, Qing Zhao wrote:
>> Hi, Dmitry,
>> 
>> Thanks for the responds. 
>> 
>> Yes, routine size only cannot determine the complexity of the
>routine. Different compiler analysis might have different formula with
>multiple parameters to compute its complexity. 
>> 
>> However, the common issue is: when the complexity of a specific
>routine for a specific compiler analysis exceeds a threshold, the
>compiler might consume all the available memory and abort the
>compilation. 
>> 
>> Therefore,  in order to avoid the failed compilation due to out of
>memory, some compilers might set a threshold for the complexity of a
>specific compiler analysis (for example, the more aggressive data flow
>analysis), when the threshold is met, the specific aggressive analysis
>will be turned off for this specific routine. Or the optimization level
>will be lowered for the specific routine (and given a warning during
>compilation time for such adjustment).  
>> 
>> I am wondering whether GCC has such capability? Or any option
>provided to increase or decrease the threshold for some of the common
>analysis (for example, data flow)?
>> 
>There are various places where if we hit a limit, then we throttle
>optimization.  But it's not done consistently or pervasively.
>
>Those limits are typically around things like CFG complexity.

Note we also have (not consistently used) -Wdisabled-optimization, which is supposed to warn when we run into this kind of limiting, telling the user which knob they might be able to tune. 

Richard. 

>We do _not_ try to recover after an out of memory error, or anything
>like that.
>
>jeff
>> 


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20 11:13       ` Richard Biener
@ 2019-12-20 16:05         ` Qing Zhao
  2019-12-20 16:22           ` Jonathan Wakely
  0 siblings, 1 reply; 13+ messages in thread
From: Qing Zhao @ 2019-12-20 16:05 UTC (permalink / raw)
  To: Richard Biener, law, Dmitry Mikushin, David Edelsohn; +Cc: GCC

Thanks a lot for all this help.

So, currently, if a GCC compilation aborts for this reason, what is the best way for the user to resolve it? 
I added #pragma GCC optimize ("O1") to the large routine to work around this issue.  
Is there a better way to do it?

Is GCC planning to handle such issues better in the future?

thanks.

Qing

> On Dec 20, 2019, at 5:13 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> 
> On December 20, 2019 1:41:19 AM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>> On Thu, 2019-12-19 at 17:06 -0600, Qing Zhao wrote:
>>> Hi, Dmitry,
>>> 
>>> Thanks for the responds. 
>>> 
>>> Yes, routine size only cannot determine the complexity of the
>> routine. Different compiler analysis might have different formula with
>> multiple parameters to compute its complexity. 
>>> 
>>> However, the common issue is: when the complexity of a specific
>> routine for a specific compiler analysis exceeds a threshold, the
>> compiler might consume all the available memory and abort the
>> compilation. 
>>> 
>>> Therefore,  in order to avoid the failed compilation due to out of
>> memory, some compilers might set a threshold for the complexity of a
>> specific compiler analysis (for example, the more aggressive data flow
>> analysis), when the threshold is met, the specific aggressive analysis
>> will be turned off for this specific routine. Or the optimization level
>> will be lowered for the specific routine (and given a warning during
>> compilation time for such adjustment).  
>>> 
>>> I am wondering whether GCC has such capability? Or any option
>> provided to increase or decrease the threshold for some of the common
>> analysis (for example, data flow)?
>>> 
>> There are various places where if we hit a limit, then we throttle
>> optimization.  But it's not done consistently or pervasively.
>> 
>> Those limits are typically around things like CFG complexity.
> 
> Note we also have (not consistently used) -Wdisabled-optimization, which is supposed to warn when we run into this kind of limiting, telling the user which knob they might be able to tune. 
> 
> Richard. 
> 
>> We do _not_ try to recover after an out of memory error, or anything
>> like that.
>> 
>> jeff


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20 16:05         ` Qing Zhao
@ 2019-12-20 16:22           ` Jonathan Wakely
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Wakely @ 2019-12-20 16:22 UTC (permalink / raw)
  To: Qing Zhao; +Cc: Richard Biener, Jeff Law, Dmitry Mikushin, David Edelsohn, GCC

On Fri, 20 Dec 2019 at 16:05, Qing Zhao wrote:
>
> Thanks a lot for all these help.
>
> So, currently, if GCC compilation aborts due to this reason, what’s the best way for the user to resolve it?
> I added “#pragma GCC optimize (“O1”) to the large routine in order to workaround this issue.
> Is there other better way to do it?

Make your functions smaller, or get more RAM.

> Is GCC planning to resolve such issue better in the future?

Nobody has stated any such plans, so you can assume there are no such plans.


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20  1:58         ` Dmitry Mikushin
@ 2019-12-20 22:52           ` Segher Boessenkool
  2019-12-20 23:00             ` Dmitry Mikushin
  0 siblings, 1 reply; 13+ messages in thread
From: Segher Boessenkool @ 2019-12-20 22:52 UTC (permalink / raw)
  To: Dmitry Mikushin; +Cc: David Edelsohn, Jeffrey Law, Qing Zhao, GCC

On Fri, Dec 20, 2019 at 02:57:57AM +0100, Dmitry Mikushin wrote:
> Trying to plan memory consumption ahead-of-work contradicts with the nature
> of the graph traversal. Estimation may work very well for something simple
> like linear or log-linear behavior.

Almost everything we do is (almost) linear.

> But many compiler algorithms are known
> to be polynomial or exponential

Many?  There are a few (register allocation is a well-known example),
but anything more than almost linear is quite easy to make blow up.  It
is also not super hard in most cases to make things linear, it just
needs careful attention.

> (or even worse in case of bugs).

Well, sure, if there is a bug *anything* can go wrong ;-)


Segher


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-20 22:52           ` Segher Boessenkool
@ 2019-12-20 23:00             ` Dmitry Mikushin
  0 siblings, 0 replies; 13+ messages in thread
From: Dmitry Mikushin @ 2019-12-20 23:00 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: David Edelsohn, Jeffrey Law, Qing Zhao, GCC

Yes, much more. When you traverse a CFG, the analysis develops into a tree
(for example, a tree of uses). That is, every basic block can
*recursively* be the root of an individual linear iteration over up to all
basic blocks. Sum them up and you get a polynomial expression. I don't
insist that this is the best way, but it is often the way it is.

Fri, 20 Dec 2019 at 23:52, Segher Boessenkool <segher@kernel.crashing.org>:

> On Fri, Dec 20, 2019 at 02:57:57AM +0100, Dmitry Mikushin wrote:
> > Trying to plan memory consumption ahead-of-work contradicts with the
> nature
> > of the graph traversal. Estimation may work very well for something
> simple
> > like linear or log-linear behavior.
>
> Almost everything we do is (almost) linear.
>
> > But many compiler algorithms are known
> > to be polynomial or exponential
>
> Many?  There are a few (register allocation is a well-known example),
> but anything more than almost linear is quite easy to make blow up.  It
> is also not super hard in most cases to make things linear, it just
> needs careful attention.
>
> > (or even worse in case of bugs).
>
> Well, sure, if there is a bug *anything* can go wrong ;-)
>
>
> Segher
>


* Re: Does gcc automatically lower optimization level for very large routines?
  2019-12-19 16:37 Does gcc automatically lower optimization level for very large routines? Qing Zhao
  2019-12-19 22:51 ` Dmitry Mikushin
@ 2020-01-01  5:25 ` Andi Kleen
  2020-01-01 15:20   ` Segher Boessenkool
  1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2020-01-01  5:25 UTC (permalink / raw)
  To: Qing Zhao; +Cc: gcc

Qing Zhao <QING.ZHAO@ORACLE.COM> writes:

> (gdb) where 
> #0  0x0000000000ddbcb3 in df_chain_create (src=0x631006480f08, 
>     dst=0x63100f306288) at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2267 
> #1  0x0000000000dddd1a in df_chain_create_bb_process_use ( 
>     local_rd=0x7ffc109bfaf0, use=0x63100f306288, top_flag=0) 
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2441 
> #2  0x0000000000dde5a7 in df_chain_create_bb (bb_index=16413) 
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2490 
> #3  0x0000000000ddeaa9 in df_chain_finalize (all_blocks=0x63100097ac28) 
>     at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2519 
> #4  0x0000000000dbe95e in df_analyze_problem (dflow=0x60600027f740, 
>     blocks_to_consider=0x63100097ac28, postorder=0x7f23761f1800, 
>     n_blocks=40768) at ../../gcc-8.2.1-20180905/gcc/df-core.c:1179 
> #5  0x0000000000dbedac in df_analyze_1 () 

AFAIK df_analyze is used even at -O1 by various passes.  Probably the bulk of
the memory consumption is somewhere else, and it just happens to hit
there by chance.

It would be useful to figure out in more detail where the memory
consumption goes in your test case.

Unfortunately gcc doesn't have a good general heap profiler,
but here is what I usually do (if you're on Linux): whoever causes the
most page faults likely allocates the most memory.

perf record --call-graph dwarf -e page-faults gcc ...
perf report --no-children --percent-limit 5 --stdio > file.txt

and attach file.txt to a bug in Bugzilla.

-Andi


* Re: Does gcc automatically lower optimization level for very large routines?
  2020-01-01  5:25 ` Andi Kleen
@ 2020-01-01 15:20   ` Segher Boessenkool
  0 siblings, 0 replies; 13+ messages in thread
From: Segher Boessenkool @ 2020-01-01 15:20 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Qing Zhao, gcc

On Tue, Dec 31, 2019 at 09:25:01PM -0800, Andi Kleen wrote:
> Would be useful to figure out in more details where the memory
> consumption goes in your test case.
> 
> Unfortunately gcc doesn't have a good general heap profiler,
> but I usually do (if you're on Linux). Whoever causes most page
> faults likely allocates most memory.
> 
> perf record --call-graph dwarf -e page-faults gcc ...
> perf report --no-children --percent-limit 5 --stdio > file.txt
> 
> and post file.txt into a bug in bugzilla.

There is also the last column of -ftime-report (the amount of GC memory
allocated in each pass); it is often helpful.


Segher


end of thread, other threads:[~2020-01-01 15:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-19 16:37 Does gcc automatically lower optimization level for very large routines? Qing Zhao
2019-12-19 22:51 ` Dmitry Mikushin
2019-12-19 23:07   ` Qing Zhao
2019-12-20  0:41     ` Jeff Law
2019-12-20  1:32       ` David Edelsohn
2019-12-20  1:58         ` Dmitry Mikushin
2019-12-20 22:52           ` Segher Boessenkool
2019-12-20 23:00             ` Dmitry Mikushin
2019-12-20 11:13       ` Richard Biener
2019-12-20 16:05         ` Qing Zhao
2019-12-20 16:22           ` Jonathan Wakely
2020-01-01  5:25 ` Andi Kleen
2020-01-01 15:20   ` Segher Boessenkool
