* TREE_CODE mania
@ 2002-09-03 20:18 Devang Patel
2002-09-03 20:50 ` Daniel Berlin
2002-09-04 1:51 ` Neil Booth
0 siblings, 2 replies; 8+ messages in thread
From: Devang Patel @ 2002-09-03 20:18 UTC (permalink / raw)
To: gcc
Hi All,
Recently there was a discussion about faster compile times, and memory
usage and/or allocation was considered a probable culprit behind the
slow compiler.
To understand the actual memory usage pattern, I instrumented GCC by
replacing the TREE_CODE macro with a function, TREE_CODE_read().
I just wanted to see what the usage pattern is and how good or bad it is.
I collected profile data for the following one-line program.
int foo() { return 1;}
Now, gprof tells me that to compile this, cc1 calls TREE_CODE_read()
37572 times! I was expecting that number to be in the
couple-of-thousands range, but 37k seems high to me.
I think such a high number of indirect memory references puts high
pressure on the VM and on GCC's memory manager to maintain locality.
Maybe we can do simple code reorganizations, using a few extra local
variables, to reduce this pressure. Or maybe I am unnecessarily
surprised by this number.
Here is the relevant gprof data...
                0.00    0.00     694/37572      _convert <cycle 1> [219]
                0.00    0.00     826/37572      _pushdecl [211]
                0.00    0.00     927/37572      _int_const_binop [23]
                0.00    0.00    1019/37572      _round_type_align [171]
                0.00    0.00    1027/37572      _darwin_encode_section_info [194]
                0.00    0.00    1155/37572      _layout_type <cycle 3> [10]
                0.00    0.00    1306/37572      _tree_size [122]
                0.00    0.00    1499/37572      _finalize_type_size [28]
                0.00    0.00    1674/37572      _integer_zerop [156]
                0.00    0.00    2286/37572      _fold <cycle 1> [229]
                0.00    0.00    2304/37572      _make_decl_rtl [16]
                0.00    0.00    3193/37572      _integer_onep [135]
                0.00    0.00    4488/37572      _size_binop [22]
                0.00    0.00    5258/37572      _is_attribute_p [101]
                0.00    0.00    6537/37572      _force_fit_type [114]
[91]     0.0    0.00    0.00   37572            _TREE_CODE_read [91]
Now, in this data size_binop looks interesting. It is expensive:
according to gprof, it consumes 16% of the total compile time.
[22]    16.7    0.00    0.01    1496            _size_binop [22]
-Devang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-03 20:18 TREE_CODE mania Devang Patel
@ 2002-09-03 20:50 ` Daniel Berlin
2002-09-04 10:19 ` Devang Patel
2002-09-04 10:23 ` Devang Patel
2002-09-04 1:51 ` Neil Booth
1 sibling, 2 replies; 8+ messages in thread
From: Daniel Berlin @ 2002-09-03 20:50 UTC (permalink / raw)
To: Devang Patel; +Cc: gcc
On Tue, 3 Sep 2002, Devang Patel wrote:
> Hi All,
>
> Recently there was a discussion about faster compile time. And
> memory usage and/or allocation is considered one probable
> culprit behind the slow compiler.
>
> To understand actual memory usage pattern, I instrumented GCC
> by replacing TREE_CODE macro with function TREE_CODE_read().
> I just wanted to see what is the usage pattern and how bad/good is it.
>
> I collected profiled data for following one line program.
>
> int foo() { return 1;}
>
> Now, gprof tells me that to compile this, cc1 calls TREE_CODE_read()
> 37572 times! I was expecting that number to be in couple of thousands
> range but 37k seems high to me.
>
> I think, such a high number of indirect memory references puts
> high pressure on VM and GCC's memory manager to maintain locality.
Which it doesn't.
Can't we attack this problem directly?
Maybe by using object-based bins rather than size-based ones, at least
for trees and RTL?
ggc_alloc_rtx and ggc_alloc_tree are currently just defined to call
ggc_alloc, but we could change them to do that, no?
Maybe these types of objects are special enough that even though they
may have different sizes (i.e., different RTL objects have different
sizes), they should go in the same bins anyway.
Then we'd have locality for our most-used objects.
Do we really care about locality for many other things anyway?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-03 20:18 TREE_CODE mania Devang Patel
2002-09-03 20:50 ` Daniel Berlin
@ 2002-09-04 1:51 ` Neil Booth
2002-09-04 10:15 ` Devang Patel
1 sibling, 1 reply; 8+ messages in thread
From: Neil Booth @ 2002-09-04 1:51 UTC (permalink / raw)
To: Devang Patel; +Cc: gcc
Devang Patel wrote:-
> Now in this data size_binop looks interesting. It is expensive and
> according
> to gprof it is consuming 16% of total compile time.
>
> [22] 16.7 0.00 0.01 1496 _size_binop [22]
For really short runs, isn't the time bucketing fairly useless?
Or do you have reason to believe it's accurate in your case?
Neil.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-04 1:51 ` Neil Booth
@ 2002-09-04 10:15 ` Devang Patel
0 siblings, 0 replies; 8+ messages in thread
From: Devang Patel @ 2002-09-04 10:15 UTC (permalink / raw)
To: Neil Booth; +Cc: gcc
On Wednesday, September 4, 2002, at 01:50 AM, Neil Booth wrote:
> Devang Patel wrote:-
>
>> Now in this data size_binop looks interesting. It is expensive and
>> according
>> to gprof it is consuming 16% of total compile time.
>>
>> [22] 16.7 0.00 0.01 1496 _size_binop [22]
>
> For really short runs, isn't the time bucketing fairly useless?
> Or do you have reason to believe it's accurate in your case?
I collected data by compiling a large source[1], and size_binop drops
to 3.7%. It accounted for 300095 TREE_CODE references out of a total
of 5391019. But in this new example, TREE_CODE itself costs 3.2%.
-Devang
[1] For this new example, the line count of the preprocessed source
(collected using -E) is 81434.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-03 20:50 ` Daniel Berlin
@ 2002-09-04 10:19 ` Devang Patel
2002-09-04 16:04 ` Daniel Berlin
2002-09-04 10:23 ` Devang Patel
1 sibling, 1 reply; 8+ messages in thread
From: Devang Patel @ 2002-09-04 10:19 UTC (permalink / raw)
To: Daniel Berlin; +Cc: gcc
On Tuesday, September 3, 2002, at 08:51 PM, Daniel Berlin wrote:
> On Tue, 3 Sep 2002, Devang Patel wrote:
>
>> Hi All,
>>
>> Recently there was a discussion about faster compile time. And
>> memory usage and/or allocation is considered one probable
>> culprit behind the slow compiler.
>>
>> To understand actual memory usage pattern, I instrumented GCC
>> by replacing TREE_CODE macro with function TREE_CODE_read().
>> I just wanted to see what is the usage pattern and how bad/good is it.
>>
>> I collected profiled data for following one line program.
>>
>> int foo() { return 1;}
>>
>> Now, gprof tells me that to compile this, cc1 calls TREE_CODE_read()
>> 37572 times! I was expecting that number to be in couple of thousands
>> range but 37k seems high to me.
>>
>> I think, such a high number of indirect memory references puts
>> high pressure on VM and GCC's memory manager to maintain locality.
>
> Which it doesn't.
> Can't we attack this problem directly?
> By maybe using object based bins rather than size based ones, at least
> for
> trees and RTL?
Sure, we can try different allocation schemes to achieve better
compile-time performance. But that approach is like earning more money
and allocating funds better to meet the budget. I am thinking in these
terms: can we reduce expenditure? I think we need to work in both
directions to achieve a better compile-time speedup.
-Devang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-03 20:50 ` Daniel Berlin
2002-09-04 10:19 ` Devang Patel
@ 2002-09-04 10:23 ` Devang Patel
1 sibling, 0 replies; 8+ messages in thread
From: Devang Patel @ 2002-09-04 10:23 UTC (permalink / raw)
To: Daniel Berlin; +Cc: gcc
On Tuesday, September 3, 2002, at 08:51 PM, Daniel Berlin wrote:
> Do we really care about locality for many other things anyway?
I do not have an answer. But to get one, we need to actually measure
the memory usage pattern. I think that before trying different memory
allocation and memory management schemes, we need to know what the
requirements are. Using the TREE_CODE function is one of many
experiments in an attempt to answer that question.
-Devang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-04 10:19 ` Devang Patel
@ 2002-09-04 16:04 ` Daniel Berlin
2002-09-04 19:51 ` Devang Patel
0 siblings, 1 reply; 8+ messages in thread
From: Daniel Berlin @ 2002-09-04 16:04 UTC (permalink / raw)
To: Devang Patel; +Cc: gcc
On Wednesday, September 4, 2002, at 01:19 PM, Devang Patel wrote:
>
> On Tuesday, September 3, 2002, at 08:51 PM, Daniel Berlin wrote:
>
>> On Tue, 3 Sep 2002, Devang Patel wrote:
>>
>>> Hi All,
>>>
>>> Recently there was a discussion about faster compile time. And
>>> memory usage and/or allocation is considered one probable
>>> culprit behind the slow compiler.
>>>
>>> To understand actual memory usage pattern, I instrumented GCC
>>> by replacing TREE_CODE macro with function TREE_CODE_read().
>>> I just wanted to see what is the usage pattern and how bad/good is
>>> it.
>>>
>>> I collected profiled data for following one line program.
>>>
>>> int foo() { return 1;}
>>>
>>> Now, gprof tells me that to compile this, cc1 calls TREE_CODE_read()
>>> 37572 times! I was expecting that number to be in couple of thousands
>>> range but 37k seems high to me.
>>>
>>> I think, such a high number of indirect memory references puts
>>> high pressure on VM and GCC's memory manager to maintain locality.
>>
>> Which it doesn't.
>> Can't we attack this problem directly?
>> By maybe using object based bins rather than size based ones, at
>> least for
>> trees and RTL?
>
> Sure, we can try using different allocation schemes to achieve better
> compile
> time performance. But this approach is like -- earn more money and
> allocate
> funds in better way to meet the budget. I am thinking in terms, can we
> reduce
> expenditure ?
By the by, did you mark the TREE_CODE_read function as const/pure?
(I'm not sure that tree codes aren't modified in place; if they are,
it's both, and if they aren't, it's at least one of them.) That way it
accurately simulates the macro in terms of actual accesses.
> I think, we need to work in both direction to achieve better
> compile time speedup.
>
> -Devang
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: TREE_CODE mania
2002-09-04 16:04 ` Daniel Berlin
@ 2002-09-04 19:51 ` Devang Patel
0 siblings, 0 replies; 8+ messages in thread
From: Devang Patel @ 2002-09-04 19:51 UTC (permalink / raw)
To: Daniel Berlin; +Cc: gcc
On Wednesday, September 4, 2002, at 04:04 PM, Daniel Berlin wrote:
>>
>> Sure, we can try using different allocation schemes to achieve better
>> compile
>> time performance. But this approach is like -- earn more money and
>> allocate
>> funds in better way to meet the budget. I am thinking in terms, can
>> we reduce
>> expenditure ?
>
> By the by, did you mark the TREE_CODE_read function as const/pure (i'm
> not sure tree_code's aren't modified in place, if they are, it's both,
> if they aren't, it's at least one of them), so that it accurately
> simulates the macro in terms of actual accesses?
>
Well, the name TREE_CODE_read() is misleading: it records write
accesses as well as reads.
-Devang
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2002-09-05 2:51 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-03 20:18 TREE_CODE mania Devang Patel
2002-09-03 20:50 ` Daniel Berlin
2002-09-04 10:19 ` Devang Patel
2002-09-04 16:04 ` Daniel Berlin
2002-09-04 19:51 ` Devang Patel
2002-09-04 10:23 ` Devang Patel
2002-09-04 1:51 ` Neil Booth
2002-09-04 10:15 ` Devang Patel