* Interesting paper from Purdue
@ 2004-09-20 13:31 Steven Bosscher
2004-09-21 7:21 ` tm_gccmail
2004-09-21 16:01 ` Vladimir Makarov
0 siblings, 2 replies; 5+ messages in thread
From: Steven Bosscher @ 2004-09-20 13:31 UTC
To: gcc
I don't know if anyone has ever seen/read/mentioned this paper
before, I might have missed it. Otherwise, interesting reading:
https://engineering.purdue.edu/ECE/Research/TR/2004pdfs/TR-ECE-04-01.pdf
Gr.
Steven
* Re: Interesting paper from Purdue
From: tm_gccmail @ 2004-09-21 7:21 UTC
To: Steven Bosscher; +Cc: gcc
On Mon, 20 Sep 2004, Steven Bosscher wrote:
> I don't know if anyone has ever seen/read/mentioned this paper
> before, I might have missed it. Otherwise, interesting reading:
> https://engineering.purdue.edu/ECE/Research/TR/2004pdfs/TR-ECE-04-01.pdf
>
> Gr.
> Steven
I'll digress and rant a bit; apologies in advance.
This is just the tip of the iceberg, really. There are many other
instances where various optimizations are improved in isolation and
degrade performance because they don't consider the effects on the other
optimization passes.
For example, some of the recent work on alias analysis really worries me
because I believe this will result in a medium-term net performance
decrease on many targets.
Consider:
1. Improved alias analysis allows better disambiguation of memory
references.
2. The current scheduler is overly aggressive about hoisting loads, and
is only restrained by the inadequacy of the current alias analysis.
When alias analysis is improved, the first scheduling pass will
greatly increase register pressure.
3. The register allocator inserts code suboptimally (in particular,
restores are too early) and lacks basic features such as live-range
splitting and rematerialization.
Therefore, it exhibits increasingly bad behavior as register pressure
increases.
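A minimal sketch of the effect described in points 1-3, assuming a toy model rather than anything from GCC itself: it counts the peak number of simultaneously live values in a straight-line block under two instruction orders. The `max_live` helper, the instruction tuples, and the value names are all hypothetical and exist only for illustration.

```python
def max_live(schedule):
    """Return the peak number of simultaneously live values.

    schedule: list of (dest, sources) tuples in execution order.
    A value is live from its definition until its last use; a value
    that is never used is treated as live until the end of the block.
    """
    # Record the index of the last instruction that reads each value.
    last_use = {}
    for idx, (_dest, srcs) in enumerate(schedule):
        for s in srcs:
            last_use[s] = idx

    live = set()
    peak = 0
    for idx, (dest, srcs) in enumerate(schedule):
        # Sources whose last use is this instruction die here.
        for s in srcs:
            if last_use[s] == idx:
                live.discard(s)
        live.add(dest)
        peak = max(peak, len(live))
    return peak


# Loads placed just before their use (what weak alias analysis forces).
interleaved = [
    ("a", []), ("b", []), ("t1", ["a", "b"]),
    ("c", []), ("t2", ["t1", "c"]),
    ("d", []), ("t3", ["t2", "d"]),
]

# All loads hoisted to the top (what better disambiguation permits).
hoisted = [
    ("a", []), ("b", []), ("c", []), ("d", []),
    ("t1", ["a", "b"]), ("t2", ["t1", "c"]), ("t3", ["t2", "d"]),
]

print(max_live(interleaved), max_live(hoisted))
```

In this toy block the hoisted order needs twice as many values live at once as the interleaved order, which is the pressure increase the list above is worried about.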
I think the following will occur:
1. Targets with the first instruction scheduling pass enabled will exhibit
a net decrease in performance due to increased register pressure. This
will be exacerbated if the target has fewer registers (e.g. slightly
worse on IA64, much worse on PPC). The SH is unlikely to be affected
due to scheduler modifications already implemented.
2. Targets without the first scheduling pass enabled will exhibit a net
decrease in performance only if the register set is very small
(fewer than 16 registers). This includes the x86 and most embedded
processors such as the H8/300, M68HC11, 8051, etc.
As I see it, the register allocator and the instruction scheduler are
really the base of the foundations for GCC optimization.
We keep adding improvements which:
1. Allow more intermediate values to be kept in registers, which increases
register pressure
2. Allow memory to be retained in registers longer, which increases
register pressure
3. Create larger basic blocks, which increases register pressure
4. Allow more loop unrolling, which increases register pressure
5. etc
...and the register allocator doesn't handle the increased register
pressure well, so the net result is very little improvement.
We really should spend some time improving the foundation of GCC instead of
piling more and more optimizations on top of it.
Toshi
* Re: Interesting paper from Purdue
From: Vladimir Makarov @ 2004-09-21 16:01 UTC
To: Steven Bosscher; +Cc: gcc
Steven Bosscher wrote:
>I don't know if anyone has ever seen/read/mentioned this paper
>before, I might have missed it. Otherwise, interesting reading:
>https://engineering.purdue.edu/ECE/Research/TR/2004pdfs/TR-ECE-04-01.pdf
>
>
>
The most interesting thing about the article is that they spent a lot of
machine time (which I do not have at my disposal) investigating individual
options to get better SPECInt2000 results.
But I see they used a black-box approach because they don't know gcc
internals at all (they tried -fschedule-insns for p4, which does nothing;
they also did not use -mtune=pentium4, etc.).
Their most complex algorithm (the 3rd algorithm) for choosing a better
option combination is just an oversimplified taboo search algorithm (with
a list of taboo moves which never expire). I think that an algorithm based
on the taboo metaheuristic would achieve better results for the same number
of tries. Imho the taboo algorithm is the best-fit approach for solving
this task (the genetic approach used by Scott Ladd or a more random
simulated annealing approach would work much worse, in my opinion).
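For readers unfamiliar with the metaheuristic, here is a minimal sketch of a tabu (taboo) search over on/off option settings in which tabu moves do expire. It is not the paper's algorithm and not anything from GCC. The `tabu_search` function, the `tenure` parameter, and the `toy_measure` cost function are hypothetical; `toy_measure` stands in for an actual compile-and-run cycle.

```python
from collections import deque


def tabu_search(n_options, measure, tenure=3, max_iters=20):
    """Minimise measure(flags) over on/off settings of n_options flags.

    measure is a caller-supplied function returning a run time
    (lower is better).
    """
    current = tuple([True] * n_options)    # start with every option on
    best, best_cost = current, measure(current)
    tabu = deque(maxlen=tenure)            # old moves expire automatically
    for _ in range(max_iters):
        candidates = []
        for i in range(n_options):
            # A move flips exactly one option.
            neighbour = current[:i] + (not current[i],) + current[i + 1:]
            cost = measure(neighbour)
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if i not in tabu or cost < best_cost:
                candidates.append((cost, i, neighbour))
        if not candidates:
            break
        # Take the best admissible move even if it is worse than current.
        cost, i, current = min(candidates)
        tabu.append(i)
        if cost < best_cost:
            best, best_cost = current, cost
    return best, best_cost


def toy_measure(flags):
    """Hypothetical run time: options interact instead of adding up."""
    t = 100.0
    if flags[1]:
        t += 10            # option 1 hurts on its own
    if flags[0] and flags[2]:
        t -= 5             # options 0 and 2 only help together
    if flags[3]:
        t += 2             # option 3 hurts slightly
    return t


best, best_cost = tabu_search(4, toy_measure)
print(best, best_cost)
```

The tabu list is what lets the search accept a worse neighbour without immediately undoing the move, and the fixed `tenure` is what allows a forbidden move to become available again later.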
In any case, the approach is not practical (by my estimate it needs
about 15 hours to choose options with the 3rd algorithm for one
SPECInt2000 test -- three 3-minute runs, 20 options, 4 iterations, as
they reported). Although it could be used to get a better (peak)
SPECInt2000 report.
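As a rough check, the figures quoted above can be multiplied out. Taken literally they give the pure benchmark run time only. The gap between that product and the quoted 15 hours is presumably compilation time and baseline runs, but that is an assumption here, not something stated in the message. The variable names are illustrative.

```python
runs_per_measurement = 3   # "three ... runs"
minutes_per_run = 3        # "3-minute"
options = 20               # "20 options"
iterations = 4             # "4 iterations"

run_minutes = runs_per_measurement * minutes_per_run * options * iterations
run_hours = run_minutes / 60
print(run_minutes, run_hours)  # 720 12.0
```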
Vlad
* Re: Interesting paper from Purdue
From: Vladimir Makarov @ 2004-09-21 17:59 UTC
To: tm_gccmail; +Cc: Steven Bosscher, gcc
tm_gccmail@kloo.net wrote:
>On Mon, 20 Sep 2004, Steven Bosscher wrote:
>
>
>
>>I don't know if anyone has ever seen/read/mentioned this paper
>>before, I might have missed it. Otherwise, interesting reading:
>>https://engineering.purdue.edu/ECE/Research/TR/2004pdfs/TR-ECE-04-01.pdf
>>
>>Gr.
>>Steven
>>
>>
>
>I'll digress and rant a bit; apologies in advance.
>
>This is just the tip of the iceberg, really. There are many other
>instances where various optimizations are improved in isolation and
>degrade performance because they don't consider the effects on the other
>optimization passes.
>
>
>
I think the approach mentioned in the article has merit for any compiler.
Any optimizing compiler is a bunch of passes because of the complexity of
the task. Many passes try to solve a subproblem without taking other passes
into account. This creates unpredictable compiler behaviour for a given
program when an optimization is on or off.
>...and the register allocator doesn't handle the increased register
>pressure well, so the net result is very little improvement.
>
>We really should spend some time improving the foundation of GCC instead of
>piling more and more optimizations on top of it.
>
>
>
I agree with this. Gcc probably is a compiler with very unpredictable
behaviour because of an inadequate register allocator/scheduler. But writing
a good register allocator is not an easy task in gcc because of the very
rich register file model and the many machine-dependent macros used by gcc
ports. The colour-based register allocator project is an example of
this. I know about this because I worked on register allocator
improvements and I am still working on it. I think the key component is
the reload pass. Tasks solved by reload should be combined with the
register allocator. We should get rid of reload. But it is an enormous task.
Vlad
* Re: Interesting paper from Purdue
From: Daniel Berlin @ 2004-09-21 18:39 UTC
To: Vladimir Makarov; +Cc: gcc, tm_gccmail, Steven Bosscher
On Sep 21, 2004, at 12:01 PM, Vladimir Makarov wrote:
> tm_gccmail@kloo.net wrote:
>
>> On Mon, 20 Sep 2004, Steven Bosscher wrote:
>>
>>
>>> I don't know if anyone has ever seen/read/mentioned this paper
>>> before, I might have missed it. Otherwise, interesting reading:
>>> https://engineering.purdue.edu/ECE/Research/TR/2004pdfs/TR-ECE-04-01.pdf
>>>
>>> Gr.
>>> Steven
>>>
>>
>> I'll digress and rant a bit; apologies in advance.
>>
>> This is just the tip of the iceberg, really. There are many other
>> instances where various optimizations are improved in isolation and
>> degrade performance because they don't consider the effects on the
>> other optimization passes.
>>
>>
> I think the approach mentioned in the article has merit for any
> compiler. Any optimizing compiler is a bunch of passes because of the
> complexity of the task. Many passes try to solve a subproblem without
> taking other passes into account. This creates unpredictable compiler
> behaviour for a given program when an optimization is on or off.
>
>> ...and the register allocator doesn't handle the increased register
>> pressure well, so the net result is very little improvement.
>>
>> We really should spend some time improving the foundation of GCC instead of
>> piling more and more optimizations on top of it.
>>
>>
> I agree with this. Gcc probably is a compiler with very unpredictable
> behaviour because of an inadequate register allocator/scheduler. But
> writing a good register allocator is not an easy task in gcc because of
> the very rich register file model and the many machine-dependent macros
> used by gcc ports.
We have a register file model so rich that no single architecture is
described well enough to get good results.
There is something ironic about this.
:)
> The colour-based register allocator project is an example of this. I
> know about this because I worked on register allocator improvements
> and I am still working on it. I think the key component is the reload
> pass. Tasks solved by reload should be combined with the register
> allocator. We should get rid of reload. But it is an enormous task.
Yes, which is one of a myriad of reasons new-ra never succeeded.
The goals were too ambitious.
Getting rid of reload is a project all by itself.
>
> Vlad
>
>