* ia64 projects
@ 2001-01-09 15:48 Janis Johnson
2001-01-09 16:14 ` Neil Booth
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Janis Johnson @ 2001-01-09 15:48 UTC (permalink / raw)
To: gcc
Steve Christiansen and I recently joined IBM's Linux Technology Center,
where our assignment is to improve the code that gcc generates for the
Intel IA-64 architecture (yes, we get paid for it!). We're interested
in the performance of Linux system software.
Steve and I are new to IA-64 and to gcc although we've both worked on
compilers for years, particularly for Sequent machines (ix86) running
DYNIX/ptx (sysv4). Our first tasks are to identify what changes to gcc
would have the most impact on IA-64 performance and to learn our way
around gcc.
Is there an existing wish list of projects for IA-64 code generation
and/or optimization? If there's not a formal list, are there
suggestions for what should be on such a list? We could come up with
projects on our own, but if other people have already identified what
work should be done then we would prefer to work with them. Is there
a separate forum for communication about IA-64 work in gcc?
For learning our way around gcc, we'd appreciate pointers to difficult
bugs that might provide a tour of the compiler. If we don't get any
suggestions we'll just pick out some open bugs from the database.
Another task is identifying benchmarks that will show that we are indeed
improving the code. Bench++ and the integer component of SPEC CPU2000
look worthwhile. What other benchmarks are used by gcc developers?
Janis Johnson
IBM Linux Technology Center
(503) 578-5458
janis@us.ibm.com
* Re: ia64 projects
From: Neil Booth @ 2001-01-09 16:14 UTC (permalink / raw)
To: Janis Johnson; +Cc: gcc
Janis Johnson wrote:-
> For learning our way around gcc, we'd appreciate pointers to difficult
> bugs that might provide a tour of the compiler. If we don't get any
> suggestions we'll just pick out some open bugs from the database.
You could try one of my favourites. I don't think it's easy, though I
wouldn't know for sure:-
http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view&pr=161&database=gcc
The nice thing about this is the test case is only about 5 lines :-)
There are plenty of bugs in GNATS to be fixed in general.
Neil.
* Re: ia64 projects
From: Yu Xuanwei @ 2001-01-09 19:10 UTC (permalink / raw)
To: gcc, Janis Johnson
----- Original Message -----
From: Janis Johnson <janis@us.ibm.com>
To: <gcc@gcc.gnu.org>
Sent: Wednesday, January 10, 2001 7:48 AM
Subject: ia64 projects
> Steve Christiansen and I recently joined IBM's Linux Technology Center,
> where our assignment is to improve the code that gcc generates for the
> Intel IA-64 architecture (yes, we get paid for it!). We're interested
> in the performance of Linux system software.
Here in China, Intel is pursuing a similar IA-64 compiler project and is
employing Chinese engineers for it.
>
> Steve and I are new to IA-64 and to gcc although we've both worked on
> compilers for years, particularly for Sequent machines (ix86) running
> DYNIX/ptx (sysv4). Our first tasks are to identify what changes to gcc
> would have the most impact on IA-64 performance and to learn our way
> around gcc.
>
> Is there an existing wish list of projects for IA-64 code generation
> and/or optimization? If there's not a formal list, are there
> suggestions for what should be on such a list? We could come up with
> projects on our own, but if other people have already identified what
> work should be done then we would prefer to work with them. Is there
> a separate forum for communication about IA-64 work in gcc?
>
Why not ask Intel? Intel pays more attention to Linux.
> For learning our way around gcc, we'd appreciate pointers to difficult
> bugs that might provide a tour of the compiler. If we don't get any
> suggestions we'll just pick out some open bugs from the database.
>
> Another task is identifying benchmarks that will show that we are indeed
> improving the code. Bench++ and the integer component of SPEC CPU2000
> look worthwhile. What other benchmarks are used by gcc developers?
>
GNU libprofile may be what you need, but I can't be sure.
> Janis Johnson
> IBM Linux Technology Center
> (503) 578-5458
> janis@us.ibm.com
>
>
* Re: ia64 projects
From: Jim Wilson @ 2001-01-10 15:23 UTC (permalink / raw)
To: janis; +Cc: gcc
In article < OFA91E2757.0853FAB8-ON882569CF.0082A6E5@LocalDomain > you write:
>Is there an existing wish list of projects for IA-64 code generation
>and/or optimization?
There is no formal list. There are things that we know that we don't have
that we would like to have.
We don't have control speculation or data speculation. A patch for control
speculation was written about a year ago, but I don't think anyone
ever looked at it other than the author Stan Cox. It would likely need a lot
of rewriting to be useable again. There was one aborted attempt to add data
speculation by Bernd Schmidt.
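For readers new to the two terms, here is a minimal C sketch of the reordering each kind of speculation would allow. It only mimics by hand what IA-64 provides in hardware (speculative loads with `ld.s`/`chk.s` for control speculation, advanced loads with `ld.a` and `ld.c`/`chk.a` for data speculation). All function names are hypothetical and nothing here is taken from the patches mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the load of *p sits below the branch, so it cannot
   start until the comparison has been resolved. */
int guarded_load(const int *p, int fallback)
{
    if (p != NULL)
        return *p + 1;
    return fallback;
}

/* Control speculation, mimicked by hand: the load is issued above the
   branch. Plain C has no deferred faults, so a dummy address stands in
   for what ld.s would do with a bad pointer. */
int hoisted_load(const int *p, int fallback)
{
    static const int dummy = 0;
    const int *safe = p ? p : &dummy;
    int v = *safe;              /* load issued early */
    if (p != NULL)              /* plays the role of chk.s */
        return v + 1;
    return fallback;
}

/* Original form: the load of *b must stay below the store to *a,
   because a and b might point at the same object. */
int store_then_load(int *a, const int *b)
{
    *a = 5;
    return *b * 2;
}

/* Data speculation, mimicked by hand: load first, then check whether
   the store hit the same address and redo the load if it did. */
int advanced_load(int *a, const int *b)
{
    int v = *b;                 /* plays the role of ld.a */
    *a = 5;
    if (a == b)                 /* plays the role of ld.c / chk.a */
        v = *a;                 /* recovery: reload the stored value */
    return v * 2;
}
```

In both pairs the second function returns the same value as the first for every input; the only difference is that the load no longer waits on the branch or the store.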
We don't have a good Itanium hardware description. There is an improved
pipeline description scheme that was written by Vladimir Makarov. However,
it hasn't been submitted to the FSF yet, and it is unclear when it will be,
so for now we aren't using it. When we did try to use it, we didn't get
any noticeable speedup. It was unclear whether this was due to limitations
with the scheduler, or whether there was something wrong with the Itanium
pipeline description.
We don't have software pipelining. There is code written by Vladimir Makarov
for modulo constrained software pipelining (or something like that), but I
don't know how well it works, and it has no IA-64 specific knowledge, so it
doesn't support interesting features like register rotation. This code has
not been submitted to the FSF, and requires the new pipeline description scheme
which is also not submitted to the FSF, so we can't use it.
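As a rough illustration of what software pipelining does to a loop, the C sketch below overlaps the load, multiply and store stages of neighbouring iterations by hand. It is not Vladimir Makarov's code; the function names are made up, and the sketch assumes `in` and `out` do not overlap. On IA-64, rotating registers would let the hardware rename the values carried between stages instead of reusing variables as done here.

```c
#include <assert.h>
#include <stddef.h>

/* Plain loop: each iteration loads, multiplies, then stores in sequence. */
void scale_plain(const int *in, int *out, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * k;
}

/* Hand-pipelined loop with three stages (load, multiply, store).
   In the kernel, one pass handles stage 3 of iteration i-2, stage 2 of
   iteration i-1 and stage 1 of iteration i, so the three operations
   are independent of each other and can issue together. */
void scale_pipelined(const int *in, int *out, size_t n, int k)
{
    if (n < 2) {                    /* too short to pipeline */
        scale_plain(in, out, n, k);
        return;
    }

    /* Prologue: fill the pipeline. */
    int loaded = in[0];
    int product = loaded * k;
    loaded = in[1];

    /* Kernel: steady state. */
    for (size_t i = 2; i < n; i++) {
        out[i - 2] = product;       /* store for iteration i-2 */
        product = loaded * k;       /* multiply for iteration i-1 */
        loaded = in[i];             /* load for iteration i */
    }

    /* Epilogue: drain the pipeline. */
    out[n - 2] = product;
    out[n - 1] = loaded * k;
}
```

Both functions produce the same output array; the pipelined one trades a prologue and epilogue for a kernel whose operations do not depend on each other.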
The support for predication could be improved. There is currently little or
no knowledge of predication outside the if-cvt.c file, so there are a number
of optimization passes that are suboptimal when predicated code is present.
Register allocation, for instance: I don't think it will reuse registers in
mutually exclusive code blocks, because it doesn't know about predication. The scheduler
was creating lots of false dependencies for a while, but that may have been
fixed already, I'm not sure.
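The register reuse point can be pictured with a small C sketch of if-conversion. The names are hypothetical and the code only models predication with a select; it ignores the possibility of signed overflow in the arm that is not taken. In the converted form, `t` and `u` are only ever needed under complementary predicates, so they could share one register, but an allocator that knows nothing about predicates sees both as live at the same time.

```c
#include <assert.h>

/* Branchy form: two mutually exclusive blocks. Only one of t and u
   exists on any given execution, so they can trivially share a register. */
int branchy(int c, int a, int b)
{
    int r;
    if (c) {
        int t = a + b;
        r = t * 2;
    } else {
        int u = a - b;
        r = u * 3;
    }
    return r;
}

/* If-converted form: both arms become straight-line code and the
   predicate p picks the result. On IA-64 each arm would be guarded by
   p or its complement rather than computed unconditionally. */
int if_converted(int c, int a, int b)
{
    int p = (c != 0);       /* the predicate */
    int t = a + b;          /* would execute under p */
    int u = a - b;          /* would execute under !p */
    int r_then = t * 2;
    int r_else = u * 3;
    return p ? r_then : r_else;
}
```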
That is kind of a general wish list based on interesting processor features.
So far, I haven't had any time to look at code and figure out what needs
to be improved. Just getting the toolchain working well enough, and complete
enough, for OS releases has taken almost all of my time so far.
There are ??? comments scattered through the ia64 backend, some of which
point out optimization opportunities. Most of these are small local opts
though, so we may not get much performance from them.
I suspect that we aren't generating enough ILP to really take advantage
of the architecture. So improving optimizations that create ILP might be
a good first step. Perhaps some code replication would be useful. Or perhaps
improving predication/speculation support would help. I would expect that
scheduler improvements tie into this. We probably need an aggressive cross
block scheduler to get much benefit from any of these areas. I doubt we want
to go as far as trace scheduling, but there is probably something intermediate
that would be useful.
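One simple form of code replication that creates ILP is unrolling a reduction with several independent accumulators. The sketch below is only an illustration with made-up function names, not something gcc is claimed to do here; it is safe to reassociate because integer addition is used.

```c
#include <assert.h>
#include <stddef.h>

/* Serial form: every add depends on the previous one, so at most one
   add can be in flight no matter how wide the machine is. */
long sum_serial(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Replicated form: four independent dependence chains, so four loads
   and four adds per pass can be scheduled side by side. */
long sum_unrolled(const int *v, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += v[i];

    return (s0 + s1) + (s2 + s3);
}
```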
I also suspect that the, um, idiosyncrasies of the Itanium pipeline are
causing problems. Some operations are a lot slower than one would expect.
Like dynamic shifts, which are effectively 10 cycles unless you schedule
them right or emit some nops after them in which case they are 4 cycles.
We don't get this right. There are a lot of cache pipeline flush cases
if you put stores too close together. Some coalescing of stores might be useful to
avoid this. Using alternate code sequences might be useful in some cases.
The cost of moving integer values to/from the FP registers for the multiply
(xma) instruction might make other code sequences faster. I saw a message
today pointing out that we are moving values into branch registers too late.
There are likely other things that could be improved in this area.
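A small example of an alternate code sequence for the multiply case: multiplying by a constant with shifts and adds stays in the integer unit and so avoids the round trip through the FP registers that xma needs. The sketch below uses hypothetical function names and a constant chosen only for illustration; whether this is a win on Itanium for a given constant is an assumption that would need measuring.

```c
#include <assert.h>
#include <stdint.h>

/* Generic form: left to the compiler, which on IA-64 may move x to an
   FP register, use xma, and move the result back. */
uint64_t mul10_generic(uint64_t x)
{
    return x * 10u;
}

/* Shift-and-add form: x * 10 == ((x << 2) + x) << 1. Unsigned
   arithmetic wraps, so the result matches the multiply for all x. */
uint64_t mul10_shift_add(uint64_t x)
{
    uint64_t x5 = (x << 2) + x;     /* x * 5 */
    return x5 << 1;                 /* x * 10 */
}
```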
We recently started emitting entire functions in a high level IL
instead of one statement at a time. This introduces the possibility of
adding high level optimizations such as loop transformations. We have
a little bit of dependence analysis, but it really isn't hooked into anything
yet. The C++ front end does function inlining on the high level IL, but the
C front end doesn't yet. This needs to be fixed. The C front end is still
using a low level function inliner which can't handle any complicated call
sequence, which includes HFA arguments. This means functions using the
complex FP types can't be inlined. This hurts glibc math library performance.
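The kind of code affected looks like the sketch below: a tiny helper taking complex arguments, called from a hot function. The names are hypothetical and this is not glibc code; it only shows why failing to inline such helpers is costly, since each call otherwise pays the full argument-passing sequence for very little work.

```c
#include <assert.h>
#include <complex.h>

/* Small helper with complex (HFA) arguments: an obvious candidate for
   inlining, since the body is just a multiply and an add. */
static inline double complex cmul_add(double complex a, double complex b,
                                      double complex c)
{
    return a * b + c;
}

/* Evaluates c0 + c1*z + c2*z*z by Horner's rule. Without inlining,
   this makes two calls that each pass three complex values. */
double complex horner2(double complex z, double complex c0,
                       double complex c1, double complex c2)
{
    return cmul_add(cmul_add(c2, z, c1), z, c0);
}
```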
We have a little bit of infrastructure for profile directed feedback, but
we haven't tried to use it yet, and it probably needs some maintenance work
before it will be usable again. We probably should only rely on this as
a last resort though, since it is inconvenient for many applications.
I just recently got back from a 3 week vacation, and haven't caught up on
my gcc reading yet, so some of this info might be out of date.
>Is there a separate forum for communication about IA-64 work in gcc?
The ia64-linux (formerly Trillian) group has a private mailing list for
the toolchain, but I'd rather not use it anymore, because it is private. There
is a public list linux-ia64@linuxia64.org but it mostly gets used by kernel
developers. It might be feasible to use it for some linux/gcc discussions
though. For now, the regular gcc lists are probably the best choice. If we
remember to put ia64 in the subject line, it should be possible for people
to follow just the ia64 threads if that is all they want to read.
Jim
* Re: ia64 projects
From: Bernd Schmidt @ 2001-01-11 5:16 UTC (permalink / raw)
To: Jim Wilson; +Cc: janis, gcc
On Wed, 10 Jan 2001, Jim Wilson wrote:
> In article < OFA91E2757.0853FAB8-ON882569CF.0082A6E5@LocalDomain > you write:
> >Is there an existing wish list of projects for IA-64 code generation
> >and/or optimization?
>
> We don't have a good Itanium hardware description. There is an improved
> pipeline description scheme that was written by Vladimir Makarov. However,
> it hasn't been submitted to the FSF yet, and it is unclear when it will be,
> so for now we aren't using it. When we did try to use it, we didn't get
> any noticeable speedup. It was unclear whether this was due to limitations
> with the scheduler, or whether there was something wrong with the Itanium
> pipeline description.
I think Vlad's Itanium description is actually a bit less accurate than what
we now do via MD_SCHED_REORDER. I believe it also wasn't as good at placing
stop bits.
> The scheduler
> was creating lots of false dependencies for a while, but that may have been
> fixed already, I'm not sure.
That's mostly fixed.
> We probably need an aggressive cross
> block scheduler to get much benefit from any of these areas. I doubt we want
> to go as far as trace scheduling, but there is probably something intermediate
> that would be useful.
The code I've added to schedule across extended basic blocks seems moderately
effective (maybe 8% on SPEC with the other improvements to reduce false
dependencies).
Bernd
* Re: ia64 projects
From: dewar @ 2001-01-10 15:41 UTC (permalink / raw)
To: janis, wilson; +Cc: gcc
<<I suspect that we aren't generating enough ILP to really take advantage
of the architecture. So improving optimizations that create ILP might be
a good first step. Perhaps some code replication would be useful. Or perhaps
improving predication/speculation support would help. I would expect that
>>
Conventional wisdom for these kinds of architectures would say that you
have no hope of generating enough ILP if you do not do speculation.
Trace scheduling + speculation seems really required to extract the
promise of EPIC architectures of this kind.
Sounds like there are a LOT of opportunities here. It is probably worth
taking a close look at what is going on in the Trimaran project.