public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
[not found] <bug-30688-4@http.gcc.gnu.org/bugzilla/>
@ 2024-06-11 17:37 ` andi-gcc at firstfloor dot org
2024-06-13 9:24 ` xry111 at gcc dot gnu.org
1 sibling, 0 replies; 8+ messages in thread
From: andi-gcc at firstfloor dot org @ 2024-06-11 17:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688
Andi Kleen <andi-gcc at firstfloor dot org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX
--- Comment #7 from Andi Kleen <andi-gcc at firstfloor dot org> ---
ia64 is obsolete
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
[not found] <bug-30688-4@http.gcc.gnu.org/bugzilla/>
2024-06-11 17:37 ` [Bug rtl-optimization/30688] Branch registers loaded too late on ia64 andi-gcc at firstfloor dot org
@ 2024-06-13 9:24 ` xry111 at gcc dot gnu.org
1 sibling, 0 replies; 8+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-06-13 9:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688
Xi Ruoyao <xry111 at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |REOPENED
--- Comment #8 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Added the gentleman applying to be the new IA64 maintainer to the CC list and reopened.
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
` (4 preceding siblings ...)
2009-03-16 8:46 ` steven at gcc dot gnu dot org
@ 2009-03-16 19:07 ` wilson at codesourcery dot com
5 siblings, 0 replies; 8+ messages in thread
From: wilson at codesourcery dot com @ 2009-03-16 19:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from wilson at codesourcery dot com 2009-03-16 19:07 -------
Subject: Re: Branch registers loaded too late on ia64
steven at gcc dot gnu dot org wrote:
> ------- Comment #5 from steven at gcc dot gnu dot org 2009-03-16 08:46 -------
> Can someone point me to the IA64 optimization manuals mentioned in comment #0?
You can find manuals on the Intel web site. You want the Intel Itanium
2 Processor Reference Manual (For Software Development and
Optimization). Chapter 7 talks about branch instructions.
> * Which branch registers can I use?
Any one of the 8 special branch registers, class BR_REGS.
> * What does "as early as possible" mean in comment #0?
The manual says there should be several cycles between the branch
register write and the branch for correct prediction. There is probably
no "too early" to worry about, as long as you don't use more than the
available 8 registers. You want to avoid reloads here. Some of the
regs are call clobbered, some are preserved, and probably some are
reserved for call/return. I don't recall all of the ABI details. You
can look them up in the manuals. See the Itanium Software Conventions
and Runtime Architecture Guide.
> * What happens if a value is assigned to a branch register on IA64? Is the
> prefetcher always triggered? What is the latency of the prefetching after a
> branch register has been assigned a value?
This is complicated. I suggest downloading the docs and reading them.
> * Is there a possibility to add a prediction hint to say "branch register A is
> more likely to be used than branch register B" when multiple branch registers
> are assigned a value in the same basic block?
There is separate predication support for each branch register, but I
assume this is about priority for prefetching? Yes, there are branch
hints for that. See the Itanium Architecture Software Developer's
Manual, Volume 1; section 4.5 covers branch instructions. There is a
"few" completer for prefetching a few lines, and a "many" completer for
prefetching many lines. ia64.md uses "many" for call and return.
Jim
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
` (3 preceding siblings ...)
2009-03-10 6:48 ` steven at gcc dot gnu dot org
@ 2009-03-16 8:46 ` steven at gcc dot gnu dot org
2009-03-16 19:07 ` wilson at codesourcery dot com
5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-03-16 8:46 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from steven at gcc dot gnu dot org 2009-03-16 08:46 -------
Can someone point me to the IA64 optimization manuals mentioned in comment #0?
I'm looking for some answers, for example:
* Which branch registers can I use? bt-load can actually perform register
renaming. It has to, of course, because bt-load runs after the register
allocator. The register allocator prefers to always use tr0 on sh64, and it
probably always tries to use the same branch register on ia64 too. So register
renaming is a Good Thing here. But which regs can I use on IA64?
* What does "as early as possible" mean in comment #0? Are there
recommendations for what is considered "too early" (for example due to
interactions with calls and such)?
* What happens if a value is assigned to a branch register on IA64? Is the
prefetcher always triggered? What is the latency of the prefetching after a
branch register has been assigned a value?
* Is there a possibility to add a prediction hint to say "branch register A is
more likely to be used than branch register B" when multiple branch registers
are assigned a value in the same basic block?
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
` (2 preceding siblings ...)
2009-02-24 0:01 ` sje at cup dot hp dot com
@ 2009-03-10 6:48 ` steven at gcc dot gnu dot org
2009-03-16 8:46 ` steven at gcc dot gnu dot org
2009-03-16 19:07 ` wilson at codesourcery dot com
5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-03-10 6:48 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from steven at gcc dot gnu dot org 2009-03-10 06:48 -------
The load to the general register should also be moved by bt-load, then.
The bt-load pass is "designed" for SH only, in its current state, but I think
extending it to move a small group of insns instead of just one shouldn't be
very difficult.
Alternatively it could be done in ia64 machine-reorg, just before
scheduling/bundling.
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
2007-02-03 11:23 ` [Bug rtl-optimization/30688] " ak at muc dot de
2009-02-06 21:21 ` steven at gcc dot gnu dot org
@ 2009-02-24 0:01 ` sje at cup dot hp dot com
2009-03-10 6:48 ` steven at gcc dot gnu dot org
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: sje at cup dot hp dot com @ 2009-02-24 0:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from sje at cup dot hp dot com 2009-02-24 00:01 -------
More work is needed than just setting flag_branch_target_optimize{,2}. We need
to define TARGET_BRANCH_TARGET_REGISTER_CLASS (return BR_REGS) and
TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED (return 1), but even then it doesn't
seem to help. I think bt-load wants to move the load of the branch register up
but is constrained because we load it from a general register, and we load the
general register just before loading the branch register. Unless bt-load also
moves the load of the general register up, it cannot move the load of the
branch register up, and I don't think it will try to move the load of the
general register up because that is not what it is designed to do.
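
The constraint described above can be illustrated with a toy dependency model. This is only a sketch, not code from bt-load.c: the `struct insn` layout, the register labels, and the helpers `earliest_slot` and `move_insn` are all hypothetical, and the four-entry `demo` sequence is a made-up stand-in for the loop body, the general-register load, and the branch-register move.

```c
#include <assert.h>

/* Toy instruction: writes register dst and reads register src.
   Register numbers are arbitrary labels, not real ia64 registers. */
struct insn {
    const char *text;
    int dst;
    int src;
};

enum { REG_K = 1, REG_G = 2, REG_FPTR = 3, REG_B = 10 };

/* Hypothetical sequence: two loop-body insns, a general-register load,
   and the branch-register move that depends on that load. */
struct insn demo[4] = {
    { "k = k * 10", REG_K, REG_K },
    { "k = k * 10", REG_K, REG_K },
    { "g = fptr",   REG_G, REG_FPTR },
    { "b = g",      REG_B, REG_G },
};

/* Return the earliest index insns[pos] could be hoisted to without
   crossing an insn that writes its source register, or that reads or
   writes its destination register. */
int earliest_slot(const struct insn *insns, int pos)
{
    int slot = pos;

    assert(pos >= 0);
    while (slot > 0) {
        const struct insn *prev = &insns[slot - 1];

        if (prev->dst == insns[pos].src)   /* read-after-write */
            break;
        if (prev->dst == insns[pos].dst)   /* write-after-write */
            break;
        if (prev->src == insns[pos].dst)   /* write-after-read */
            break;
        slot--;
    }
    return slot;
}

/* Move insns[from] up to index to, shifting the insns in between down. */
void move_insn(struct insn *insns, int from, int to)
{
    struct insn moved = insns[from];
    int i;

    assert(to >= 0 && to <= from);
    for (i = from; i > to; i--)
        insns[i] = insns[i - 1];
    insns[to] = moved;
}
```

With `demo` as written, `earliest_slot(demo, 3)` stays at 3 because the branch-register move reads the register written immediately before it, while `earliest_slot(demo, 2)` is 0. Calling `move_insn(demo, 2, 0)` first hoists the general-register load, after which the branch-register move can rise to slot 1.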
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
2007-02-03 11:23 ` [Bug rtl-optimization/30688] " ak at muc dot de
@ 2009-02-06 21:21 ` steven at gcc dot gnu dot org
2009-02-24 0:01 ` sje at cup dot hp dot com
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-06 21:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from steven at gcc dot gnu dot org 2009-02-06 21:21 -------
GCC has the bt-load optimization for this, but this code is not enabled for
ia64. It could be as simple as just setting flag_branch_target_optimize{,2} to
true in the ia64 backend, but maybe more work is needed (I have never looked at
bt-load.c).
--
steven at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2009-02-06 21:21:19
               date|                            |
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
@ 2007-02-03 11:23 ` ak at muc dot de
2009-02-06 21:21 ` steven at gcc dot gnu dot org
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: ak at muc dot de @ 2007-02-03 11:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from ak at muc dot de 2007-02-03 11:22 -------
Here's a simple test case:
void f(int k, int (*fptr)(int i))
{
    int i;

    /* Do something useless */
    for (i = 0; i < 5; i++)
        k *= 10;
    fptr(k);
}
Compiled with GCC 4.3.0 20070203, this gives:
...
    ;;
    .mmi
        nop 0
        mov r1 = r36
        mov b0 = r34
    .mib
        nop 0
        mov ar.pfs = r35
        br.ret.sptk.many b0
Note that b0 is loaded only immediately before the branch, even though it could
have been loaded much earlier, before the loop.
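
For anyone wanting to reproduce this, the test case can be wrapped in a small harness. This is a sketch under assumptions: the callback `record`, the variable `last_seen`, and the checker `check_f` are hypothetical names that are not part of the original report. The harness only confirms what `f` computes; the late load of b0 is visible only in the assembly emitted by an ia64-targeted compiler.

```c
#include <assert.h>

/* Test case from the report, unchanged apart from indentation. */
void f(int k, int (*fptr)(int i))
{
    int i;

    /* Do something useless */
    for (i = 0; i < 5; i++)
        k *= 10;
    fptr(k);
}

/* Hypothetical callback: remembers the argument it was called with. */
int last_seen = -1;

int record(int i)
{
    last_seen = i;
    return i;
}

/* Hypothetical check: f(3, record) must pass 3 * 10^5 to the callback. */
void check_f(void)
{
    last_seen = -1;
    f(3, record);
    assert(last_seen == 300000);
}
```

Calling `check_f()` from `main` exercises the indirect call. The report does not state the compiler options used, so any optimization level chosen when generating assembly (for example `-O2 -S`) is a guess.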
Thread overview: 8+ messages
[not found] <bug-30688-4@http.gcc.gnu.org/bugzilla/>
2024-06-11 17:37 ` [Bug rtl-optimization/30688] Branch registers loaded too late on ia64 andi-gcc at firstfloor dot org
2024-06-13 9:24 ` xry111 at gcc dot gnu.org
2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
2007-02-03 11:23 ` [Bug rtl-optimization/30688] " ak at muc dot de
2009-02-06 21:21 ` steven at gcc dot gnu dot org
2009-02-24 0:01 ` sje at cup dot hp dot com
2009-03-10 6:48 ` steven at gcc dot gnu dot org
2009-03-16 8:46 ` steven at gcc dot gnu dot org
2009-03-16 19:07 ` wilson at codesourcery dot com