public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
       [not found] <bug-30688-4@http.gcc.gnu.org/bugzilla/>
@ 2024-06-11 17:37 ` andi-gcc at firstfloor dot org
  2024-06-13  9:24 ` xry111 at gcc dot gnu.org
  1 sibling, 0 replies; 8+ messages in thread
From: andi-gcc at firstfloor dot org @ 2024-06-11 17:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #7 from Andi Kleen <andi-gcc at firstfloor dot org> ---
ia64 is obsolete

* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
       [not found] <bug-30688-4@http.gcc.gnu.org/bugzilla/>
  2024-06-11 17:37 ` [Bug rtl-optimization/30688] Branch registers loaded too late on ia64 andi-gcc at firstfloor dot org
@ 2024-06-13  9:24 ` xry111 at gcc dot gnu.org
  1 sibling, 0 replies; 8+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-06-13  9:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30688

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|WONTFIX                     |---
             Status|RESOLVED                    |REOPENED

--- Comment #8 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Added the gentleman applying to be the new IA64 maintainer to the CC list and reopened.

* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
                   ` (4 preceding siblings ...)
  2009-03-16  8:46 ` steven at gcc dot gnu dot org
@ 2009-03-16 19:07 ` wilson at codesourcery dot com
  5 siblings, 0 replies; 8+ messages in thread
From: wilson at codesourcery dot com @ 2009-03-16 19:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from wilson at codesourcery dot com  2009-03-16 19:07 -------
Subject: Re: Branch registers loaded too late on ia64

steven at gcc dot gnu dot org wrote:
> ------- Comment #5 from steven at gcc dot gnu dot org  2009-03-16 08:46 -------
> Can someone point me to the IA64 optimization manuals mentioned in comment #0?

You can find manuals on the Intel web site.  You want the Intel Itanium 
2 Processor Reference Manual (For Software Development and 
Optimization).  Chapter 7 talks about branch instructions.

> * Which branch registers can I use?

Any one of the 8 special branch registers, class BR_REGS.

> * What does "as early as possible" mean in comment #0?

The manual says there should be several cycles between the branch 
register write and the branch for correct prediction.  There is probably 
no "too early" to worry about, as long as you don't use more than the 
available 8 registers.  You want to avoid reloads here.  Some of the 
regs are call clobbered, some are preserved, and probably some are 
reserved for call/return.  I don't recall all of the ABI details.  You 
can look them up in the manuals.  See the Itanium Software Conventions 
and Runtime Architecture Guide.

> * What happens if a value is assigned to a branch register on IA64?  Is the
> prefetcher always triggered?  What is the latency of the prefetching after a
> branch register has been assigned a value?

This is complicated.  I suggest downloading the docs and reading them.

> * Is there a possibility to add a prediction hint to say "branch register A is
> more likely to be used than branch register B" when multiple branch registers
> are assigned a value in the same basic block?

There is separate predication support for each branch register, but I 
assume this is about priority for prefetching?  Yes, there are branch 
hints for that.  See section 4.5 of the Itanium Architecture Software 
Developer's Manual, Volume 1, which covers branch instructions.  There 
is a "few" completer for prefetching a few lines, and a "many" completer 
for prefetching many lines.  ia64.md uses "many" for call and return.

Jim



* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
                   ` (3 preceding siblings ...)
  2009-03-10  6:48 ` steven at gcc dot gnu dot org
@ 2009-03-16  8:46 ` steven at gcc dot gnu dot org
  2009-03-16 19:07 ` wilson at codesourcery dot com
  5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-03-16  8:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from steven at gcc dot gnu dot org  2009-03-16 08:46 -------
Can someone point me to the IA64 optimization manuals mentioned in comment #0?

I'm looking for some answers, for example:

* Which branch registers can I use? bt-load can actually perform register
renaming.  It has to, of course, because bt-load runs after the register
allocator. The register allocator prefers to always use tr0 on sh64, and it
probably always tries to use the same branch register on ia64 too.  So register
renaming is a Good Thing here.  But which regs can I use on IA64?

* What does "as early as possible" mean in comment #0?  Are there
recommendations for what is considered "too early" (for example due to
interactions with calls and such)?

* What happens if a value is assigned to a branch register on IA64?  Is the
prefetcher always triggered?  What is the latency of the prefetching after a
branch register has been assigned a value?

* Is there a possibility to add a prediction hint to say "branch register A is
more likely to be used than branch register B" when multiple branch registers
are assigned a value in the same basic block?



* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
                   ` (2 preceding siblings ...)
  2009-02-24  0:01 ` sje at cup dot hp dot com
@ 2009-03-10  6:48 ` steven at gcc dot gnu dot org
  2009-03-16  8:46 ` steven at gcc dot gnu dot org
  2009-03-16 19:07 ` wilson at codesourcery dot com
  5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-03-10  6:48 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from steven at gcc dot gnu dot org  2009-03-10 06:48 -------
The load to the general register should also be moved by bt-load, then.

The bt-load pass is "designed" for SH only, in its current state, but I think
extending it to move a small group of insns instead of just one shouldn't be
very difficult.

Alternatively it could be done in ia64 machine-reorg, just before
scheduling/bundling.
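
To make the constraint concrete, here is a toy model of why hoisting a single insn is blocked while hoisting a small group is not. This is only a sketch and is not how bt-load is implemented: `struct insn`, the register ids, `conflicts`, and `earliest_slot_group` are all hypothetical names invented for the illustration, and the block is a simplified stand-in for the test case in comment #1.

```c
#include <assert.h>

#define NONE (-1)

/* Hypothetical register ids for the toy block below. */
enum { R_K, R_FPTR, R_TMP, R_B0 };

/* Toy insn of the form "dst = op(src)"; this is not GCC's RTL. */
struct insn {
    const char *text;
    int dst;
    int src;
};

/* Nonzero if MOVED may not be hoisted above EARLIER. */
int conflicts(const struct insn *earlier, const struct insn *moved)
{
    if (moved->src != NONE && earlier->dst == moved->src)
        return 1;   /* EARLIER writes a register that MOVED reads */
    if (moved->dst != NONE
        && (earlier->dst == moved->dst || earlier->src == moved->dst))
        return 1;   /* EARLIER reads or writes the register MOVED writes */
    return 0;
}

/* Lowest index to which the contiguous group insns[first..last]
   can be hoisted as a unit within the block. */
int earliest_slot_group(const struct insn *insns, int first, int last)
{
    int i, m;

    for (i = first - 1; i >= 0; i--)
        for (m = first; m <= last; m++)
            if (conflicts(&insns[i], &insns[m]))
                return i + 1;
    return 0;
}

/* Two loop-body stand-ins, then the general register load,
   then the branch register load that depends on it. */
const struct insn block[] = {
    { "k = k * 10", R_K,   R_K    },
    { "k = k * 10", R_K,   R_K    },
    { "tmp = fptr", R_TMP, R_FPTR },
    { "b0 = tmp",   R_B0,  R_TMP  },
};

void self_check(void)
{
    /* Alone, the branch register load is stuck behind "tmp = fptr". */
    assert(earliest_slot_group(block, 3, 3) == 3);
    /* Moved together with its feeding load, it can reach the top. */
    assert(earliest_slot_group(block, 2, 3) == 0);
}
```

In this model the branch register load on its own cannot move at all, whereas the two-insn group can be hoisted to the start of the block, which is the behaviour the proposed extension is after.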



* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
  2007-02-03 11:23 ` [Bug rtl-optimization/30688] " ak at muc dot de
  2009-02-06 21:21 ` steven at gcc dot gnu dot org
@ 2009-02-24  0:01 ` sje at cup dot hp dot com
  2009-03-10  6:48 ` steven at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: sje at cup dot hp dot com @ 2009-02-24  0:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from sje at cup dot hp dot com  2009-02-24 00:01 -------
More work is needed than just setting flag_branch_target_optimize{,2}: we need
to define TARGET_BRANCH_TARGET_REGISTER_CLASS (return BR_REGS) and
TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED (return 1), but even then it doesn't
seem to help.  I think bt-load wants to move the load of the branch register up
but is constrained because we load it from a general register, and we load the
general register just before loading the branch register.  Unless bt-load also
moves the load of the general register up, it cannot move the load of the
branch register up, and I don't think it will try to move the load of the
general register, because that is not what it is designed to do.
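
For reference, the two hook definitions described above might look roughly like the sketch below. It is written to compile on its own, so `enum reg_class` is a cut-down stand-in for the real ia64 one, the `ia64_*` function names are hypothetical, and the signatures are an assumption modeled on how the SH backend defined these hooks at the time. Check the target hook documentation before relying on them.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-in for the backend's enum reg_class (assumption). */
enum reg_class { NO_REGS, GR_REGS, BR_REGS, ALL_REGS };

/* Hypothetical hook: bt-load may use any register in BR_REGS. */
int ia64_branch_target_register_class(void)
{
    return BR_REGS;
}

/* Hypothetical hook: allow callee-saved branch registers too
   (the "return 1" mentioned above). */
bool ia64_branch_target_register_callee_saved(bool after_prologue_epilogue_gen)
{
    (void) after_prologue_epilogue_gen;  /* unused in this sketch */
    return true;
}

/* In the backend these would be wired into the target vector
   (assumed form, following other backends). */
#undef  TARGET_BRANCH_TARGET_REGISTER_CLASS
#define TARGET_BRANCH_TARGET_REGISTER_CLASS \
    ia64_branch_target_register_class
#undef  TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED
#define TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED \
    ia64_branch_target_register_callee_saved

void self_check(void)
{
    assert(TARGET_BRANCH_TARGET_REGISTER_CLASS() == BR_REGS);
    assert(TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED(false));
}
```

As noted above, these definitions alone did not change the generated code.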



* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
  2007-02-03 11:23 ` [Bug rtl-optimization/30688] " ak at muc dot de
@ 2009-02-06 21:21 ` steven at gcc dot gnu dot org
  2009-02-24  0:01 ` sje at cup dot hp dot com
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: steven at gcc dot gnu dot org @ 2009-02-06 21:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from steven at gcc dot gnu dot org  2009-02-06 21:21 -------
GCC has the bt-load optimization for this, but that code is not enabled for
ia64.  It could be as simple as just setting flag_branch_target_optimize{,2} to
true in the ia64 backend, but maybe more work is needed (I have never looked at
bt-load.c).


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2009-02-06 21:21:19
               date|                            |



* [Bug rtl-optimization/30688] Branch registers loaded too late on ia64
  2007-02-03 11:17 [Bug rtl-optimization/30688] New: " ak at muc dot de
@ 2007-02-03 11:23 ` ak at muc dot de
  2009-02-06 21:21 ` steven at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: ak at muc dot de @ 2007-02-03 11:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from ak at muc dot de  2007-02-03 11:22 -------
Here's a simple test case:

void f(int k, int (*fptr)(int i))
{ 
        int i;

        /* Do something useless */
        for (i = 0; i < 5; i++) 
                k *= 10;

        fptr(k); 
} 


Compiled with GCC 4.3.0 20070203, this gives:


...
        ;;
        .mmi
        nop 0
        mov r1 = r36
        mov b0 = r34
        .mib
        nop 0
        mov ar.pfs = r35
        br.ret.sptk.many b0


Note that b0 is loaded only directly before the branch, even though it could
have been loaded much earlier, before the loop.


