public inbox for gcc-patches@gcc.gnu.org
* ia64 indirect call optimization
       [not found]   ` <15667.30671.558469.613310@napali.hpl.hp.com>
@ 2002-07-16 12:47     ` Richard Henderson
  0 siblings, 0 replies; only message in thread
From: Richard Henderson @ 2002-07-16 12:47 UTC
  To: davidm; +Cc: gcc-patches

[ For gcc-patches, the topic is getting ia64 branch registers loaded
  early enough for branch prediction hardware to work.  In particular,

	extern void nop (void);
	void foo (long num_iterations)
	{
	  void (*func1) (void) = nop;
	  while (num_iterations-- > 0)
	    (*func1) ();
	}

  should use one of the 5 call-saved branch registers to hold the
  function address across the loop.  ]
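For readers unfamiliar with descriptors, here is a rough portable model
of the transformation being asked for.  It is only a sketch: struct
fdesc, current_gp, nop_calls, foo_reload and foo_hoisted are made-up
names for illustration, and current_gp merely stands in for the gp
register.  Real ia64 code would keep the entry address in a branch
register rather than in a C variable.

```c
/* Hypothetical model of an ia64 function descriptor (illustration only). */
struct fdesc
{
  void (*entry) (void);   /* code address; on ia64 this feeds a branch register */
  unsigned long gp;       /* global pointer value for the callee */
};

static unsigned long current_gp;   /* stand-in for the gp register */
static long nop_calls;             /* counts calls so the effect is observable */

static void nop (void) { nop_calls++; }

/* Naive form: both descriptor words are reloaded on every iteration.  */
long foo_reload (const struct fdesc *d, long num_iterations)
{
  nop_calls = 0;
  while (num_iterations-- > 0)
    {
      current_gp = d->gp;
      d->entry ();
    }
  return nop_calls;
}

/* Hoisted form: the descriptor is read once before the loop, which is
   only valid if nothing can modify it while the loop runs.  */
long foo_hoisted (const struct fdesc *d, long num_iterations)
{
  void (*entry) (void) = d->entry;
  unsigned long gp = d->gp;

  nop_calls = 0;
  while (num_iterations-- > 0)
    {
      current_gp = gp;
      entry ();
    }
  return nop_calls;
}
```

Both functions make the same number of calls; the second just does the
two loads once, which is what gives the branch register time to be set
up ahead of the call.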

I can't think of a way to distinguish between the ptr-from-symbol
and ptr-from-elsewhere cases.  Indeed, with a good optimizer we ought
never to have the ptr-from-symbol case, since that should be constant
propagated to call-direct instead of call-indirect.

Which leaves us in a kind of pickle when it comes to emitting rtl
that definitely will not alias with a descriptor that a user just
constructed by hand, e.g. on the stack.  I'm inclined to suggest that
this is rare enough that those few places that do this sort of thing
can be bothered to use extra markup to tell the compiler that
something odd is going on.
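As a sketch of what that extra markup might look like, assuming the
empty asm with a "memory" clobber really is enough (I have not verified
that against this patch): the code below fills in a descriptor by hand
and places a compiler barrier between the stores and the first use.
The names hand_fdesc, target, call_through and call_hand_built are
hypothetical, and the struct is a portable stand-in.  Real ia64 code
would cast the descriptor's address to a function pointer instead of
calling through a helper.

```c
/* Hypothetical stand-in for a hand-built ia64 function descriptor.  */
struct hand_fdesc
{
  void (*entry) (void);   /* code address */
  unsigned long gp;       /* global pointer for the callee */
};

static long target_calls;

static void target (void) { target_calls++; }

/* Models the indirect call: reads the descriptor, then calls.  */
static void call_through (const struct hand_fdesc *d)
{
  d->entry ();
}

long call_hand_built (long n)
{
  struct hand_fdesc d;

  /* Build the descriptor by hand on the stack.  */
  d.entry = target;
  d.gp = 0;   /* placeholder; real code would copy the callee's gp */

  /* Compiler barrier (GCC extension): tells the compiler that memory
     may have changed, so the stores above are emitted before any
     descriptor reads that follow.  */
  __asm__ __volatile__ ("" : : : "memory");

  target_calls = 0;
  while (n-- > 0)
    call_through (&d);
  return target_calls;
}
```

The barrier emits no instructions; it only constrains how the compiler
may reorder memory accesses around it.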

Anyway, give this a shot and let us know what kind of performance
impact it has on both Merced and McKinley.  If it doesn't help a lot,
maybe we just forget the whole thing.


r~


	* ia64.c (ia64_expand_call): Mark descriptor loads unchanging.

Index: config/ia64/ia64.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/ia64/ia64.c,v
retrieving revision 1.180
diff -c -p -d -u -r1.180 ia64.c
--- config/ia64/ia64.c	16 Jul 2002 16:07:13 -0000	1.180
+++ config/ia64/ia64.c	16 Jul 2002 19:08:05 -0000
@@ -1452,9 +1452,20 @@ ia64_expand_call (retval, addr, nextarg,
   /* If this is an indirect call, then we have the address of a descriptor.  */
   if (indirect_p)
     {
-      dest = force_reg (DImode, gen_rtx_MEM (DImode, addr));
-      emit_move_insn (pic_offset_table_rtx,
-		      gen_rtx_MEM (DImode, plus_constant (addr, 8)));
+      rtx mem;
+
+      /* ??? Unconditionally marking these unchanging makes it less
+	 trivial to write C code that builds descriptors.  Should be
+	 possible to use asm("" : : : "memory") to force the stores
+	 before the reads.  */
+
+      mem = gen_rtx_MEM (DImode, addr);
+      RTX_UNCHANGING_P (mem) = 1;
+      dest = force_reg (DImode, mem);
+
+      mem = gen_rtx_MEM (DImode, plus_constant (addr, 8));
+      RTX_UNCHANGING_P (mem) = 1;
+      emit_move_insn (pic_offset_table_rtx, mem);
     }
   else
     dest = addr;
