public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: egcs: A new compiler project to merge the existing GCC forks (fwd)
@ 1997-08-19  7:36 Robert Wilhelm
  1997-08-19  8:08 ` Reload patch to improve 386 code Bernd Schmidt
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Robert Wilhelm @ 1997-08-19  7:36 UTC (permalink / raw)
  To: egcs

> 
> Can someone point me the location of mdbench?
>

http://www.sissa.it/furio/Mdbnch/info.html

Robert

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Reload patch to improve 386 code
  1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
  1997-08-19  8:08 ` Reload patch to improve 386 code Bernd Schmidt
@ 1997-08-19  8:08 ` Bernd Schmidt
  1997-08-19  8:08 ` Testsuite stuff Jeffrey A Law
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-19  8:08 UTC (permalink / raw)
  To: egcs

> My comment on reload is meant for more
> reload as we currently know it.  I think it is an horrible kludge that reload
> is run as a pass after global-alloc, and that it forces reload registers not to
> be used for any other purpose (which is murder on the x86 with each register
> being special purpose in some way).

That problem can be solved; actually it is mostly solved in my patch.

> I think it should be integrated into global-alloc, taking lifetimes, ranges,
> etc. into consideration, possibly leaving the fossil reload for when you are
> not optimizing.  Given that reload has pretty much been the place where we all
> fear to tread, at least since RMS handed over the gcc2 reigns to Kenner (and
> even in the 1.3x time frame, I got the sense that RMS no longer had a good
> handle on reload anymore), I think it is time for a rethought.  Now, given
> it is a rather critical piece of the compiler, it may take months if not
> years to get it better than it currently is.

I have some ideas for changing reload that could go on top of what I have. I'll
describe what I'm planning to do, if you have any additional ideas I'd be happy
to hear about them.
First, I'd like to eliminate the code that counts the needs for registers
globally. The patch I sent is a first step in that direction; it could probably
easily be extended to spill registers locally for every instruction. This would
require a slightly different way of calculating the possible damage from
spilling a register, but I think it would not be hard to do.
It would be nice to have code that detects that an insn needs a spill reg in a
one-register class (like ecx for variable shifts on the 386), but a different
register is free. In that case, reload should move ecx to the free register
before the instruction, and back afterwards, if this is profitable. I'm not
sure how hard that would be, and I haven't experimented with that kind of
thing yet.
Then, there are some simplifications that could be done. I don't like the
inheritance code, find_equiv_reg and all that. IMHO reload shouldn't try to be
very clever about this sort of thing - the reload_cse_regs pass can be made
more clever. I've already submitted a patch to Kenner that enables
reload_cse_regs to generate optional reloads. If we could add some more
cleverness (e.g. deleting redundant stores into spill slots or eliminating
register-register copies), quite a bit of code in reload could be deleted.
I've already made some experiments in this direction which indicate that this
approach may be feasible.
Another way of simplifying reload would be to try to generate auto-inc
addressing after reload has run, not before. That would eliminate some more
special-case code (and it might make other parts of the compiler simpler as
well).

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Testsuite stuff
  1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
  1997-08-19  8:08 ` Reload patch to improve 386 code Bernd Schmidt
  1997-08-19  8:08 ` Bernd Schmidt
@ 1997-08-19  8:08 ` Jeffrey A Law
  1997-08-19  8:08 ` Reload patch to improve 386 code Jeffrey A Law
  1997-08-19  9:34 ` Some Haifa scheduler bugs Bernd Schmidt
  4 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-19  8:08 UTC (permalink / raw)
  To: egcs

I'm hoping to include the basic testsuite for c & c++ in the
next snapshot.

The testsuites use the dejagnu testing framework; therefore,
you'll need a copy of dejagnu installed to be able to run the
testsuites.

So, I've put a link to a recent dejagnu snapshot in
ftp.cygnus.com:/pub/egcs/infrastructure/dejagnu-970707.tar.gz

[ Note, old releases of dejagnu will not work -- a great deal of
  dejagnu and the testsuite harnesses changed recently to improve
  testing of cross compilers, dos hosted toolchains, etc etc. ]


Running the testsuite is easy.  After you've built the compiler
in theory all you have to do is say "make check" and wait.


This is one small step in the rather large job of exporting
Cygnus's internal testing infrastructure to the egcs community.


Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Reload patch to improve 386 code
  1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
                   ` (2 preceding siblings ...)
  1997-08-19  8:08 ` Testsuite stuff Jeffrey A Law
@ 1997-08-19  8:08 ` Jeffrey A Law
  1997-08-19  9:34 ` Some Haifa scheduler bugs Bernd Schmidt
  4 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-19  8:08 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970819094031.291D-100000@starsky.informatik.rwth-aa
chen.de>you write:
  > The idea of running sched before reload seems to be to improve code like
  > this:
  > move mem1 => pseudo1
  > move pseudo1 => mem2
  > move mem3 => pseudo2
  > move pseudo2 => mem4
  > move mem5 => pseudo3
  > move pseudo3 => mem6
Or, more generally:

  * The scheduling pass before reload has fewer data dependencies (because
  it works with pseudos), and thus can more instructions more freely to
  produce better schedules.  However, the downside is potentially increased
  register pressure.

  * The scheduling pass after reload primarily helps with scheduling of
  spill code and prologue/epilogue insns.  With pseudos mapped to hard
  registers, there are far fewer opportunities for the scheduler to
  move instructions.

  > If this is left as it stands, register allocation will most likely allocate
  > the pseudo to the same hard register. This means the post-reload sched pass
  > can't do anything with it, and the CPU can't either because there is no
  > parallelism in the code (well, at least the Pentium can't).
Yup.

  > you suddenly have two blocks of three independent instructions which could
  > run in parallel. However, this will lose badly once you don't have three
  > instructions of that kind, but a hundred (since your average CPU doesn't
  > have a hundred hard registers).
Yup.  Hence my message about throttling the scheduler once the ratio of
pseudos to hard registers hits some magic number.


  > Another approach I've been thinking about is to add code that analyzes code
  > like this after reload
  > 
  > move mem1 => hardreg1
  > move hardreg1 => mem2
  > move mem3 => hardreg1
  > move hardreg1 => mem4
  > move mem5 => hardreg1
  > move hardreg1 => mem6
  > 
  > and tries to make it use as many independent hard registers as possible.
  > That would make the scheduling opportunities available without the risk
  > of over-scheduling before reload. I don't know how feasible this is.
Yes, I've considered doing similar things myself, but I didn't think it
was really worth the effort..

Jeff
Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Reload patch to improve 386 code
  1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
@ 1997-08-19  8:08 ` Bernd Schmidt
  1997-08-19  8:08 ` Bernd Schmidt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-19  8:08 UTC (permalink / raw)
  To: egcs

> Before this leaves my head, I wanted to point something out which
> you've reminded me of.  When the scheduler (this applies to both the
> original and Haifa versions equally) becomes aggressive, it produces a
> large number of reloads in certain situations.  

The idea of running sched before reload seems to be to improve code like this:
move mem1 => pseudo1
move pseudo1 => mem2
move mem3 => pseudo2
move pseudo2 => mem4
move mem5 => pseudo3
move pseudo3 => mem6

If this is left as it stands, register allocation will most likely allocate
the pseudo to the same hard register. This means the post-reload sched pass
can't do anything with it, and the CPU can't either because there is no
parallelism in the code (well, at least the Pentium can't).

If sched modifies the above to look like this

move mem1 => pseudo1
move mem3 => pseudo2
move mem5 => pseudo3
move pseudo1 => mem2
move pseudo2 => mem4
move pseudo3 => mem6

you suddenly have two blocks of three independent instructions which could
run in parallel. However, this will lose badly once you don't have three
instructions of that kind, but a hundred (since your average CPU doesn't
have a hundred hard registers).

Another approach I've been thinking about is to add code that analyzes code
like this after reload

move mem1 => hardreg1
move hardreg1 => mem2
move mem3 => hardreg1
move hardreg1 => mem4
move mem5 => hardreg1
move hardreg1 => mem6

and tries to make it use as many independent hard registers as possible.
That would make the scheduling opportunities available without the risk
of over-scheduling before reload. I don't know how feasible this is.

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Some Haifa scheduler bugs
  1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
                   ` (3 preceding siblings ...)
  1997-08-19  8:08 ` Reload patch to improve 386 code Jeffrey A Law
@ 1997-08-19  9:34 ` Bernd Schmidt
  4 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-19  9:34 UTC (permalink / raw)
  To: egcs

I've run c-torture on the egcs snapshot, using the Haifa scheduler with
all flags turned on. My system is an i586-linux one. Here's a patch to fix
some of the failures.

There seems to be at least one additional problem, which I haven't really
investigated yet. It appears as if the scheduler extends the lifetime of
hard registers, which is deadly on SMALL_REGISTER_CLASSES machines. More
specifically, the broken code looks like this:

	call somefunction
	do a multiplication; clobbers eax
	movl eax,ebx

while the correct code moves eax to ebx directly after the call. I have not
found any code in either haifa-sched.c or sched.c to prevent this problem,
so it's quite possible that the normal scheduler may also suffer from this.

	* haifa-sched.c (target_bb, bbset_size, dom, prob, rgn_nr_edges,
	rgn_edges, edgeset_size, edge_to_bit, pot_split, ancestor_edges):
	Make static.
	(move_insn): Call reemit_notes for every insn in a SCHED_GROUP.
	(schedule_block): When checking whether an insn is the 
	basic_block_head of a different block, use the first insn in the
	same SCHED_GROUP instead of the insn itself.
	(debug_dependencies): GET_RTX_NAME takes an rtx code, not an rtx.

*** haifa-sched.c.orig-1	Mon Aug 18 13:03:47 1997
--- haifa-sched.c	Mon Aug 18 21:00:48 1997
*************** int *bblst_table, bblst_size, bblst_last
*** 626,632 ****
  #define SRC_PROB(src) ( candidate_table[src].src_prob )
  
  /* The bb being currently scheduled.  */
! int target_bb;
  
  /* List of edges.  */
  typedef bitlst edgelst;
--- 626,632 ----
  #define SRC_PROB(src) ( candidate_table[src].src_prob )
  
  /* The bb being currently scheduled.  */
! static int target_bb;
  
  /* List of edges.  */
  typedef bitlst edgelst;
*************** void debug_candidates PROTO ((int));
*** 642,652 ****
  typedef bitset bbset;
  
  /* Number of words of the bbset.  */
! int bbset_size;
  
  /* Dominators array: dom[i] contains the bbset of dominators of
     bb i in the region.  */
! bbset *dom;
  
  /* bb 0 is the only region entry */
  #define IS_RGN_ENTRY(bb) (!bb)
--- 642,652 ----
  typedef bitset bbset;
  
  /* Number of words of the bbset.  */
! static int bbset_size;
  
  /* Dominators array: dom[i] contains the bbset of dominators of
     bb i in the region.  */
! static bbset *dom;
  
  /* bb 0 is the only region entry */
  #define IS_RGN_ENTRY(bb) (!bb)
*************** bbset *dom;
*** 657,663 ****
  
  /* Probability: Prob[i] is a float in [0, 1] which is the probability
     of bb i relative to the region entry.  */
! float *prob;
  
  /*  The probability of bb_src, relative to bb_trg.  Note, that while the
     'prob[bb]' is a float in [0, 1], this macro returns an integer
--- 657,663 ----
  
  /* Probability: Prob[i] is a float in [0, 1] which is the probability
     of bb i relative to the region entry.  */
! static float *prob;
  
  /*  The probability of bb_src, relative to bb_trg.  Note, that while the
     'prob[bb]' is a float in [0, 1], this macro returns an integer
*************** float *prob;
*** 669,684 ****
  typedef bitset edgeset;
  
  /* Number of edges in the region.  */
! int rgn_nr_edges;
  
  /* Array of size rgn_nr_edges.    */
! int *rgn_edges;
  
  /* Number of words in an edgeset.    */
! int edgeset_size;
  
  /* Mapping from each edge in the graph to its number in the rgn.  */
! int *edge_to_bit;
  #define EDGE_TO_BIT(edge) (edge_to_bit[edge])
  
  /* The split edges of a source bb is different for each target
--- 669,684 ----
  typedef bitset edgeset;
  
  /* Number of edges in the region.  */
! static int rgn_nr_edges;
  
  /* Array of size rgn_nr_edges.    */
! static int *rgn_edges;
  
  /* Number of words in an edgeset.    */
! static int edgeset_size;
  
  /* Mapping from each edge in the graph to its number in the rgn.  */
! static int *edge_to_bit;
  #define EDGE_TO_BIT(edge) (edge_to_bit[edge])
  
  /* The split edges of a source bb is different for each target
*************** int *edge_to_bit;
*** 687,696 ****
     the split edges of each bb relative to the region entry.
  
     pot_split[bb] is the set of potential split edges of bb.  */
! edgeset *pot_split;
  
  /* For every bb, a set of its ancestor edges.  */
! edgeset *ancestor_edges;
  
  static void compute_dom_prob_ps PROTO ((int));
  
--- 687,696 ----
     the split edges of each bb relative to the region entry.
  
     pot_split[bb] is the set of potential split edges of bb.  */
! static edgeset *pot_split;
  
  /* For every bb, a set of its ancestor edges.  */
! static edgeset *ancestor_edges;
  
  static void compute_dom_prob_ps PROTO ((int));
  
*************** build_control_flow ()
*** 1277,1284 ****
  /* construct edges in the control flow graph, from 'source' block, to
     blocks refered to by 'pattern'.  */
  
! static
! void 
  build_jmp_edges (pattern, source)
       rtx pattern;
       int source;
--- 1277,1283 ----
  /* construct edges in the control flow graph, from 'source' block, to
     blocks refered to by 'pattern'.  */
  
! static void
  build_jmp_edges (pattern, source)
       rtx pattern;
       int source;
*************** move_insn (insn, last)
*** 6512,6520 ****
        move_insn1 (insn, last);
        insn = prev;
      }
- 
    move_insn1 (insn, last);
!   return reemit_notes (new_last, new_last);
  }
  
  /* Return an insn which represents a SCHED_GROUP, which is
--- 6511,6524 ----
        move_insn1 (insn, last);
        insn = prev;
      }
    move_insn1 (insn, last);
!   while (insn != new_last)
!     {
!       rtx next = NEXT_INSN (insn);
!       reemit_notes (insn, insn);
!       insn = next;
!     }
!   return reemit_notes (insn, insn);
  }
  
  /* Return an insn which represents a SCHED_GROUP, which is
*************** schedule_block (bb, rgn, rgn_n_insns)
*** 6840,6848 ****
  	      /* an interblock motion? */
  	      if (INSN_BB (insn) != target_bb)
  		{
  		  if (IS_SPECULATIVE_INSN (insn))
  		    {
- 
  		      if (!check_live (insn, INSN_BB (insn), target_bb))
  			{
  			  /* speculative motion, live check failed, remove
--- 6844,6853 ----
  	      /* an interblock motion? */
  	      if (INSN_BB (insn) != target_bb)
  		{
+ 		  rtx tmp;
+ 
  		  if (IS_SPECULATIVE_INSN (insn))
  		    {
  		      if (!check_live (insn, INSN_BB (insn), target_bb))
  			{
  			  /* speculative motion, live check failed, remove
*************** schedule_block (bb, rgn, rgn_n_insns)
*** 6861,6878 ****
  		  nr_inter++;
  
  		  /* update source block boundaries */
  		  b1 = INSN_BLOCK (insn);
! 		  if (insn == basic_block_head[b1]
  		      && insn == basic_block_end[b1])
  		    {
! 		      emit_note_after (NOTE_INSN_DELETED, basic_block_head[b1]);
  		      basic_block_end[b1] = basic_block_head[b1] = NEXT_INSN (insn);
  		    }
  		  else if (insn == basic_block_end[b1])
  		    {
! 		      basic_block_end[b1] = PREV_INSN (insn);
  		    }
! 		  else if (insn == basic_block_head[b1])
  		    {
  		      basic_block_head[b1] = NEXT_INSN (insn);
  		    }
--- 6866,6886 ----
  		  nr_inter++;
  
  		  /* update source block boundaries */
+ 		  tmp = insn;
+ 		  while (SCHED_GROUP_P (tmp))
+ 		    tmp = PREV_INSN (tmp);
  		  b1 = INSN_BLOCK (insn);
! 		  if (tmp == basic_block_head[b1]
  		      && insn == basic_block_end[b1])
  		    {
! 		      emit_note_after (NOTE_INSN_DELETED, insn);
  		      basic_block_end[b1] = basic_block_head[b1] = NEXT_INSN (insn);
  		    }
  		  else if (insn == basic_block_end[b1])
  		    {
! 		      basic_block_end[b1] = PREV_INSN (tmp);
  		    }
! 		  else if (tmp == basic_block_head[b1])
  		    {
  		      basic_block_head[b1] = NEXT_INSN (insn);
  		    }
*************** debug_dependencies ()
*** 7383,7389 ****
  				 NOTE_SOURCE_FILE (insn));
  		    }
  		  else
! 		    fprintf (dump, " {%s}\n", GET_RTX_NAME (insn));
  		  continue;
  		}
  
--- 7391,7397 ----
  				 NOTE_SOURCE_FILE (insn));
  		    }
  		  else
! 		    fprintf (dump, " {%s}\n", GET_RTX_NAME (GET_CODE (insn)));
  		  continue;
  		}
  

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-05  7:15           ` Bernd Schmidt
  1997-09-05  7:55             ` Jeffrey A Law
@ 1997-09-05  8:33             ` Ian Lance Taylor
  1 sibling, 0 replies; 33+ messages in thread
From: Ian Lance Taylor @ 1997-09-05  8:33 UTC (permalink / raw)
  To: crux; +Cc: law, egcs

   Date: Fri, 5 Sep 1997 13:44:34 +0200 (MET DST)
   From: Bernd Schmidt <crux@Pool.Informatik.RWTH-Aachen.DE>

   >   In message < Pine.SOL.3.90.970903184242.6496A-100000@bond.informatik.rwth-aachen.de >you write:
   >   > How can reload generate incorrect code in this case except if the lifetime
   >   > of a hard register gets extended by the scheduler or some other optimizer?
   > When loading of an argument register requires a reload, which may force
   > reload to clobber a previously loaded argument register.

   Now I'm confused. Isn't the reload_address_base_reg_class stuff supposed to
   handle that case?

Yes, it is.  I wrote it to handle the MIPS16, which has only 8 general
purpose registers but, for reasons of compatibility with standard
MIPS, uses 4 of them for register passing.  (The MIPS16 support will
get into the general release one of these days--it's all sitting on a
badly hacked up branch right now.)

However, people have encountered cases which fail when using the more
baroque MIPS EABI calling convention, which uses 8 registers for
register passing (but not the same 8 that the MIPS16 uses--it's too
complicated to get into now).

My personal attitude is that people should try using the register
parameters option on the ix86, but should not depend upon it to work,
and should be prepared to fix the problems themselves rather than
merely reporting them.

Note that the ix86 is more likely to work correctly than the MIPS16,
because the ix86 can load a value directly from memory, whereas the
MIPS16 must first construct the address in a register (i.e., the ix86
has a 32 bit direct addressing mode, but the MIPS16 does not).

Ian

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-05  7:15           ` Bernd Schmidt
@ 1997-09-05  7:55             ` Jeffrey A Law
  1997-09-05  8:33             ` Ian Lance Taylor
  1 sibling, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-05  7:55 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

  In message < Pine.SOL.3.90.970905134354.4356D-100000@tracy.informatik.rwth-aachen.de >you write:
  > Now I'm confused. Isn't the reload_address_base_reg_class stuff supposed to
  > handle that case?
I haven't read all the new code, but based on experiences of someone in
Cygnus porting to a register poor machine that passes arguments in registers
my impression was that the reload_address_base_reg_class stuff wasn't
complete or sufficient to prevent incorrect code from being generated.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03 10:07         ` Jeffrey A Law
@ 1997-09-05  7:15           ` Bernd Schmidt
  1997-09-05  7:55             ` Jeffrey A Law
  1997-09-05  8:33             ` Ian Lance Taylor
  0 siblings, 2 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-09-05  7:15 UTC (permalink / raw)
  To: law; +Cc: egcs

>   In message < Pine.SOL.3.90.970903184242.6496A-100000@bond.informatik.rwth-aachen.de >you write:
>   > How can reload generate incorrect code in this case except if the lifetime
>   > of a hard register gets extended by the scheduler or some other optimizer?
> When loading of an argument register requires a reload, which may force
> reload to clobber a previously loaded argument register.

Now I'm confused. Isn't the reload_address_base_reg_class stuff supposed to
handle that case?

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-04 11:33             ` David S. Miller
@ 1997-09-04 14:38               ` David Edelsohn
  0 siblings, 0 replies; 33+ messages in thread
From: David Edelsohn @ 1997-09-04 14:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: law, burley, egcs

>>>>> "David S Miller" writes:

Jeff> I'm less familiar with how this works under aix or with
Jeff> ultrasparcs/solaris, but I assume they both do something relatively
Jeff> reasonable.

David> Naturally, this depends upon OS support, and I think Solaris does map
David> page 0 in this special way, or they check the load instruction which
David> was executed by the user once the fault handling call chain determines
David> that the access was in outer space.

	The IBM XL compilers have a switch to disable the speculative load
optimization, but AIX does not have any way to make loads from page 0
generate a fault -- it is a general feature provided by the OS without
user control.  Neither the POWER/PowerPC architecture nor the executable
header have any selectable switch to enable or disable that general
functionality.

David

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-04 11:12           ` Jeffrey A Law
@ 1997-09-04 11:33             ` David S. Miller
  1997-09-04 14:38               ` David Edelsohn
  0 siblings, 1 reply; 33+ messages in thread
From: David S. Miller @ 1997-09-04 11:33 UTC (permalink / raw)
  To: law; +Cc: burley, egcs

   Date: Thu, 04 Sep 1997 12:12:15 -0600
   From: Jeffrey A Law <law@hurl.cygnus.com>

   I'm less familiar with how this works under aix or with
   ultrasparcs/solaris, but I assume they both do something relatively
   reasonable.

On the UltraSparc normal load/store instructions still trap on bogus
pointers like they always have.

The compiler must emit a special load instruction (with an address
space specifier for the "no-fault" address space) when it wants a
speculative load.  The chip will not fault on a NULL pointer if page 0
of the process has the "allow non-faulting load" attribute set, when
you use these special load instructions.  Non-faulting stores do not
exist.

Naturally, this depends upon OS support, and I think Solaris does map
page 0 in this special way, or they check the load instruction which
was executed by the user once the fault handling call chain determines
that the access was in outer space.

I lack the support currently on sparc64-linux, but it will be added
should a native compiler out there begin to emit the instructions ;-)

Later,
David "Sparc" Miller
davem@caip.rutgers.edu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
@ 1997-09-04 11:27 meissner
  0 siblings, 0 replies; 33+ messages in thread
From: meissner @ 1997-09-04 11:27 UTC (permalink / raw)
  To: burley, law; +Cc: egcs

Note if people are going to use -fsched-spec-load-dangerous, I think there are
bugs in the current Haifa code.  I originally enabled it in the AIX port when
I was first testing haifa years ago, and eventually removed setting it, since
it was assuming all loads could be done speculatively, when it should have
only optimized things like:

	p != NULL && *p

(either that or somebody changed the default until AIX so that page 0 was
no longer mapped by default -- it was quite some time ago).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-04  9:33         ` Craig Burley
@ 1997-09-04 11:12           ` Jeffrey A Law
  1997-09-04 11:33             ` David S. Miller
  0 siblings, 1 reply; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-04 11:12 UTC (permalink / raw)
  To: Craig Burley; +Cc: egcs

  In message < 199709041633.MAA20790@melange.gnu.ai.mit.edu >you write:
  > Good summary.
It's not 100% accurate, but it's close enough for this discussion :-)

  > The way I look at it is, all things being equal, I'd prefer an OS
  > default to faulting on dereferecing NULL when running generic C code
  > simply because C code shouldn't be dereferencing NULL.
Agreed.

HPUX does it backwards -- it defaults to allowing the dereference.
However the GCC port passes the magic flags to the linker to turn on
faulting by default :-)

It's a choice that can be made on a file by file basis -- and with
a little work one can even change the magic bit in the executable
header to change the behavior of a particular file.

I'm less familiar with how this works under aix or with ultrasparcs/solaris,
but I assume they both do something relatively reasonable.

  > not scheduling NULL dereferences.  Generally it is better to
  > default to not doing potentially hurtful optimizations, since users
  > who know they want optimizations know enough to specify them;
That's my general opinion too.

  > Note that the documentation should be *very* clear that, even if
  > this option is enabled, it does *not* guarantee that C (or Fortran
  > etc.) code can then be written that explicitly dereferences NULL.
Absolutely.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03 13:19       ` Jeffrey A Law
@ 1997-09-04  9:33         ` Craig Burley
  1997-09-04 11:12           ` Jeffrey A Law
  0 siblings, 1 reply; 33+ messages in thread
From: Craig Burley @ 1997-09-04  9:33 UTC (permalink / raw)
  To: egcs

>Actually, we've come full circle over the years on this issue.
>
>First it was good, because "all the (unix) world was a pdp/vax" and you were
>allowed to do this on those systems.
>
>Then later as people decided depending on such behavior was a bad programming
>practice OS folks started forcing null pointer dereferences to cause faults
>and people fixed their code to avoid dereferencing null pointers.
>
>Now we have code which explicitly avoids dereferencing null pointers, but
>some OSs (hpux, aix, solaris?) have options to allow null pointer dereferences
>to allow the scheduler to perform more speculative code motions.

Good summary.

The way I look at it is, all things being equal, I'd prefer an OS
default to faulting on dereferecing NULL when running generic C code
simply because C code shouldn't be dereferencing NULL.  If it is,
maybe it'll be "lucky" and get the right values (behave the way it
happened to on the original developer's system) if the system allowed
it, or maybe it'll be "lucky" and never care, but it's too risky to
*assume* it'll be correct.  I'd rather have it crash and not be correct,
and make a specific decision to risk running it with such faulting
disabled, personally.

However, that view is basically orthagonal to whether an OS should
default to faulting for *generated* code that could run faster if
such faulting was disabled.  In this case, of course, good engineering
would mandate that the executable be marked with a flag saying "this
executable contains code that assumes dereferencing, but never using,
NULL is fast"; which in turn would suggest that objects be marked
that way and the linker cooperate accordingly; which the compiler
would in turn set when the appropriate optimizations were enabled or,
better yet, actually used; but, failing that, the usual black-box
approach we take to such thing should suffice.

Essentially this seems to me a lot like the issue of NaNs in code
using floating-point arithmetic on IEEE 754 machines.  Some people
want any generation of a NaN to immediately cause a fault, because
they want the potentially buggy code they're running to cause an
exception rather than silently produce wrong results.  OTOH, others
want NaN generation to silently and speedily continue, because
that's what makes sense for lots of code written (or compiled) to
the IEEE 754 model.

Unfortunately, we do seem to collectively refuse to provide, or use,
any kinds of clear markers in our code and utilities that explicitly
*mean* "This code should never generate NaNs" or "This code certainly
can generate NaNs and still be working well", so picking the right
defaults becomes a religious issue.

Someday I hope the free-software community will offer solutions to
this overall problem.  (Other things like tracking the licensing
requirements of compiled code through objects to executables and
libraries are part of it.)

In the meantime, I'd suggest using the approach of not defaulting
to *either* behavior, that is, leaving the default to be whatever
the target configuration wants, and maybe *that* could default to
not scheduling NULL dereferences.  Generally it is better to
default to not doing potentially hurtful optimizations, since users
who know they want optimizations know enough to specify them; but
it seems reasonable for a given target to specify that this kind
of optimization is actually not hurtful.

However, this might violate some other worthwhile principles in
general, or for egcs/gcc in particular.  For example, the documentation
would have to specify that the default depends on the target
system's configuration.

Note that the documentation should be *very* clear that, even if
this option is enabled, it does *not* guarantee that C (or Fortran
etc.) code can then be written that explicitly dereferences NULL.

That is, the issue of whether NULL-dereferencing is safe for
code *generation* should be kept very distinct from whether
NULL-dereferencing is needed by the code being compiled.  If
the latter seems like a useful thing to provide, that should
be a separate option (at the very least; in fact, it should be
a source-level language construct, per my comments above, but
that's probably beyond the scope of egcs at this point).

(Right now it might seem nonsensical for the user to specify
one option, or getting it as a default for a target, saying
"don't schedule NULL dereferences" and another saying "this
code assumes it can safely dereference NULL".  But these are
two different concepts, and, in the future, combining them might
actually produce a useful result, e.g. the compiler could
generate code to check all uncertain addresses before
dereferencing them.  In the meantime, a warning could be issued
if the combination isn't supported, something like what we do
for `-Wuninitialized -O0'.)

        tq vm, (burley)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03 11:39     ` Paul Koning
  1997-09-03 11:50       ` David Edelsohn
  1997-09-03 13:13       ` Richard Henderson
@ 1997-09-03 13:19       ` Jeffrey A Law
  1997-09-04  9:33         ` Craig Burley
  2 siblings, 1 reply; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-03 13:19 UTC (permalink / raw)
  To: Paul Koning; +Cc: crux, egcs

  In message < 9709031838.AA27038@kona. >you write:
  > If that switch indeed depends on not faulting references to zero, it
  > probably isn't worth having.
Maybe, maybe not.  That's one of the things I think needs to be evaluated
for haifa.

  > Or if it's kept it needs a warning that in many OS's it will fail.
Yes.  Or more along the lines of "this option should only be used on systems
which allow a dereference of a null pointer" -- explain when it will fail
instead of saying "it'll fail on many OS's".

  > The reason is that faulting on references
  > to zero is universally considered a Good Thing and many OS's go out of
  > their way to provide that feature.
Actually, we've come full circle over the years on this issue.

First it was good, because "all the (unix) world was a pdp/vax" and you were
allowed to do this on those systems.

Then later as people decided depending on such behavior was a bad programming
practice OS folks started forcing null pointer dereferences to cause faults
and people fixed their code to avoid dereferencing null pointers.

Now we have code which explicitly avoids dereferencing null pointers, but
some OSs (hpux, aix, solaris?) have options to allow null pointer dereferences
to allow the scheduler to perform more speculative code motions.

I'm undecided either way; if there's a significant win on these systems, then
I'm tempted to leave the option.  If there's no significant benefit, then I'm
tempted to remove the option (both to simplify the compiler/scheduler and to
avoid people unknowingly using the option, then sending bugs 'cuz their
system didn't support null pointer dereferences).

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03 11:39     ` Paul Koning
  1997-09-03 11:50       ` David Edelsohn
@ 1997-09-03 13:13       ` Richard Henderson
  1997-09-03 13:19       ` Jeffrey A Law
  2 siblings, 0 replies; 33+ messages in thread
From: Richard Henderson @ 1997-09-03 13:13 UTC (permalink / raw)
  To: Paul Koning; +Cc: law, crux, egcs

> If that switch indeed depends on not faulting references to zero, it
> probably isn't worth having.

There exist CPUs, e.g. UltraSparc, that have special non-faulting loads
that are explicitly architected to support this kind of optimization.


r~

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03  9:37     ` Jeffrey A Law
  1997-09-03  9:44       ` Bernd Schmidt
@ 1997-09-03 11:58       ` Jim Wilson
  1 sibling, 0 replies; 33+ messages in thread
From: Jim Wilson @ 1997-09-03 11:58 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

	Basically any machine that defines SMALL_REGISTER_CLASSES and passes args in
	registers is playing with fire -- reload can and will silently generate incorrect
	code for such targets.  For that reason I _highly_ recommend against using
	registers for parameter passing on the x86 targets.

I'll second it.  The x86 regparm attribute is unsafe, and should not be used.
It might be possible to fix the problem by hacking reload and the x86 port,
but it doesn't seem very worthwhile.  The problem is hard to fix, and exposes
people to unnecessary risks of compiler bugs.  There are probably easier
ways to get faster x86 code.

Jeff didn't mention it, but it is also unwise to schedule before
register allocation if SMALL_REGISTER_CLASSES is defined, even if you
are not using register parameters.  The goal of the scheduler is to reduce
pipeline stalls at the expense of increasing register pressure.  This is
exactly the wrong thing to do on a SMALL_REGISTER_CLASSES machine such as
the x86, because you will get so many extra spills that the code may run
slower.  A scheduler that has different goals might be able to do better,
but given the way the existing scheduler works, I would recommend avoiding
it before register allocation on the x86.

Jim

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03 11:39     ` Paul Koning
@ 1997-09-03 11:50       ` David Edelsohn
  1997-09-03 13:13       ` Richard Henderson
  1997-09-03 13:19       ` Jeffrey A Law
  2 siblings, 0 replies; 33+ messages in thread
From: David Edelsohn @ 1997-09-03 11:50 UTC (permalink / raw)
  To: Paul Koning; +Cc: law, crux, egcs

>>>>> Paul Koning writes:

Paul> "law@cygnus.com" wrote:
>> However, for execution tests that fail, I wonder how many (if any) are 
>> caused by -fsched-spec-load-dangerous.  If I remember right that
>> relies on the OS to support non-faulting loads from zero.

Paul> If that switch indeed depends on not faulting references to zero, it
Paul> probably isn't worth having.  Or if it's kept it needs a warning that
Paul> in many OS's it will fail.  The reason is that faulting on references
Paul> to zero is universally considered a Good Thing and many OS's go out of
Paul> their way to provide that feature.

	I disagree with your justification that faulting on dereferncing
address zero is considered a Good Thing.  This depends on your priorities
and the correctness of your program.  I do not want to get into a general
discussion about programming styles and kernel design, but speculatively
dereferencing zero is very important for good performance on some systems;
some operating systems go out of their way specifically to make page zero
valid, as opposed to making it a VM error.  This needs to be set in the
configuation file on a system-by-system basis and not universally enabled
or disabled.

David


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03  9:30   ` Bernd Schmidt
  1997-09-03  9:37     ` Jeffrey A Law
@ 1997-09-03 11:39     ` Paul Koning
  1997-09-03 11:50       ` David Edelsohn
                         ` (2 more replies)
  1 sibling, 3 replies; 33+ messages in thread
From: Paul Koning @ 1997-09-03 11:39 UTC (permalink / raw)
  To: law; +Cc: crux, egcs

"law@cygnus.com" wrote:
> However, for execution tests that fail, I wonder how many (if any) are 
> caused by -fsched-spec-load-dangerous.  If I remember right that
> relies on the OS to support non-faulting loads from zero.

If that switch indeed depends on not faulting references to zero, it
probably isn't worth having.  Or if it's kept it needs a warning that
in many OS's it will fail.  The reason is that faulting on references
to zero is universally considered a Good Thing and many OS's go out of
their way to provide that feature.

	paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03  9:44       ` Bernd Schmidt
@ 1997-09-03 10:07         ` Jeffrey A Law
  1997-09-05  7:15           ` Bernd Schmidt
  0 siblings, 1 reply; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-03 10:07 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

  In message < Pine.SOL.3.90.970903184242.6496A-100000@bond.informatik.rwth-aachen.de >you write:
  > How can reload generate incorrect code in this case except if the lifetime
  > of a hard register gets extended by the scheduler or some other optimizer?
When loading of an argument register requires a reload, which may force
reload to clobber a previously loaded argument register.


jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03  9:37     ` Jeffrey A Law
@ 1997-09-03  9:44       ` Bernd Schmidt
  1997-09-03 10:07         ` Jeffrey A Law
  1997-09-03 11:58       ` Jim Wilson
  1 sibling, 1 reply; 33+ messages in thread
From: Bernd Schmidt @ 1997-09-03  9:44 UTC (permalink / raw)
  To: law; +Cc: egcs

> In general, that's not going to work.
> 
> Basically any machine that defines SMALL_REGISTER_CLASSES and passes args in
> registers is playing with fire -- reload can and will silently generate incorrect
> code for such targets.  For that reason I _highly_ recommend against using
> registers for parameter passing on the x86 targets.

How can reload generate incorrect code in this case except if the lifetime
of a hard register gets extended by the scheduler or some other optimizer?

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-03  9:30   ` Bernd Schmidt
@ 1997-09-03  9:37     ` Jeffrey A Law
  1997-09-03  9:44       ` Bernd Schmidt
  1997-09-03 11:58       ` Jim Wilson
  1997-09-03 11:39     ` Paul Koning
  1 sibling, 2 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-03  9:37 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

  In message < Pine.SOL.3.90.970903182436.5858D-100000@bond.informatik.rwth-aachen.de >you write:
  > Ok, I'll retry the tests without that option. But this is not the only
  > case that Haifa can create incorrect code; there is an additional problem
  > that can show up on any SMALL_REGISTER_CLASSES machine. Haifa can increase
  > the lifetime of pseudo registers, which can cause incorrect code to be
  > generated. These examples were made on an i586-linux configuration with
  > egcs-970828:
[ ... ]
In general, that's not going to work.

Basically any machine that defines SMALL_REGISTER_CLASSES and passes args in
registers is playing with fire -- reload can and will silently generate incorrect
code for such targets.  For that reason I _highly_ recommend against using
registers for parameter passing on the x86 targets.

Turning on the instruction scheduler in that case is just going to make a
bad problem worse -- but the problem isn't the instruction scheduler itself.

  > It seems that at least these changes are necessary:
  > 1. Function arguments must be moved into pseudos immediately at the top
  > of a function.
Nontrivial to do in haifa.

jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-09-01 20:33 ` Jeffrey A Law
@ 1997-09-03  9:30   ` Bernd Schmidt
  1997-09-03  9:37     ` Jeffrey A Law
  1997-09-03 11:39     ` Paul Koning
  0 siblings, 2 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-09-03  9:30 UTC (permalink / raw)
  To: law; +Cc: egcs

> 
> I suspect all the cases where the compiler itself is core dumping
> is the SCHED_GROUP_P problem in move_insn.

Likely.
 
> However, for execution tests that fail, I wonder how many (if any) are 
> caused by -fsched-spec-load-dangerous.  If I remember right that
> relies on the OS to support non-faulting loads from zero.

Ok, I'll retry the tests without that option. But this is not the only
case that Haifa can create incorrect code; there is an additional problem
that can show up on any SMALL_REGISTER_CLASSES machine. Haifa can increase
the lifetime of pseudo registers, which can cause incorrect code to be
generated. These examples were made on an i586-linux configuration with
egcs-970828:

darkstar:/usr/src # cat tsth1.c
int d,e;

int foo(int, int, int) __attribute__ ((regparm (3)));

int foo(int a, int b, int c)
{
  d /= e;
  return a + b + c;
}

int main()
{
  int a;
  d = 400; e = 4;
  a = foo(3,2,1);
  if (a != 6)
    abort ();
  return 0;
}
darkstar:/usr/src # gcc -B/usr/src/egcs-970825/gcc/stage2/ -fschedule-insns -fsched-reverse-S -O tsth1.c
darkstar:/usr/src # a.out
Aborted
darkstar:/usr/src # cat tsth2.c
volatile int d,e;

int foo(int, int, int) __attribute__ ((regparm (3)));

int foo(int a, int b, int c)
{
  return a + b + c;
}

int main()
{
  int a;
  d = 400; e = 4;
  a = foo(3,2,1);
  d /= e;
  if (a != 6)
    abort ();
  return 0;
}
darkstar:/usr/src # gcc -B/usr/src/egcs-970825/gcc/stage2/ -fschedule-insns -fsched-reverse-S -O tsth2.c
darkstar:/usr/src # a.out
Aborted

It seems that at least these changes are necessary:
1. Function arguments must be moved into pseudos immediately at the top
of a function.
2. For function calls, all insns from the first one setting an argument
register to the one copying the return value out of a hard reg must be
one big SCHED_GROUP.
3. Insns setting the return value reg must not be moved.
I don't know whether it's possible for a hard reg to be explicitly mentioned
in any other way; I think it is not possible.

> Also note that the -fsched-reverse-{R,S} are for testing purposes
> only, they actually pessimize code in an attempt to test the
> scheduler more fully.  Long term I expect to rip them out of the
> scheduler.

Why, I think it's useful to have this kind of self-check code - it helps
to check for bugs in the scheduler.

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
       [not found] <Pine.SOL.3.90.970826141030.2901K-100000@maigret.informatik.rwth-aachen.de>
@ 1997-09-01 20:42 ` Jeffrey A Law
  0 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-01 20:42 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

  > I think you won't get the right return value. You want to return the
  > last insn in a SCHED_GROUP or a note that was placed after it.
  > I had my reasons to use two loops ;)
You're right.  Seems to me that the return value for move_insn should
always be the return value from the first call to reemit_notes.  Correct?

In which case it's pretty simple to record the return value from
the first call to reemit_notes.

And I don't doubt you had good reasons to use two loops; I just want
to make absolutely sure I understand the problem.  Handling of notes
and SCHED_GROUP_P insns have both been particularly difficult areas
of the scheduler to implement correctly.

  > The whole affair with SCHED_GROUP_P is extremely messy.
Agreed.

  > How about using extra data structures that describe which insns
  > to schedule together, which notes to put before them, and the
  > dependencies between them.
I'm all for it.

  > scheduled together with an insn might make the code clearer. I think it
  > would fit in better with the fact that Haifa schedules forward. I didn't
  > try it, though; maybe there are problems with that approach, too.
Well, haifa does both forward and backward analysis -- so it might
just move the problematical code elsewhere in the scheduler.

jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
       [not found] <Pine.SOL.3.90.970826140826.2901J-100000@maigret.informatik.rwth-aachen.de>
@ 1997-09-01 20:33 ` Jeffrey A Law
  1997-09-03  9:30   ` Bernd Schmidt
  0 siblings, 1 reply; 33+ messages in thread
From: Jeffrey A Law @ 1997-09-01 20:33 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: egcs

  In message <Pine.SOL.3.90.970826140826.2901J-100000@maigret.informatik.rwth-aa
chen.de>you write:
  > > 
  > >   > I'll re-run c-torture with the unmodified scheduler and
  > >   > report all failures.
  > > Please do.  It'll make tracking these things down easier.
  > 
  > Sorry about the delay, our mailserver didn't quite work yesterday.
  > Here are the failures I get without the reemit_notes fix on an
  > i586-linux configuration:
No problem.

I suspect all the cases where the compiler itself is core dumping
is the SCHED_GROUP_P problem in move_insn.
 
However, for execution tests that fail, I wonder how many (if any) are 
caused by -fsched-spec-load-dangerous.  If I remember right that
relies on the OS to support non-faulting loads from zero.

if (!p)
  blah = *p

The load of *p might happen before the code to check that p is
nonzero with -fsched-spec-load-dangerous.

Also note that the -fsched-reverse-{R,S} are for testing purposes
only, they actually pessimize code in an attempt to test the
scheduler more fully.  Long term I expect to rip them out of the
scheduler.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
@ 1997-08-26 14:34 Bernd Schmidt
  0 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-26 14:34 UTC (permalink / raw)
  To: egcs

> 
> So, how about this:
> 
> static rtx
> move_insn (insn, last)
>      rtx insn, last;
> {
>   while (SCHED_GROUP_P (insn))
>     {
>       rtx prev = PREV_INSN (insn);
>       move_insn1 (insn, last);
>       reemit_notes (insn, insn);
>       insn = prev;
>     }
> 
>   move_insn1 (insn, last);
>   return reemit_notes (insn, insn);
> }

I think you won't get the right return value. You want to return the
last insn in a SCHED_GROUP or a note that was placed after it.
I had my reasons to use two loops ;)

The whole affair with SCHED_GROUP_P is extremely messy. How about using
extra data structures that describe which insns to schedule together,
which notes to put before them, and the dependencies between them,
instead of trying to encode it all in the RTL with the confusion that
results?
If you don't want that, changing the meaning of SCHED_GROUP_P for the
Haifa scheduler so that it is set if the _following_ insn must be
scheduled together with an insn might make the code clearer. I think it
would fit in better with the fact that Haifa schedules forward. I didn't
try it, though; maybe there are problems with that approach, too.

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-08-22 13:53 Building of generated parser files Andreas Schwab
  1997-08-22 15:02 ` Some Haifa scheduler bugs Jeffrey A Law
@ 1997-08-22 15:24 ` Jeffrey A Law
  1 sibling, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-22 15:24 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970821173718.733B-100000@maigret.informatik.rwth-aa
chen.de>you write:
  > I've thought about something like this, but this code won't work, I think.
  > It will call reemit_notes twice for the last insn in the SCHED_GROUP and
  > not at all for the first one.
OK.  Let's use the same example again:

Consider if we have Z <-A <-B <-C, all with SCHED_GROUP_P set on A B & C,
but not Z.



  insn = C
  We start the while loop (SCHED_GROUP_P (C) == 1)
    prev = B
    move_insn1 (C, last)
    insn = B
   2nd iteration (SCHED_GROUP_P (B) == 1)
    prev = A
    move_insn1 (B, last)
    insn = A
   3rd iteration (SCHED_GROUP_P (A) == 1)
    prev = Z
    move_insn_1 (A, last)
    insn = Z
   loop terminates (SCHED_GROUP_P (Z) == 0)

  move_insn1 (Z, last)
  reemit_notes (Z, Z)


So the problem here is we don't call reemit_notes for A B & C.


So, how about this:

static rtx
move_insn (insn, last)
     rtx insn, last;
{
  while (SCHED_GROUP_P (insn))
    {
      rtx prev = PREV_INSN (insn);
      move_insn1 (insn, last);
      reemit_notes (insn, insn);
      insn = prev;
    }

  move_insn1 (insn, last);
  return reemit_notes (insn, insn);
}



  insn = C
  We start the while loop (SCHED_GROUP_P (C) == 1)
    prev = B
    move_insn1 (C, last)
    reemit_notes (C, C)
    insn = B
   2nd iteration (SCHED_GROUP_P (B) == 1)
    prev = A
    move_insn1 (B, last)
    reemit_notes (B, B)
    insn = A
   3rd iteration (SCHED_GROUP_P (A) == 1)
    prev = Z
    move_insn1 (A, last)
    reemit_notes (A, A)
    insn = Z
   loop terminates (SCHED_GROUP_P (Z) == 0)


  move_insn1 (Z, last)
  reemit_notes (Z, Z)


Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-08-22 13:53 Building of generated parser files Andreas Schwab
@ 1997-08-22 15:02 ` Jeffrey A Law
  1997-08-22 15:24 ` Jeffrey A Law
  1 sibling, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-22 15:02 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970822131829.1049C-100000@kojak.informatik.rwth-aac
hen.de>you write:
  > > So we end up moving one insn more than we should right?
  > 
  > No.
  > `SCHED_GROUP_P (INSN)'
  >      During instruction scheduling, in an insn, indicates that the
  >      previous insn must be scheduled together with this insn.  This is
  >      used to ensure that certain groups of instructions will not be
  >      split up by the instruction scheduling pass, for example, `use'
  >      insns before a `call_insn' may not be separated from the
  >      `call_insn'.  Stored in the `in_struct' field and printed as `/s'.
  > 
  > This means that the first insn in a SCHED_GROUP doesn't have the
  > SCHED_GROUP_P bit set, so Z _is_ supposed to be scheduled. The group
  > consists of all four insns.
You are absolutely correct.  My mistake.

  > I didn't make any notes, unfortunately.  I can certainly try it, but
  > I doubt it will work.
Don't bother to try it.  It's obviously wrong.

  > I'll re-run c-torture with the unmodified scheduler and
  > report all failures.
Please do.  It'll make tracking these things down easier.

Thanks,
jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-08-22 10:48 Building of generated parser files Niklas Hallqvist
@ 1997-08-22 13:28 ` Bernd Schmidt
  0 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-22 13:28 UTC (permalink / raw)
  To: egcs

> Good point.  However, the more I look at that SCHED_GROUP_P loop,
> the more I think it must be broken.
> 
> Consider if we have Z <-A <-B <-C, all with SCHED_GROUP_P set on A B & C,
> but not Z.
> 
>   insn = C
>   We start the while loop (SCHED_GROUP_P (A) == 1)
>     prev = B
>     move_insn1 (C, last)
>     insn = B
>    2nd iteration (SCHED_GROUP_P (B) == 1)
>     prev = A
>     move_insn1 (B, last)
>     insn = A
>    3rd iteration (SCHED_GROUP_P (A) == 1)
>     prev = Z
>     move_insn_1 (A, last)
>     insn = Z
>    loop terminates (SCHED_GROUP_P (Z) == 0)
> 
>   move_insn1 (Z, last)
>   
>   
> So we end up moving one insn more than we should right?

No.
`SCHED_GROUP_P (INSN)'
     During instruction scheduling, in an insn, indicates that the
     previous insn must be scheduled together with this insn.  This is
     used to ensure that certain groups of instructions will not be
     split up by the instruction scheduling pass, for example, `use'
     insns before a `call_insn' may not be separated from the
     `call_insn'.  Stored in the `in_struct' field and printed as `/s'.

This means that the first insn in a SCHED_GROUP doesn't have the
SCHED_GROUP_P bit set, so Z _is_ supposed to be scheduled. The group
consists of all four insns.

> static rtx
> move_insn (insn, last)
>      rtx insn, last;
> {
> 
>   while (SCHED_GROUP_P (PREV_INSN (insn)))
>     {
>       rtx prev = PREV_INSN (insn);
>       move_insn1 (insn, last);
>       reemit_notes (insn, insn);
>       insn = prev;
>     }
> 
>   move_insn1 (insn, last);
>   return reemit_notes (insn, insn);
> }

Apart from what I said above, that might move insns that belong to a
SCHED_GROUP that starts before insn.

> Can you try it?  Or give me the testcase that's currently failing
> for you so that I can try it myself?

I didn't make any notes, unfortunately.  I can certainly try it, but
I doubt it will work.  I'll re-run c-torture with the unmodified
scheduler and report all failures.

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-08-21 16:51 Problems on PowerPC David Edelsohn
@ 1997-08-21 17:43 ` Jeffrey A Law
  0 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-21 17:43 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970821173718.733B-100000@maigret.informatik.rwth-aa
chen.de>you write:
  > I've thought about something like this, but this code won't work, I think.
  > It will call reemit_notes twice for the last insn in the SCHED_GROUP and
  > not at all for the first one. I've tried thinking of ways to use just one
  > loop, but they all had problems with putting all notes in the right places.
Good point.  However, the more I look at that SCHED_GROUP_P loop,
the more I think it must be broken.

Consider if we have Z <-A <-B <-C, all with SCHED_GROUP_P set on A B & C,
but not Z.



  insn = C
  We start the while loop (SCHED_GROUP_P (A) == 1)
    prev = B
    move_insn1 (C, last)
    insn = B
   2nd iteration (SCHED_GROUP_P (B) == 1)
    prev = A
    move_insn1 (B, last)
    insn = A
   3rd iteration (SCHED_GROUP_P (A) == 1)
    prev = Z
    move_insn_1 (A, last)
    insn = Z
   loop terminates (SCHED_GROUP_P (Z) == 0)

  move_insn1 (Z, last)
  
  
So we end up moving one insn more than we should right?

Seems to me move_insn should look like this:

static rtx
move_insn (insn, last)
     rtx insn, last;
{

  while (SCHED_GROUP_P (PREV_INSN (insn)))
    {
      rtx prev = PREV_INSN (insn);
      move_insn1 (insn, last);
      reemit_notes (insn, insn);
      insn = prev;
    }

  move_insn1 (insn, last);
  return reemit_notes (insn, insn);
}


Can you try it?  Or give me the testcase that's currently failing
for you so that I can try it myself?

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
  1997-08-21 15:20 egcs repository Joel Sherrill
@ 1997-08-21 15:47 ` Bernd Schmidt
  0 siblings, 0 replies; 33+ messages in thread
From: Bernd Schmidt @ 1997-08-21 15:47 UTC (permalink / raw)
  To: egcs

> Specific questions/comments:
> 
> In move_insn, is there some reason why you can't call reemit_notes
> during the loop on SCHED_GROUP_P insns?
> 
> ie, does this work instead?  Seems cleaner than using another loop
> if it works.
> 
> {
>   rtx new_last = insn;
> 
>   while (SCHED_GROUP_P (insn))
>     {
>       rtx prev = PREV_INSN (insn);
>       move_insn1 (insn, last);
>       reemit_notes (insn, insn);
>       insn = prev;
>     }
> 
>   move_insn1 (insn, last);
>   return reemit_notes (new_last, new_last);
> }

I've thought about something like this, but this code won't work, I think.
It will call reemit_notes twice for the last insn in the SCHED_GROUP and
not at all for the first one. I've tried thinking of ways to use just one
loop, but they all had problems with putting all notes in the right places.

Bernd

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
@ 1997-08-19 19:00 Jeffrey A Law
  0 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-19 19:00 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970819092828.291B-100000@starsky.informatik.rwth-aa
chen.de>you write:
  > 	* haifa-sched.c (target_bb, bbset_size, dom, prob, rgn_nr_edges,
  > 	rgn_edges, edgeset_size, edge_to_bit, pot_split, ancestor_edges):
  > 	Make static.
I've already got these installed into egcs, plus some others that
you didn't make static.

  > 	(move_insn): Call reemit_notes for every insn in a SCHED_GROUP.
See my last message -- that's the code I'm running right now.  If it
doesn't work, let me know and we can iterate to a solution.

  > 	(schedule_block): When checking whether an insn is the 
  > 	basic_block_head of a different block, use the first insn in the
  > 	same SCHED_GROUP instead of the insn itself.
Done.  Slightly differently, but should be functionally equivalent
and a little clearer to someone who might read the code later.


  > 	(debug_dependencies): GET_RTX_NAME takes an rtx code, not an rtx.
Done.

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Some Haifa scheduler bugs
@ 1997-08-19 17:54 Jeffrey A Law
  0 siblings, 0 replies; 33+ messages in thread
From: Jeffrey A Law @ 1997-08-19 17:54 UTC (permalink / raw)
  To: egcs

  In message <Pine.SOL.3.90.970819092828.291B-100000@starsky.informatik.rwth-aa
chen.de>you write:
  > I've run c-torture on the egcs snapshot, using the Haifa scheduler with
  > all flags turned on. My system is an i586-linux one. Here's a patch to fix
  > some of the failures.
Thanks.

Just some comments:

  * Get a copyright assignment + disclaimer signed and sent to
  the FSF as soon as possible.  Until that time we can't take any
  of your patches and include them without rewriting them first.

  * When submitting patches, send separate patches for bugs from
  random cleanups.  The cleanups are greatly appreciated, especially
  for haifa.

  * When submitting bugfix patches, please submit a testcase, or
  refer us to a c-torture testcase and the options needed to expose
  the bug.


Specific questions/comments:

In move_insn, is there some reason why you can't call reemit_notes
during the loop on SCHED_GROUP_P insns?

ie, does this work instead?  Seems cleaner than using another loop
if it works.

{
  rtx new_last = insn;

  while (SCHED_GROUP_P (insn))
    {
      rtx prev = PREV_INSN (insn);
      move_insn1 (insn, last);
      reemit_notes (insn, insn);
      insn = prev;
    }

  move_insn1 (insn, last);
  return reemit_notes (new_last, new_last);
}

[ Of course if you had referred us to a testcase, we could check
  this ourselves.... ]

Jeff

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~1997-09-05  8:33 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1997-08-19  7:36 egcs: A new compiler project to merge the existing GCC forks (fwd) Robert Wilhelm
1997-08-19  8:08 ` Reload patch to improve 386 code Bernd Schmidt
1997-08-19  8:08 ` Bernd Schmidt
1997-08-19  8:08 ` Testsuite stuff Jeffrey A Law
1997-08-19  8:08 ` Reload patch to improve 386 code Jeffrey A Law
1997-08-19  9:34 ` Some Haifa scheduler bugs Bernd Schmidt
1997-08-19 17:54 Jeffrey A Law
1997-08-19 19:00 Jeffrey A Law
1997-08-21 15:20 egcs repository Joel Sherrill
1997-08-21 15:47 ` Some Haifa scheduler bugs Bernd Schmidt
1997-08-21 16:51 Problems on PowerPC David Edelsohn
1997-08-21 17:43 ` Some Haifa scheduler bugs Jeffrey A Law
1997-08-22 10:48 Building of generated parser files Niklas Hallqvist
1997-08-22 13:28 ` Some Haifa scheduler bugs Bernd Schmidt
1997-08-22 13:53 Building of generated parser files Andreas Schwab
1997-08-22 15:02 ` Some Haifa scheduler bugs Jeffrey A Law
1997-08-22 15:24 ` Jeffrey A Law
1997-08-26 14:34 Bernd Schmidt
     [not found] <Pine.SOL.3.90.970826140826.2901J-100000@maigret.informatik.rwth-aachen.de>
1997-09-01 20:33 ` Jeffrey A Law
1997-09-03  9:30   ` Bernd Schmidt
1997-09-03  9:37     ` Jeffrey A Law
1997-09-03  9:44       ` Bernd Schmidt
1997-09-03 10:07         ` Jeffrey A Law
1997-09-05  7:15           ` Bernd Schmidt
1997-09-05  7:55             ` Jeffrey A Law
1997-09-05  8:33             ` Ian Lance Taylor
1997-09-03 11:58       ` Jim Wilson
1997-09-03 11:39     ` Paul Koning
1997-09-03 11:50       ` David Edelsohn
1997-09-03 13:13       ` Richard Henderson
1997-09-03 13:19       ` Jeffrey A Law
1997-09-04  9:33         ` Craig Burley
1997-09-04 11:12           ` Jeffrey A Law
1997-09-04 11:33             ` David S. Miller
1997-09-04 14:38               ` David Edelsohn
     [not found] <Pine.SOL.3.90.970826141030.2901K-100000@maigret.informatik.rwth-aachen.de>
1997-09-01 20:42 ` Jeffrey A Law
1997-09-04 11:27 meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).