Fortran: The Road Ahead.

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Fortran: The Road Ahead.
@ 1997-12-07  6:24 Toon Moene
  1997-12-07  8:18 ` Joel Sherrill
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Toon Moene @ 1997-12-07  6:24 UTC (permalink / raw)
  To: egcs

Several weeks ago I promised to write up the optimisation  
opportunities I see as being applicable to g77 generated RTL as soon  
as the first release of egcs hit the archives.

That write-up is now in front of you.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

  I.  Introduction.

    Why is it necessary to think about optimisation opportunities  
from a Fortran point of view ?  Although superficially all  
programming languages are able to express the same kind of  
computations, Fortran has traditionally imposed the least  
restrictions on compilers as to _how_ computations are to be  
performed.  Basically, any method that (mathematically) yields the  
same result as the specified computation is allowed, disregarding  
the limitations of floating point arithmetic.  An extreme example of  
this is that Cray's compilers insert calls to optimised run-time  
library routines if they detect (via elaborate pattern matching)  
that a certain computation that the user specified in Fortran can be  
done faster using that library routine, e.g. matrix multiplication.

    Although in practice such a substitution won't be useful in a  
portable compiler (because it is very hard to ensure that optimised  
run-time libraries are available for all supported architectures),  
the challenge lying ahead is to provide optimisations in egcs that  
are well-known to the Fortran community and available in $300 *)  
compilers, without breaking things for other front-ends or making  
optimised compilation unnecessarily slow.

*) The Portland Group, Inc.'s compiler suite, full page ad in Linux  
Journal #42 (October), page 35.

 II.  Improving and Enhancing Current Algorithms.

    All of the following pertains to code in the files loop.c or  
unroll.c.

    1. Combining related General Induction Variables.

      Consider the following Fortran code:

      subroutine average(x, y, n)
      integer i, n
      real x(n), y(n)
      y(1) = x(1)
      y(n) = x(n)
      do i = 2, n-1
         y(i) = ( x(i-1)+x(i)+x(i+1) ) / 3.0
      enddo
      end

      Currently, the backend uses three registers for the addresses  
of x(i-1), x(i), x(i+1), incrementing them in lock-step through the  
loop iterations.  An obvious optimisation would be to use one  
register and to address with -4(r), (r), and 4(r).  Code exists in  
loop.c to perform this optimisation (function combine_givs), but it  
almost never triggers, even not when completely disabling the  
(wrong-headed) cost/benefit calculation.

     2. Stores to invariant addresses are not moved out of loops.

       This is the relevant comment in loop.c (just before function  
scan_loop):

/* ??? Could also move memory writes out of loops if the  
destination address
   is invariant, the source is invariant, the memory write is not  
volatile,
   and if we can prove that no read inside the loop can read this address
   before the write occurs.  If there is a read of this address after the
   write, then we can also mark the memory read as invariant.  */

       Consider the following Fortran subroutine:

       subroutine sdot(a, x, n)
       integer i, n
       real a, x(n)
       a = 0.0
       do i = 1, n
          a = a + x(i)
       enddo
       end

       The store into a could be moved out of the loop, because,  
due to the Fortran rules of aliasing, the store into a cannot change  
any element of x.

     3. Loop unrolling is maximised to 4 times.

       When doing "naive" loop unrolling, i.e. when the number of  
loop iterations is an invariant, but not a compile time constant,  
the number of unrolls is maximally 4.  Hilariously, this is  
determined _after_ unroll prints the message "Loop unrolled <number>  
times", causing no end of confusion.  The code that performs this  
restriction looks like this:

          /* Limit loop unrolling to 4, since this will make 7 copies of
             the loop body.  */
          if (unroll_number > 4)
            unroll_number = 4;

       I do not completely understand the comment - probably it  
means to say that unrolling 8 times will lead to 7 extra copies of  
the loop body (because mod(number-of-iterations, number-of-unrolls)  
is maximally 7).  It is not clear to me why this isn't implemented  
as:

       do i = 1, mod(number-of-iterations, number-of-unrolls)
          <body>
       enddo
       do i = mod(number-of-iterations, number-of-unrolls) + 1,
    x         number-of-iterations, number-of-unrolls
          <unrolled-body>
       enddo

especially since, when unrolling naively, the number-of-unrolls is  
2, 4 or 8, so that mod(number-of-iterations, number-of-unrolls)  
effectively is iand(number-of-iterations, number-of-unrolls - 1),  
which is a very cheap computation compared to mod(...).

Another limitation that bothers me is the constant  
MAX_UNROLLED_INSNS, because its value probably should vary from  
architecture to architecture (e.g. based on the number of hard  
registers).

    4. After loop unrolling, allocate intermediate variables to
       different registers.

      Consider this code:

      subroutine sum(a, b, c, n)
      integer i, n
      real a(n), b(n), c(n)
      do i = 1, n
         c(i) = a(i) + b(i)
      enddo
      end

      On your average RISC system the loop body is implemented as:

      temp1 = a(i)
      temp2 = b(i)
      temp3 = temp1 + temp2
      c(i) = temp3

      The current gcc code leaves the registers temp1, 2 and 3 the  
same after loop unrolling; unfortunately, that imposes an  
unnecessary serialisation on the unrolled loop body.  If temp1, 2  
and 3 were called temp4, 5 and 6 for the second copy of the loop  
body, etc., instruction scheduling would have more opportunities to  
move code around (Note that you need the Fortran alias assumptions  
on a, b and c for this to have any effect !).

III. New loop optimisations.

    1. Moving invariant conditionals out of loops.

      Transforms:

      do i = 1, n
         if (lcond) then
            <body1>
         else
            <body2>
         endif
      enddo

      into:

      if (lcond) then
         do i = 1, n
            <body1>
         enddo
      else
         do i = 1, n
            <body2>
         enddo
      endif

      Probably only interesting if lcond is expensive to (re-)calculate.

    2. Loop distribution.

      Transforms:

      do i = 1, n
         a(i) = <complicated-expr-1-not-containing-a-reference-to-b>
         b(i) = <complicated-expr-2-not-containing-a-reference-to-a>
      enddo

      into:

      do i = 1, n
         a(i) = <complicated-expr-1-not-containing-a-reference-to-b>
      enddo
      do i = 1, n
         b(i) = <complicated-expr-2-not-containing-a-reference-to-a>
      enddo

      to reduce register pressure.

    3. Loop fusion.

      Transforms:

      do i = 1, n
         a(i) = <some-expr-1-not-containig-a-reference-to-b>
      enddo
      do i = 1, n
         b(i) = <some-expr-2-not-containig-a-reference-to-a>
      enddo

      into:

      do i = 1, n
         a(i) = <some-expr-1-not-containig-a-reference-to-b>
         b(i) = <some-expr-2-not-containig-a-reference-to-a>
      enddo

      to enhance the work done in the loop (without resorting to  
loop unrolling, which also increases the amount of code).

      The challenge here is to also perform this optimisation if  
the bounds of the two loops are different, but the number of  
iterations is the same.

    4. Loop interchange.

      Transform:

      do i = 1, n
         do j = 1, m
            a(i, j) = b(i, j) + c(i, j)
         enddo
      enddo

      into:

      do j = 1, m
         do i = 1, n
            a(i, j) = b(i, j) + c(i, j)
         enddo
      enddo

      so as to access the array elements in adjacent-storage order  
(in Fortran, the first index should vary fastest !).  Although  
normally Fortran users won't write code like the first loop nest, it  
might result from other transformations.

    5. Loop blocking / tiling.

      In loop nests where the depth is greater than the rank of the  
arrays being worked on, it is beneficial to split up the work into  
"blocks" instead of sweeping across entire arrays, because in that  
way, the parts of the matrices being worked on can be contained in  
cache.  The obvious candidate is matrix multiplication (c assumed  
being zeroed beforehand):

      do i = 1, n
         do j = 1, n
            do k = 1, n
               c(i,j) = c(i,j) + a(i,k) * b(k,j)
            enddo
         enddo
      enddo

(Note that the first optimisation to be performed here is to move  
the i-loop inside the k-loop).

The resulting code is complicated and messy and hence not shown  
here - moreover, the optimal arrangement depends on cache size and  
replacement policy.

 IV. [Shared Memory] (Automatic) Parallellisation.

    Given the fact that (shared memory) multiprocessor machines are  
becoming the norm - at least for compute intensive applications -  
it makes sense to consider (automatic) parallellisation.

    Parallellisation in the Fortran world is obtained by either  
inserting directives in the code to guide the compiler in  
parallellising loops, or letting the compiler decide, based on its  
own analysis, which loops can be and are worthwhile to parallellise.

    Directive-based parallellisation recently got a boost because  
of the formulation of an "industry standard" - see  
http://www.openmp.org .

    Automatic parallellisation is much harder, because the compiler  
has to determine for itself whether the data dependencies in the  
loop(s) allow execution by independently scheduled threads.  This is  
more difficult than the current analysis for moving invariants and  
determining scheduling constraints, because parallellisation is best  
done on _outer_ loops of loop nests (to keep the maximum amount of  
work inside the thread to make thread synchronisation overhead  
minimal).

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Well, this has been long, and it only scratches the surface.  I  
hope to see some of the optimisations mentioned implemented in  
upcoming releases of egcs (especially those mentioned in Chapter  
II).

Regards,
Toon.

"Then I got out into the Real World.  My first task in the Real  
World was to read and understand a 200,000 line FORTRAN program,  
then speed it up by a factor of two.  Any Real Programmer will tell  
you that all the Structured Coding in the world won't help you solve  
a problem like that -- it takes actual talent."  Ed Post,  
Datamation, July '83.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Fortran: The Road Ahead.
  1997-12-07  6:24 Fortran: The Road Ahead Toon Moene
@ 1997-12-07  8:18 ` Joel Sherrill
  1997-12-07  8:57   ` Jeffrey A Law
  1997-12-08 11:06 ` Paul Koning
  1997-12-12  9:19 ` giv combination patch Richard Henderson
  2 siblings, 1 reply; 12+ messages in thread
From: Joel Sherrill @ 1997-12-07  8:18 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs

On Sun, 7 Dec 1997, Toon Moene wrote:

> III. New loop optimisations.
> 
>     1. Moving invariant conditionals out of loops.
> 
>       Transforms:
> 
>       do i = 1, n
>          if (lcond) then
>             <body1>
>          else
>             <body2>
>          endif
>       enddo
> 
>       into:
> 
>       if (lcond) then
>          do i = 1, n
>             <body1>
>          enddo
>       else
>          do i = 1, n
>             <body2>
>          enddo
>       endif
> 
>       Probably only interesting if lcond is expensive to (re-)calculate.

My gut feeling is that you may be underestimating the value of moving
the evaluation of lcond out of the loop.  lcond is going to evaluated n
times.  That makes the cost of leaving the evaluation in the loop quite a
bit higher.

Plus, if lcond if inexpensive then the values are in registers which
increases the pressure on registers. 

Overall, this seems to be more valuable an optimization than you are
giving it credit for.  

--joel
Joel Sherrill                    Director of Research & Development
joel@OARcorp.com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
   Support Available             (205) 722-9985




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Fortran: The Road Ahead.
  1997-12-07  8:18 ` Joel Sherrill
@ 1997-12-07  8:57   ` Jeffrey A Law
  1997-12-07 10:54     ` Toon Moene
  0 siblings, 1 reply; 12+ messages in thread
From: Jeffrey A Law @ 1997-12-07  8:57 UTC (permalink / raw)
  To: Joel Sherrill; +Cc: Toon Moene, egcs

  In message < Pine.BSF.3.96.971207101033.25106F-100000@vespucci.advicom.net >you
 write:
  > 
  > 
  > On Sun, 7 Dec 1997, Toon Moene wrote:
  > 
  > > III. New loop optimisations.
  > > 
  > >     1. Moving invariant conditionals out of loops.
Compiler folks often refer to this as unswitching.  Note that the
conditional must be at the top of the loop to perform this kind
of optimization.


jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Fortran: The Road Ahead.
  1997-12-07  8:57   ` Jeffrey A Law
@ 1997-12-07 10:54     ` Toon Moene
  1997-12-07 17:04       ` Orn E. Hansen
  0 siblings, 1 reply; 12+ messages in thread
From: Toon Moene @ 1997-12-07 10:54 UTC (permalink / raw)
  To: egcs

  > > III. New loop optimisations.
  > >
  > >     1. Moving invariant conditionals out of loops.

>  Compiler folks often refer to this as unswitching.  Note
>  that the conditional must be at the top of the loop to
>  perform this kind of optimization.

(Sigh).  OK, guys, I probably simplified too much.  The  
optimisation I really had in mind ran something like this:

   Transform:

      do i = 1, n
         <body1>
         if (lcond) then
            <body2>
         else
            <body3>
         endif
         <body4>
      enddo

   into (variant I):

      if (lcond) then
         do i = 1, n
            <body1>
            <body2>
            <body4>
         enddo
      else
         do i = 1, n
            <body1>
            <body3>
            <body4>
         enddo
      endif

    or (variant II):

      do i = 1, n
         <body1>
      enddo
      if (lcond) then
         do i = 1, n
            <body2>
         enddo
      else
         do i = 1, n
            <body3>
         enddo
      endif
      do i = 1, n
         <body4>
      enddo

It would be especially interesting to come up with a heuristic to  
choose between variant I and variant II, *given the other  
transformations I mentioned* :-)

Cheers,
Toon.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Fortran: The Road Ahead.
  1997-12-07 10:54     ` Toon Moene
@ 1997-12-07 17:04       ` Orn E. Hansen
  0 siblings, 0 replies; 12+ messages in thread
From: Orn E. Hansen @ 1997-12-07 17:04 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs

>       do i = 1, n
>          <body1>
>          if (lcond) then
>             <body2>
>          else
>             <body3>
>          endif
>          <body4>
>       enddo

  You could select variant I, if lcond will always yield the same
condition for any i.  And II if <body1> and <body4> are independant
and not products of <body2> or <body4>.  The save goes from O(4n)
downto O(3n + 1), which can be great for a large n (or expensive
lcond).  But most times, <body4> will use a product of in-loop
condition of <body2> or <body4> (hence it's position), making a
transformation impossible.

----------------------------------------------------------------------------
Orn Einar Hansen                         oe.hansen@oehansen.pp.se
                                          oehansen@daimi.aau.dk
                                        voice+fax; +46 035 217194

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Fortran: The Road Ahead.
  1997-12-07  6:24 Fortran: The Road Ahead Toon Moene
  1997-12-07  8:18 ` Joel Sherrill
@ 1997-12-08 11:06 ` Paul Koning
  1997-12-12  9:19 ` giv combination patch Richard Henderson
  2 siblings, 0 replies; 12+ messages in thread
From: Paul Koning @ 1997-12-08 11:06 UTC (permalink / raw)
  To: toon; +Cc: egcs

>>>>> "Toon" == Toon Moene <toon@moene.indiv.nluug.nl> writes:

 Toon> Several weeks ago I promised to write up the optimisation
 Toon> opportunities I see as being applicable to g77 generated RTL as
 Toon> soon as the first release of egcs hit the archives.

 Toon> That write-up is now in front of you.

Nice writeup.  I'd like to add to it that some of this is not just
valuable to Fortran code.  Recently I tried out some simple DSP code
on egcs to see what it would do with it.  That exposed a number of the
same limitations Toon mentioned.  In C it was fairly easy to work
around some of these by fiddling with pointers instead of using
indexing, but for clarity of code it certainly would have helped to
have some of the things Toon mentioned.

By the way, perhaps unrelated but observed at the same time: the
compiler seemed to produce stack references for locals even when it
didn't need to, i.e., the register allocation was missing
opportunities to use registers instead of stack memory temporaries.
And, even more curiously, at some point I managed to eliminate all
stack references in the body of my function but even so ended up with
a stack locals frame and a store into a stack temporary that was never
referenced again.

	paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* giv combination patch
  1997-12-07  6:24 Fortran: The Road Ahead Toon Moene
  1997-12-07  8:18 ` Joel Sherrill
  1997-12-08 11:06 ` Paul Koning
@ 1997-12-12  9:19 ` Richard Henderson
  1997-12-13 10:54   ` Toon Moene
  2 siblings, 1 reply; 12+ messages in thread
From: Richard Henderson @ 1997-12-12  9:19 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs

>     1. Combining related General Induction Variables.

The following code is a first step along this path.  It works correctly
for givs related by a constant as below, or a simple constant as with
higher order ranks of staticly dimensioned arrays.

It fails for givs with higher order relations, as with second-order
ranks of dynamicly dimensioned arrays.  This can be fixed in two ways.

First, I'd canonized the giv expressions to make comparing them easier
and faster.  But this yields expressions that are not necessarily 
canonized for base+index matching on machines that support that.  We
either need some heuristic for reconstructing valid index patterns or
drop the canonization and enhance the search.  

Second, we add a second-order combine pass that combines givs based
on other givs, not just on bivs as we do now.  

Finally, I should warn that -frerun-loop-opt now actually seems to
hurt optimization slightly.  In the cases that I've tried, the first
pass gets more or less optimal recognition (within the constraints
listed above) and the second pass degrades that.  I've not looked
into the cause of this.

r~

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: giv combination patch
  1997-12-12  9:19 ` giv combination patch Richard Henderson
@ 1997-12-13 10:54   ` Toon Moene
  1997-12-13 21:14     ` Richard Henderson
  0 siblings, 1 reply; 12+ messages in thread
From: Toon Moene @ 1997-12-13 10:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: egcs

I wrote:

> >     1. Combining related General Induction Variables.

RTH:

>  The following code is a first step along this path.  It
>  works correctly for givs related by a constant as below,
>  or a simple constant as with higher order ranks of
>  staticly dimensioned arrays.

Unfortunately, trying this patch against egcs-971207, it doesn't  
work, over here:

stage1/xgcc -Bstage1/  -DIN_GCC    -O2 -g  -DHAVE_CONFIG_H   -o  
genrecog \
 genrecog.o rtl.o ` case "obstack.o" in ?*) echo obstack.o ;; esac  
` ` case "" in ?*) echo  ;; esac ` ` case "" in ?*) echo  ;; esac `
./genrecog ./config/m68k/m68k.md > tmp-recog.c
/bin/sh: 25584 Memory fault
make[1]: *** [stamp-recog] Error 139
make[1]: Leaving directory `/Users/toon/Unix/compilers/egcs-971207/gcc'
make: *** [bootstrap] Error 2

I've tried whatever-you-can-think-of to get some information out of  
this, but gdb reports:

Program generated(1): Memory access exception on address 0xfed3d594  
(invalid address).
Reading in symbols for ./genrecog.c...done.
add_to_sequence (pattern=0x20222, last=0x3fffad0,  
position=0x1098<Address 0x1098 out of bounds>) at ./genrecog.c:391
391                     if (preds[i].codes[1] == 0 && new->code ==  
UNKNOWN)
(gdb) p i
$1 = 9
(gdb) p new
$2 = (struct decision *) 0x2526c
(gdb) p new->code
$3 = UNKNOWN
(gdb) p preds[i].codes[1]
$5 = UNKNOWN
(gdb) list
386                     int j;
387                     int allows_const_int = 0;
388
389                     new->pred = i;
390
391                     if (preds[i].codes[1] == 0 && new->code ==  
UNKNOWN)
392                       {
393                         new->code = preds[i].codes[0];
394                         if (! strcmp ("const_int_operand",  
new->tests))
395                           new->tests = 0, new->pred = -1;

... etc ...

and I can't get any subexpression of line 391 generating a  
segmentation fault, i.e. nothing revealing.

HTH,
Toon.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: giv combination patch
  1997-12-13 10:54   ` Toon Moene
@ 1997-12-13 21:14     ` Richard Henderson
  1997-12-14  3:46       ` giv combination patch mark 2 Richard Henderson
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Henderson @ 1997-12-13 21:14 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs

On Sat, Dec 13, 1997 at 07:52:43PM +0100, Toon Moene wrote:
> Unfortunately, trying this patch against egcs-971207, it doesn't  
> work, over here:
> 
> stage1/xgcc -Bstage1/  -DIN_GCC    -O2 -g  -DHAVE_CONFIG_H   -o  
> genrecog \
>  genrecog.o rtl.o ` case "obstack.o" in ?*) echo obstack.o ;; esac  
> ` ` case "" in ?*) echo  ;; esac ` ` case "" in ?*) echo  ;; esac `
> ./genrecog ./config/m68k/m68k.md > tmp-recog.c

Eek.  Ok, so it's generating wrong code somewhere.  This is for
m68k, right?  At least on Alpha it bootstraps; the annoying fortran
array thing is keeping me from running interesting benchmarks
though. :-P  Gotta fix that soon.

I've got some improvements I want to make this evening; perhaps
I'll find the bug by inspection, or maybe it'll just go away 
with the simplifications.  We'll see.  ;-)

At any rate, we should be able to find the bug via differences in
genrecog.s between egcs & egcs+patch.

r~

^ permalink raw reply	[flat|nested] 12+ messages in thread

* giv combination patch mark 2
  1997-12-13 21:14     ` Richard Henderson
@ 1997-12-14  3:46       ` Richard Henderson
  1997-12-14 11:32         ` Toon Moene
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Henderson @ 1997-12-14  3:46 UTC (permalink / raw)
  To: egcs; +Cc: Toon Moene

> I've got some improvements I want to make this evening; perhaps
> I'll find the bug by inspection ...

I did find a bit in which I was using an uninitialized variable,
so maybe I got it.

One current shortcoming in the code is that if it finds eg on m68k

Biv 39 initialized at insn 48: initial value 2
Insn 62: giv reg 52 src reg 39 benefit 6 used 1 replaceable mult 4 add -4
Insn 81: dest address src reg 39 benefit 10 used 1 replaceable mult 4
 add (plus:SI (reg/v/u:SI 29)
    (const_int -4))

the search in combine_givs will not search the raw bivs, and so we get 

        fmove.s %fp0,(%a2,%d1.l)
        addq.l #4,%d1

instead of a potential fmove.s %fp0,(%a2,%d0.l,4).  Hmm, well actually
fmove has a restricted set of modes doesn't it so that was a bad example.
Anyway, the case holds as well for some x86 examples where the modes
are availible.

Anyone know if anything bad would happen if a giv with mult 1 add 0
was added for every biv?  Or, more to the point, why they are kept
separately?

Unchanged from the previous patch is the lack of ability to recognize
indexes through rank 2 dynamic arrays.

r~

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: giv combination patch mark 2
  1997-12-14  3:46       ` giv combination patch mark 2 Richard Henderson
@ 1997-12-14 11:32         ` Toon Moene
  1997-12-14 11:56           ` Richard Henderson
  0 siblings, 1 reply; 12+ messages in thread
From: Toon Moene @ 1997-12-14 11:32 UTC (permalink / raw)
  To: egcs; +Cc: Richard Henderson

>  I did find a bit in which I was using an uninitialized
>  variable, so maybe I got it.

Probably, as it now dies in a different place:

stage1/xgcc -Bstage1/  -DIN_GCC    -O2 -g  -DHAVE_CONFIG_H   -o  
genextract \
 genextract.o rtl.o ` case "obstack.o" in ?*) echo obstack.o ;;  
esac ` ` case "" in ?*) echo  ;; esac ` ` case "" in ?*) echo  ;;  
esac `
/bin/ld: warning table of contents of library: stage1/libgcc.a not  
sorted slower link editing will result (use the ranlib(1) -s option)
./genextract ./config/m68k/m68k.md > tmp-extract.c
/bin/sh: 14287 Memory fault

Same symptoms:

Starting program:  
/Users/toon/Unix/compilers/egcs-971207/gcc/genextract  
./config/m68k/m68k.md > tmp-extract.c
Program generated(1): Memory access exception on address 0xf22e7bec  
(invalid address).
Reading in symbols for ./genextract.c...done.
0x34d8 in gen_insn (insn=0x6796) at ./genextract.c:187
187         p->dupnums[i] = dupnums[i], p->duplocs[i] = duplocs[i];
(gdb) p i
$1 = 0
(gdb) p p->dupnums[i]
$2 = 0
(gdb) p p->duplocs[i]
$3 = 0x0
(gdb) p p
$4 = (struct extraction *) 0x29080
(gdb) list
182
183       for (i = 0; i < op_count; i++)
184         p->oplocs[i] = oplocs[i];
185
186       for (i = 0; i < dup_count; i++)
187         p->dupnums[i] = dupnums[i], p->duplocs[i] = duplocs[i];
188     }

i.e. none of the expressions in line 187, when evalutated, core  
dumps by itself.  However, the following assembly code is  
interesting (showing the loop in lines 186-187):

0x34ca <gen_insn+478>:  movel d2,d0
0x34cc <gen_insn+480>:  lsll #2,d0
0x34ce <gen_insn+482>:  movel 0(a2)[d0.l],-232002756(a0)
0x34d8 <gen_insn+492>:  movel 0(a1)[d0.l],a0@+
0x34dc <gen_insn+496>:  addql #1,d2
0x34de <gen_insn+498>:  cmpl d2,d1
0x34e0 <gen_insn+500>:  bgt 0x34ca <gen_insn+478>

Note the use of a0 in the third and fourth line - it's extremely  
unlikely that *both* addresses are correct, as they are 230 Mbyte  
apart !  Also note that -232002756(a0) is only a valid address using  
my "m680x0 has 32-bit offsets for x >= 2" patch.

I'm going to rebuild without my patch to see if that makes a  
difference.  OTOH, it might well be a bug in your patch, as both a0  
based addresses are clearly "related" givs.

Cheers,
Toon.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: giv combination patch mark 2
  1997-12-14 11:32         ` Toon Moene
@ 1997-12-14 11:56           ` Richard Henderson
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 1997-12-14 11:56 UTC (permalink / raw)
  To: Toon Moene; +Cc: egcs, Richard Henderson

On Sun, Dec 14, 1997 at 08:11:40PM +0100, Toon Moene wrote:
> 0x34ca <gen_insn+478>:  movel d2,d0
> 0x34cc <gen_insn+480>:  lsll #2,d0
> 0x34ce <gen_insn+482>:  movel 0(a2)[d0.l],-232002756(a0)
> 0x34d8 <gen_insn+492>:  movel 0(a1)[d0.l],a0@+
> 0x34dc <gen_insn+496>:  addql #1,d2
> 0x34de <gen_insn+498>:  cmpl d2,d1
> 0x34e0 <gen_insn+500>:  bgt 0x34ca <gen_insn+478>
> 
> Note the use of a0 in the third and fourth line - it's extremely  
> unlikely that *both* addresses are correct, as they are 230 Mbyte  
> apart !  Also note that -232002756(a0) is only a valid address using  
> my "m680x0 has 32-bit offsets for x >= 2" patch.

Do you have a pointer to that patch handy?  Here I get

        move.l (%a2,%d0.l),12(%a0)
        move.l (%a1,%d0.l),(%a0)+

Of course, the GIVs those came from were 

Insn 432: dest address src reg 29 benefit 9 used 1 lifetime 1 replaceable mult 4
 add (plus:SI (reg/v:SI 30)
    (const_int 60))
Insn 442: dest address src reg 29 benefit 9 used 1 lifetime 1 replaceable mult 4
 add (plus:SI (reg/v:SI 30)
    (const_int 48))
giv at 442 reduced to (reg:SI 112)
giv at 432 reduced to (plus:SI (reg:SI 112)
    (const_int 12))

So I don't immediately see what effect a 32-bit offset could have...


r~

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~1997-12-14 11:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1997-12-07  6:24 Fortran: The Road Ahead Toon Moene
1997-12-07  8:18 ` Joel Sherrill
1997-12-07  8:57   ` Jeffrey A Law
1997-12-07 10:54     ` Toon Moene
1997-12-07 17:04       ` Orn E. Hansen
1997-12-08 11:06 ` Paul Koning
1997-12-12  9:19 ` giv combination patch Richard Henderson
1997-12-13 10:54   ` Toon Moene
1997-12-13 21:14     ` Richard Henderson
1997-12-14  3:46       ` giv combination patch mark 2 Richard Henderson
1997-12-14 11:32         ` Toon Moene
1997-12-14 11:56           ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).