From mboxrd@z Thu Jan  1 00:00:00 1970
From: Toon Moene <toon@moene.indiv.nluug.nl>
To: law@cygnus.com
Cc: egcs@cygnus.com
Subject: Re: (really Fortran patches) 
Date: Thu, 23 Oct 1997 10:45:00 -0000
Message-id: <9710231737.AA22320@moene.indiv.nluug.nl>
References: <7055.877592152@hurl.cygnus.com>
X-SW-Source: 1997-10/msg00987.html

> This gets more interesting by the minute.

[ ... ]

>  The slowdown turned out to be a bad schedule.  After hand
>  fixing the schedule the version with the USE patch ran
>  faster than the one without the USE patch.

Well, there is another difference between the loops you showed.   
The one produced without your `use' patch has this instruction:

        fmpysub,dbl %fr22,%fr24,%fr22,%fr25,%fr23

whereas with your `use' patch, the following sequence is produced:

        fmpy,dbl %fr24,%fr23,%fr24
        fsub,dbl %fr22,%fr24,%fr22

Now, assuming that the fmpysub instruction really buys you anything  
over the above sequence, that could be another cause of the  
slowdown.

I have no clue why the fmpysub instruction wasn't generated in the  
second case.
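
For reference, here is a rough sketch of what the two quoted sequences do to the registers. It assumes the usual PA-RISC 1.1 reading of `fmpysub rm1,rm2,tm,ra,ta' as two independent operations (tm = rm1 * rm2 and ta = ta - ra) and of `fsub r1,r2,t' as t = r1 - r2. The helper names and the initial register values are made up for illustration; this is a toy model, not anything taken from the compiler.

```python
# Toy register-file model; register names follow the listings above.
# Assumption: fmpysub rm1,rm2,tm,ra,ta performs two independent operations,
#   tm = rm1 * rm2   and   ta = ta - ra,
# with both sets of sources read before either result is written.

def fmpysub(regs, rm1, rm2, tm, ra, ta):
    product = regs[rm1] * regs[rm2]
    difference = regs[ta] - regs[ra]
    regs[tm] = product
    regs[ta] = difference

def fmpy(regs, r1, r2, t):
    # t = r1 * r2
    regs[t] = regs[r1] * regs[r2]

def fsub(regs, r1, r2, t):
    # t = r1 - r2
    regs[t] = regs[r1] - regs[r2]

def run_without_use_patch(regs):
    # fmpysub,dbl %fr22,%fr24,%fr22,%fr25,%fr23
    out = dict(regs)
    fmpysub(out, "fr22", "fr24", "fr22", "fr25", "fr23")
    return out

def run_with_use_patch(regs):
    # fmpy,dbl %fr24,%fr23,%fr24
    # fsub,dbl %fr22,%fr24,%fr22
    out = dict(regs)
    fmpy(out, "fr24", "fr23", "fr24")
    fsub(out, "fr22", "fr24", "fr22")  # reads the %fr24 written just above
    return out

# Hypothetical starting values, chosen only to make the effects visible.
initial = {"fr22": 2.0, "fr23": 3.0, "fr24": 5.0, "fr25": 7.0}
```

In the two-instruction sequence as quoted, the fsub reads the %fr24 that the fmpy has just written, whereas under the assumption above the two halves of the fmpysub do not feed each other. Whether that has anything to do with the instruction not being generated is a guess.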

BTW, does HP really palm off this PA as a *R*ISC architecture?
With a five-operand instruction?  Are you sure there isn't a
`movc5' hiding somewhere, or `index', so that we don't have to do
strength reduction at all (not to mention `editpc' to help the COBOL
programs in SPEC95 :-) ?

Cheers,
Toon.

A RISC architecture is basically a 6600 with 64- instead of 60-bit  
FP registers, a unified integer/address register file and some more  
addressing bits; in short, an Alpha.