From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-74240-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 13216 invoked by alias); 14 May 2003 15:31:50 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 13205 invoked from network); 14 May 2003 15:31:49 -0000
Received: from unknown (HELO omnigroup.com) (198.151.161.1)
  by sources.redhat.com with SMTP; 14 May 2003 15:31:49 -0000
Received: from omnigroup.com (seel.omnigroup.com [198.151.161.19])
	by omnigroup.com (8.10.2/8.9.1) with ESMTP id h4EFVmH06280;
	Wed, 14 May 2003 08:31:48 -0700 (PDT)
Date: Wed, 14 May 2003 15:31:00 -0000
Subject: Re: [tree-ssa] RFC: Dropping INDIRECT_REF variables
Content-Type: text/plain; charset=US-ASCII; format=flowed
Mime-Version: 1.0 (Apple Message framework v552)
Cc: gcc@gcc.gnu.org
To: Diego Novillo <dnovillo@redhat.com>
From: "Timothy J. Wood" <tjw@omnigroup.com>
In-Reply-To: <20030514150917.GA12344@tornado.toronto.redhat.com>
Message-Id: <2D0876A9-8621-11D7-93E7-000A9567A046@omnigroup.com>
Content-Transfer-Encoding: 7bit
X-SW-Source: 2003-05/txt/msg01412.txt.bz2


On Wednesday, May 14, 2003, at 08:09  AM, Diego Novillo wrote:
> - We disable the ability to treat non-aliased pointer
>   dereferences as if they were variables.  We would lose the
>   ability to do some optimizations like:

   Please forgive me if I'm way off base here, but one example on PPC 
that your code snippet reminded me of was int->float conversion.  On 
PPC this happens by storing a pair of constructed 32-bit ints on the 
stack, loading them back as a double, and then doing some more 
contortions with the result.
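
   For anyone who hasn't seen the trick, here is a rough C sketch of 
what those contortions compute.  The helper name is made up for 
illustration, and it assumes IEEE 754 doubles; the constants are the 
ones that appear in the generated code (0x4330 in the high word, and 
the 2^52 + 2^31 literal).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: converts a signed 32-bit int to float using the
 * same bit trick as the PPC sequence, expressed with 64-bit integers so
 * it does not depend on host byte order. */
static float int_to_float_via_double(int32_t x)
{
    /* High word 0x43300000 is the exponent pattern for 2^52; the low
     * word is the input with its sign bit flipped (x + 2^31 as unsigned). */
    uint64_t bits = ((uint64_t)0x43300000u << 32)
                  | (uint64_t)((uint32_t)x ^ 0x80000000u);
    /* Same value as the constant-pool literal: 2^52 + 2^31. */
    uint64_t magic_bits = ((uint64_t)0x43300000u << 32) | 0x80000000u;
    double d, magic;

    memcpy(&d, &bits, sizeof d);
    memcpy(&magic, &magic_bits, sizeof magic);

    /* (2^52 + x + 2^31) - (2^52 + 2^31) == x exactly; then round to float. */
    return (float)(d - magic);
}
```

   Note that the high word of the constructed double never depends on 
the input value.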

   The current problem is that if you have an int->float conversion in 
a loop:

void convert(int *input, float *output, unsigned int count)
{
     while (count--)
         *output++ = *input++;
}

   both words of the double on the stack are written on each iteration, 
even though one of the 32-bit halves is always the same.  That store 
should be hoisted outside the loop (in the degenerate case above this 
can be a pretty big performance win).  I don't have a 3.3 compiler 
right now, but on Mac OS X 10.2 (3.1-based):

cc -mdynamic-no-pic -S -O2 foo.c

.data
.literal8
         .align 3
LC0:
         .long   1127219200
         .long   -2147483648
.text
         .align 2
         .globl _convert
_convert:
         cmpwi cr0,r5,0
         addi r5,r5,-1
         beqlr- cr0
         addi r5,r5,1
         lis r11,ha16(LC0)
         mtctr r5
         lfd f13,lo16(LC0)(r11)
         lis r9,0x4330
L8:
         lwz r0,0(r3)
         addi r3,r3,4
         stw r9,-16(r1)  <--- should be outside the loop
         xoris r0,r0,0x8000
         stw r0,-12(r1)
         lfd f0,-16(r1)
         fsub f0,f0,f13
         frsp f0,f0
         stfs f0,0(r4)
         addi r4,r4,4
         bdnz L8
         blr
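
   At the source level, the result I'd like to see is roughly 
equivalent to the sketch below.  This is only an illustration: the 
function name and the union standing in for the stack slot are my own 
invention, it assumes IEEE 754 doubles, and it probes which word is the 
high half so that it also runs on little-endian hosts.

```c
#include <stdint.h>

/* Hypothetical hand-hoisted version of convert(): the constant high
 * word of the temporary double is stored once, before the loop. */
void convert_hoisted(const int32_t *input, float *output, unsigned int count)
{
    union { double d; uint32_t w[2]; } slot, magic;
    int hi, lo;

    /* 1.0 is 0x3FF00000 in the high word and 0 in the low word, so a
     * nonzero w[0] means the high word comes first (as on PPC). */
    slot.d = 1.0;
    hi = (slot.w[0] != 0) ? 0 : 1;
    lo = 1 - hi;

    /* The constant-pool literal: 2^52 + 2^31. */
    magic.w[hi] = 0x43300000u;
    magic.w[lo] = 0x80000000u;

    /* Loop-invariant store, done once instead of on every iteration. */
    slot.w[hi] = 0x43300000u;

    while (count--) {
        /* Only the low word changes: the input with its sign bit flipped. */
        slot.w[lo] = (uint32_t)*input++ ^ 0x80000000u;
        *output++ = (float)(slot.d - magic.d);
    }
}
```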

   I'm not sure this applies to what you were talking about, but I 
thought I'd bring it up since it looks similar and we could benefit 
from having this optimization.

-tim