From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-65939-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 6771 invoked by alias); 8 Jan 2003 11:45:17 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 6763 invoked from network); 8 Jan 2003 11:45:13 -0000
Received: from unknown (HELO nile.gnat.com) (205.232.38.5)
  by 209.249.29.67 with SMTP; 8 Jan 2003 11:45:13 -0000
Received: by nile.gnat.com (Postfix, from userid 338)
	id 17181F2DB9; Wed,  8 Jan 2003 06:45:01 -0500 (EST)
To: gcc@gcc.gnu.org, marcel_cox@hotmail.com
Subject: Re: An unusual Performance approach using Synthetic registers
Message-Id: <20030108114501.17181F2DB9@nile.gnat.com>
Date: Wed, 08 Jan 2003 12:27:00 -0000
From: dewar@gnat.com (Robert Dewar)
X-SW-Source: 2003-01/txt/msg00420.txt.bz2

> What you're describing is actually bad on the Pentium, and probably
> subsequent implementations as well.
> The Pentium can dual-issue loads as long as they reference separate cache
> ways. So, manually sorting the stack so contiguous accesses are localized
> increases the probability of the loads accessing the same cache way, thus
> decreasing the probability of dual-issuing.

I would guess that this penalty would be outweighed by the improvement in
icache behavior that comes from using shorter offsets.
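
To make the size argument concrete, here is a minimal sketch in C. It
assumes 32-bit x86 code, an EBP-based frame, and a plain one-byte-opcode
load of the form `mov reg, [ebp + disp]` with no prefixes. Under those
assumptions the instruction takes 3 bytes when the displacement fits in a
signed 8-bit value and 6 bytes otherwise. The function names are
hypothetical and exist only for this illustration; nothing here is taken
from GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encoded size in bytes of `mov reg, [ebp + disp]` in 32-bit mode
   (assumption: one opcode byte, one ModRM byte, no prefixes).
   An EBP base always carries a displacement: 1 byte if it fits in a
   signed 8-bit value, 4 bytes otherwise. */
size_t mov_ebp_disp_len(int32_t disp)
{
    return 2 + ((disp >= -128 && disp <= 127) ? 1 : 4);
}

/* Total code bytes needed for one load from each frame offset. */
size_t total_load_bytes(const int32_t *offsets, size_t n)
{
    assert(n == 0 || offsets != NULL);
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += mov_ebp_disp_len(offsets[i]);
    return total;
}
```

Calling total_load_bytes on the offsets {-4, -8, -12} gives 9 bytes, while
{-4, -200, -400} gives 15 bytes, so keeping frequently used slots within
128 bytes of the frame pointer shrinks the code that touches them.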