From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 6771 invoked by alias); 8 Jan 2003 11:45:17 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 6763 invoked from network); 8 Jan 2003 11:45:13 -0000
Received: from unknown (HELO nile.gnat.com) (205.232.38.5)
  by 209.249.29.67 with SMTP; 8 Jan 2003 11:45:13 -0000
Received: by nile.gnat.com (Postfix, from userid 338)
  id 17181F2DB9; Wed, 8 Jan 2003 06:45:01 -0500 (EST)
To: gcc@gcc.gnu.org, marcel_cox@hotmail.com
Subject: Re: An unusual Performance approach using Synthetic registers
Message-Id: <20030108114501.17181F2DB9@nile.gnat.com>
Date: Wed, 08 Jan 2003 12:27:00 -0000
From: dewar@gnat.com (Robert Dewar)
X-SW-Source: 2003-01/txt/msg00420.txt.bz2

> What you're describing is actually bad on the Pentium, and probably
> subsequent implementations as well.
> The Pentium can dual-issue loads as long as they reference separate cache
> ways. So, manually sorting the stack so contiguous accesses are localized
> increases the probability of the loads accessing the same cache way, thus
> decreasing the probability of dual-issuing.

I would guess this would be dominated by the improvement in icache behavior
from the use of shorter offsets.
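
To put a rough number on the shorter-offsets point, here is a minimal
sketch, not a measurement. It assumes 32-bit x86 with EBP-relative frame
addressing and no SIB byte or prefixes, where `mov eax, [ebp+disp]` takes
3 bytes when the displacement fits in a signed byte and 6 bytes otherwise.
The function names and the frame layout are made up for illustration.

```python
# Sketch: encoded size of `mov eax, [ebp+disp]` on 32-bit x86.
# Assumption: EBP-based frame addressing, no SIB byte, no prefixes.

def mov_load_size(disp: int) -> int:
    """Bytes needed to encode `mov eax, [ebp+disp]`."""
    opcode = 1  # 0x8B
    modrm = 1
    # With EBP as the base there is no zero-displacement form, so the
    # displacement is 1 byte if it fits in a signed byte, else 4 bytes.
    disp_bytes = 1 if -128 <= disp <= 127 else 4
    return opcode + modrm + disp_bytes


def code_bytes(offsets) -> int:
    """Total code bytes for one load per listed frame offset."""
    return sum(mov_load_size(d) for d in offsets)


# Hypothetical frame: two hot locals, each loaded 10 times.
hot_near = [-4, -8] * 10      # hot locals placed close to EBP
hot_far = [-260, -264] * 10   # same locals placed beyond the disp8 range

print(code_bytes(hot_near))  # 60
print(code_bytes(hot_far))   # 120
```

Under these assumptions, moving the frequently used locals within 128
bytes of the frame pointer halves the code bytes spent on those loads,
which is the icache effect I have in mind. Whether that outweighs any lost
dual-issue on a given chip would have to be measured.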