From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 26045 invoked by alias); 8 Jun 2010 14:29:48 -0000
Received: (qmail 25998 invoked by uid 48); 8 Jun 2010 14:29:31 -0000
Date: Tue, 08 Jun 2010 14:29:00 -0000
Message-ID: <20100608142931.25997.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug tree-optimization/44423] [4.5/4.6 Regression] Massive performance regression in SSE code due to SRA
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "jamborm at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2010-06/txt/msg00893.txt.bz2

------- Comment #7 from jamborm at gcc dot gnu dot org  2010-06-08 14:29 -------
I don't think I can fix this bug in its most general form without making some
flow-sensitive decisions (which can be difficult for aggregates) and without
causing PR 43846 again.  (Aggregate copy-propagation and either of the two
things described below should do the trick, though.)

As noted, this is caused by the fix for PR 43846, which, on the other hand, is
certainly not necessary for non-aggregate types when we do type punning of
register types through unions.  I've got a two-line patch testing that and it
works (and bootstraps and tests) fine.  However, that is only a change in the
new heuristics, and if the array elements are individually read somewhere else
in the function too, a different decision-making condition will kick in and we
will end up with the replacements and extra statements in the loop again.

Therefore, I now tend to think that these accesses to SSE vectors are a good
reason to simply disallow scalarization of anything that has a non-aggregate
parent in the SRA access tree.  This would only affect type punning through
unions and weird typecasts (none of which could be processed by the previous
SRA).
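
For readers who do not have the testcase at hand, the kind of code in question
can be sketched roughly as follows.  This is only an illustration and not the
testcase from the PR: the names v4sf, union vec4 and sum_scaled are made up,
and it uses the generic GCC vector extension rather than the SSE intrinsics.
The vector member is a register type that is updated as a whole in the loop,
while the array member covering the same bytes is read element by element
after the loop.

```c
#include <assert.h>
#include <string.h>

/* GCC vector extension: four floats treated as one register-sized value. */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical union used for type punning: the vector member is a
   non-aggregate (register) type, the array member gives element-wise
   access to the same storage. */
union vec4 {
    v4sf  v;
    float f[4];
};

/* Accumulate data[] four lanes at a time, then add up the four lanes.
   Assumes n is a multiple of 4. */
static float sum_scaled(const float *data, int n, float scale)
{
    union vec4 acc, s;

    assert(n % 4 == 0);
    acc.v = (v4sf){0.0f, 0.0f, 0.0f, 0.0f};
    s.v = (v4sf){scale, scale, scale, scale};

    for (int i = 0; i < n; i += 4) {
        v4sf x;
        memcpy(&x, data + i, sizeof x);  /* unaligned-safe vector load */
        acc.v += x * s.v;                /* whole-vector update in the loop */
    }

    /* Individual element reads through the other union member. */
    return acc.f[0] + acc.f[1] + acc.f[2] + acc.f[3];
}
```

If the elements of acc.f get their own scalar replacements, every whole-vector
store in the loop has to be followed by statements refreshing them, which is
the extra work in the loop described above.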
Actually, I had this disallowed for most of the time I was developing the new
SRA and decided to allow it only very late.  I don't remember why I did that.
I'm now testing a patch that disallows it again; maybe some testcase will
remind me what the reason was.

I will think about this a bit more, but I will probably soon submit a patch
doing the latter.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44423