From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 16445 invoked by alias); 13 Dec 2009 23:48:32 -0000
Received: (qmail 16359 invoked by uid 48); 13 Dec 2009 23:48:20 -0000
Date: Sun, 13 Dec 2009 23:48:00 -0000
Message-ID: <20091213234820.16358.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "matz at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-12/txt/msg01321.txt.bz2

------- Comment #25 from matz at gcc dot gnu dot org  2009-12-13 23:48 -------
The reason that the testcase is still slow (and that the inner loop isn't
unrolled or vectorized) is still the calculation of countm1.  The division
therein stays in the second inner loop, whereas with GCC 4.3 it can be moved
into the outer loop.

In this specific testcase it's a pass ordering problem: we start with
(at .vrp1) (only parts shown):

  :
  D.1564_45 = *n_9(D);
  if (D.1564_45 > 1)
  ...
  :
  D.1572_60 = *n_9(D);
  if (D.1572_60 > 0)
    goto ;
  else
    goto ;

Here _45 and _60 are equivalent (both load *n_9(D)), but VRP doesn't know
this, hence it doesn't detect the goto as dead.  The equivalence is only
detected after PRE (not by PRE, though :-/ ), which means VRP2 does detect
the jump as dead, and hence leaves only the step>0 case in the code.  But
this is too late for the late PRE (which runs before VRP2 and the loop
optimizers) to move the dependent division to the outer loop.

As the division isn't moved as loop invariant to the outer loop, this also
means that the loop count determination doesn't work, hence no unrolling.
But the slowness itself is due to the div instruction in the second loop,
instead of in the outer loop as with 4.3.
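
For illustration only, the shape of the problem can be sketched in C.  This
is not the PR's testcase: the function names, the loop bounds and the
assumption that the loop runs from 1 to n with a positive step are all
hypothetical, and a compiler will most likely hoist the division in code
this simple.  It only shows where the countm1 division sits in the two
cases described above.

```c
/* Hypothetical loop nest; none of these names come from the PR's testcase. */

/* Trip count minus one for "for (x = 1; x <= n; x += step)",
   assuming n >= 1 and step > 0 (only the step>0 case is modelled). */
static unsigned countm1_for(int n, int step)
{
    return (unsigned)(n - 1) / (unsigned)step;
}

/* Division left in the second-level loop: executed outer * middle times. */
long sum_division_inner(int n, int step, int outer, int middle)
{
    long total = 0;
    if (n < 1 || step <= 0)         /* simplifying assumption, see above */
        return 0;
    for (int i = 0; i < outer; i++)
        for (int j = 0; j < middle; j++) {
            /* n and step never change here, so this value is invariant */
            unsigned countm1 = countm1_for(n, step);
            for (unsigned k = 0; k <= countm1; k++)
                total += 1 + (long)k * step;
        }
    return total;
}

/* Same computation with the division moved by hand into the outer loop:
   executed only outer times. */
long sum_division_hoisted(int n, int step, int outer, int middle)
{
    long total = 0;
    if (n < 1 || step <= 0)
        return 0;
    for (int i = 0; i < outer; i++) {
        unsigned countm1 = countm1_for(n, step);
        for (int j = 0; j < middle; j++)
            for (unsigned k = 0; k <= countm1; k++)
                total += 1 + (long)k * step;
    }
    return total;
}
```

Both functions return the same result; they differ only in how often the
division is executed.  In the second form the innermost trip count is
already known when the middle loop is entered.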

--

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108