From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22439 invoked by alias); 19 Dec 2009 21:10:26 -0000
Received: (qmail 22392 invoked by uid 48); 19 Dec 2009 21:10:13 -0000
Date: Sat, 19 Dec 2009 21:10:00 -0000
Message-ID: <20091219211013.22391.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug tree-optimization/42108] [4.4/4.5 Regression] 50% performance regression
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "rguenth at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2009-12/txt/msg01939.txt.bz2

------- Comment #45 from rguenth at gcc dot gnu dot org 2009-12-19 21:10 -------
(In reply to comment #41)
> Indeed. The PRE issue could be fixed by fixing PR38819 not in the way it is
> done now but "properly" detect the invalid situations during ANTIC computation
> and simply never mark trapping expressions so. At the current point its
> hard to tell if the insertion is valid because the original expression is
> always executed if the insertion point is - simply because we no longer
> know where the original expression was.
>
> Thus, the "proper" place (err, I think at least) is during translating
> ANTIC_OUT through the basic-block to ANTIC_IN (thus, in clean()). It
> might be a bit expensive, though pre-computing if a basic-block possibly
> exits the CFG could speed this up significantly. Another "proper" place
> would be to add fake edges to exit for each such point in the CFG
> (basically split blocks at each possibly noreturn call and add an edge
> to exit). But that might be even more expensive.

Doing this in a straightforward way shows that the division isn't
partially redundant:

:
  # j_2 = PHI <1(3), j_101(7)>
  jmini_55 = j_2 - i_1;
  D.1530_57 = *nnd_28(D);
  if (i_1 > D.1530_57)
    goto ;
  else
    goto ;

:
  D.1576_60 = D.1530_57 - i_1;
  D.1577_64 = (character(kind=4)) D.1576_60;
  D.1583_68 = (character(kind=4)) D.1582_45;
  countm1.6_69 = D.1577_64 / D.1583_68;
...
  if (countm1.6_69 == 0)
    goto ;
  else
    goto ;

:
...
  if (countm1.6_81 == 0)
    goto ;
  else
    goto ;

:
...
  if (j_2 == D.1560_49)
    goto ;
  else
    goto ;

:
  i_103 = i_1 + 1;
  if (i_1 == D.1582_45)
    goto ;
  else
    goto ;

The division may not be executed at all if i > nnd is always true, which
it is if nnd is <= 2. Thus fixing PRE is not the solution here (LIM will
still move the expensive division if it is proven to not trap by VRP,
though). That is, computing countm1 before the loop entry check, as
suggested by Michael. But then going with the VRP solution sounds like a
better idea to me (to fix this particular regression, that is).

PR42438 tracks the PRE issue now, which IMHO is unrelated to this bug.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
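
For illustration only, here is a reduced C sketch of the shape discussed above. It is not taken from the testcase: the functions guarded() and hoisted(), their parameters, and the jmax bound are all made up. It shows why the division cannot simply be computed before the loop entry check unless the divisor is known to be nonzero, which is the kind of fact VRP would have to provide.

```c
/* Hypothetical reduction, not the original testcase.
   guarded(): the trip-count division sits behind the loop entry check,
   so it is never executed when i > nnd. */
static unsigned guarded(int i, int nnd, unsigned step, int jmax)
{
    unsigned total = 0;
    for (int j = 1; j <= jmax; j++) {
        if (i > nnd)
            continue;                     /* inner loop not entered */
        /* Invariant in j, but may trap if step == 0. */
        unsigned countm1 = (unsigned)(nnd - i) / step;
        total += countm1 + 1;             /* inner loop runs countm1 + 1 times */
    }
    return total;
}

/* hoisted(): the invariant division is computed once, before the entry
   check. This is only a valid transformation if step is known to be
   nonzero, because the division now executes unconditionally. */
static unsigned hoisted(int i, int nnd, unsigned step, int jmax)
{
    unsigned countm1 = (unsigned)(nnd - i) / step;
    unsigned total = 0;
    for (int j = 1; j <= jmax; j++) {
        if (i > nnd)
            continue;
        total += countm1 + 1;
    }
    return total;
}
```

With a nonzero step both versions return the same value. With step == 0 and i > nnd, guarded() simply returns 0 while hoisted() would divide by zero, so the hoisting is only safe once the divisor's range is known.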