Date: Wed, 29 Mar 2017 10:17:00 -0000
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH][RFC] Fix P1 PR77498
X-SW-Source: 2017-03/txt/msg01467.txt.bz2

After quite some pondering over this and other related bugs I propose
the following for GCC 7, which tones down PRE a bit (back to the levels
of GCC 6).

Technically it's the wrong place to fix this: we do have measures in
place during elimination, but they are not in effect at -O2.  For GCC 8
I'd like to be more aggressive there, but that would require enabling
predictive commoning at -O2 (with some limits to its unrolling) so as
not to lose optimization opportunities.

The other option is to ignore this issue and postpone the solution
to GCC 8.

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Any preference?

Thanks,
Richard.

2017-03-29  Richard Biener

	PR tree-optimization/77498
	* tree-ssa-pre.c (phi_translate_1): Do not allow simplifications
	to non-constants over backedges.
	* gfortran.dg/pr77498.f: New testcase.

Index: gcc/tree-ssa-pre.c
===================================================================
*** gcc/tree-ssa-pre.c	(revision 246026)
--- gcc/tree-ssa-pre.c	(working copy)
*************** phi_translate_1 (pre_expr expr, bitmap_s
*** 1468,1477 ****
  		   leader for it.  */
  		if (constant->kind != CONSTANT)
  		  {
! 		    unsigned value_id = get_expr_value_id (constant);
! 		    constant = find_leader_in_sets (value_id, set1, set2);
! 		    if (constant)
! 		      return constant;
  		  }
  		else
  		  return constant;
--- 1468,1487 ----
  		   leader for it.  */
  		if (constant->kind != CONSTANT)
  		  {
! 		    /* Do not allow simplifications to non-constants over
! 		       backedges as this will likely result in a loop PHI node
! 		       to be inserted and increased register pressure.
! 		       See PR77498 - this avoids doing predcoms work in
! 		       a less efficient way.  */
! 		    if (find_edge (pred, phiblock)->flags & EDGE_DFS_BACK)
! 		      ;
! 		    else
! 		      {
! 			unsigned value_id = get_expr_value_id (constant);
! 			constant = find_leader_in_sets (value_id, set1, set2);
! 			if (constant)
! 			  return constant;
! 		      }
  		  }
  		else
  		  return constant;
Index: gcc/testsuite/gfortran.dg/pr77498.f
===================================================================
--- gcc/testsuite/gfortran.dg/pr77498.f	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr77498.f	(working copy)
@@ -0,0 +1,36 @@
+! { dg-do compile }
+! { dg-options "-O2 -ffast-math -fdump-tree-pre" }
+
+      subroutine foo(U,V,R,N,A)
+      integer N
+      real*8 U(N,N,N),V(N,N,N),R(N,N,N),A(0:3)
+      integer I3, I2, I1
+C
+      do I3=2,N-1
+       do I2=2,N-1
+        do I1=2,N-1
+         R(I1,I2,I3)=V(I1,I2,I3)
+     *      -A(0)*( U(I1,  I2,  I3  ) )
+     *      -A(1)*( U(I1-1,I2,  I3  ) + U(I1+1,I2,  I3  )
+     *            + U(I1,  I2-1,I3  ) + U(I1,  I2+1,I3  )
+     *            + U(I1,  I2,  I3-1) + U(I1,  I2,  I3+1) )
+     *      -A(2)*( U(I1-1,I2-1,I3  ) + U(I1+1,I2-1,I3  )
+     *            + U(I1-1,I2+1,I3  ) + U(I1+1,I2+1,I3  )
+     *            + U(I1,  I2-1,I3-1) + U(I1,  I2+1,I3-1)
+     *            + U(I1,  I2-1,I3+1) + U(I1,  I2+1,I3+1)
+     *            + U(I1-1,I2,  I3-1) + U(I1-1,I2,  I3+1)
+     *            + U(I1+1,I2,  I3-1) + U(I1+1,I2,  I3+1) )
+     *      -A(3)*( U(I1-1,I2-1,I3-1) + U(I1+1,I2-1,I3-1)
+     *            + U(I1-1,I2+1,I3-1) + U(I1+1,I2+1,I3-1)
+     *            + U(I1-1,I2-1,I3+1) + U(I1+1,I2-1,I3+1)
+     *            + U(I1-1,I2+1,I3+1) + U(I1+1,I2+1,I3+1) )
+        enddo
+       enddo
+      enddo
+      return
+      end
+
+! PRE shouldn't do predictive commonings job here (and in a bad way)
+! ???  It still does but not as bad as it could.  Less prephitmps
+! would be better, pcom does it with 6.
+! { dg-final { scan-tree-dump-times "# prephitmp" 9 "pre" } }
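
For illustration only, here is a minimal C sketch of what reusing a value
over a backedge amounts to at the source level.  It is not taken from the
patch or the PR; the function names and the 1-D stencil are made up, and
the real testcase is the 3-D Fortran loop above.  The second function
carries two loaded values from one iteration to the next, which is roughly
the role of the loop PHI nodes mentioned in the patch comment.  In the 3-D
case there would be many more such values, all live across the loop body.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward form: every iteration loads three neighbouring
   elements from memory.  */
void
stencil_plain (const double *u, double *r, size_t n)
{
  assert (n >= 3);
  for (size_t i = 1; i + 1 < n; i++)
    r[i] = u[i - 1] + u[i] + u[i + 1];
}

/* Hand-written equivalent of reusing loads across the backedge.
   'prev' and 'cur' stand in for loop PHI nodes: each holds a value
   computed in the previous iteration and stays live around the loop.  */
void
stencil_carried (const double *u, double *r, size_t n)
{
  assert (n >= 3);
  double prev = u[0];
  double cur = u[1];
  for (size_t i = 1; i + 1 < n; i++)
    {
      double next = u[i + 1];	/* the only load per iteration */
      r[i] = prev + cur + next;
      prev = cur;		/* values carried to the next iteration */
      cur = next;
    }
}
```

Both functions compute the same result; the second trades two loads per
iteration for two values that stay live around the loop.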