From: Brad Lucier <lucier@math.purdue.edu>
Message-Id: <200204041456.g34Eunh29854@banach.math.purdue.edu>
Subject: Re: optimization/6007: cfg cleanup tremendous performance hog with -O1
To: jh@suse.cz (Jan Hubicka)
Date: Thu, 04 Apr 2002 07:02:00 -0000
Cc: lucier@math.purdue.edu (Brad Lucier), jh@suse.cz (Jan Hubicka), dje@watson.ibm.com (David Edelsohn), gcc@gcc.gnu.org, mark@codesourcery.com, feeley@iro.umontreal.ca
In-Reply-To: <20020329162904.GK2886@atrey.karlin.mff.cuni.cz> from "Jan Hubicka" at Mar 29, 2002 05:29:04 PM

Honza and I have exchanged some off-list e-mail about this problem.
Real life has intervened, and I doubt that he will have time to work
on it before the scheduled release of April 15, so I'll attempt to
summarize what's been happening.  I think I need some help.

All of this is with the compile options -O1 -fschedule-insns2.

At first, this was the profile on my all.i test:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds      calls  s/call   s/call  name
 37.31   1106.44  1106.44 2123667007    0.00     0.00  try_crossjump_to_edge
 11.27   1440.70   334.26                              internal_mcount
  6.85   1643.67   202.97     395788    0.00     0.00  cselib_invalidate_regno
  6.53   1837.39   193.72                              htab_traverse
  4.67   1975.95   138.56       4987    0.03     0.03  propagate_freq
  2.87   2061.08    85.13         29    2.94     2.94  find_unreachable_blocks
  2.50   2135.09    74.01         15    4.93     4.94  calc_idoms
  2.48   2208.53    73.44     468802    0.00     0.00  try_forward_edges
  2.46   2281.48    72.95  173160573    0.00     0.00  cached_make_edge
  2.41   2353.01    71.53  175996207    0.00     0.00  bitmap_operation
 ...

Here cleanup_cfg took > 98% of the 18 hours of CPU time on my
UltraSPARC.

Honza sent me a patch on March 28 that disabled try_crossjump_bb when
a block has more than 100 outgoing edges.  That changed the profile on
a slightly smaller example (denoise3.i) to:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds      calls  s/call   s/call  name
 63.29   1219.40  1219.40      33525    0.04     0.04  try_crossjump_to_edge
  8.19   1377.13   157.73                              internal_mcount
  5.08   1474.99    97.86                              htab_traverse
  2.52   1523.56    48.57     206655    0.00     0.00  try_forward_edges
  2.40   1569.81    46.25         31    1.49     1.49  find_unreachable_blocks
  2.16   1611.38    41.57       2905    0.01     0.01  propagate_freq
  1.40   1638.30    26.91          9    2.99     5.48  calculate_global_regs_live
  1.33   1663.98    25.68         15    1.71     1.71  calc_idoms
  1.20   1687.02    23.04         15    1.54     1.54  calc_dfs_tree_nonrec

Basically, try_crossjump_bb was no longer a problem, but
try_crossjump_to_edge still is; cleanup_cfg still took > 87% of the
CPU time.

Honza suggested (several times) that one way to deal with this is to
disable try_crossjump_to_edge and try_crossjump_bb unless we use -O2
or higher.  These algorithms are O(N^3) in the number of edges, and in
my programs the number of edges is quadratic in the program size,
since I use a lot of computed gotos and label addresses.
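To make the quadratic-edge claim concrete: as far as GCC knows, each
computed goto can jump to every label whose address is taken, so code
shaped like the following toy fragment (an illustration I made up for
this message, not something from all.i) gets one CFG edge per
(goto, label) pair; with N gotos and N labels, that is N^2 edges:

/* Toy illustration: with L address-taken labels and G computed
   gotos, the CFG has about G * L edges, since each "goto *x" may
   reach any label in OPS.  Generated code in which both G and L
   grow with program size N therefore has O(N^2) edges.  */
void
interpret (int *pc)
{
  static void *ops[] = { &&op0, &&op1, &&op2 };   /* 3 labels  */
  goto *ops[*pc++];                               /* goto #1   */
op0:
  goto *ops[*pc++];                               /* goto #2   */
op1:
  goto *ops[*pc++];                               /* goto #3   */
op2:
  return;
}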
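And in case it helps to have something concrete to point at, here is
the shape I imagine the guard plus the -O2 gate taking in
cfgcleanup.c.  This is only a sketch reconstructed from Honza's
description, not his actual patch: the function signature, the edge
walk, and where the cutoff sits are my guesses; only the function
name and the 100-edge cutoff come from what he told me.

/* Sketch only: gate crossjumping on the optimization level and give
   up early on blocks with very many outgoing edges.  */

#define MAX_CROSSJUMP_EDGES 100      /* the cutoff Honza mentioned */

static bool
try_crossjump_bb (int mode, basic_block bb)
{
  edge e;
  int n_edges = 0;

  /* Crossjumping is cubic in the number of edges, so only pay for
     it at -O2 and above.  */
  if (optimize < 2)
    return false;

  /* Bail out when BB has too many outgoing edges; computed gotos
     make this count quadratic in program size in code like mine.  */
  for (e = bb->succ; e; e = e->succ_next)
    if (++n_edges > MAX_CROSSJUMP_EDGES)
      return false;

  /* ... the existing crossjumping logic would go here ...  */
  return false;
}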
I've looked at cfgcleanup.c and don't really know how to proceed.
Can someone suggest a reasonable way to fix this?

Brad