From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31143 invoked by alias); 19 Mar 2002 20:26:02 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 31098 invoked by uid 71); 19 Mar 2002 20:26:01 -0000
Resent-Date: 19 Mar 2002 20:26:01 -0000
Resent-Message-ID: <20020319202601.31097.qmail@sources.redhat.com>
Resent-From: gcc-gnats@gcc.gnu.org (GNATS Filer)
Resent-To: nobody@gcc.gnu.org
Resent-Cc: gcc-prs@gcc.gnu.org, gcc-bugs@gcc.gnu.org, feeley@iro.umontreal.ca
Resent-Reply-To: gcc-gnats@gcc.gnu.org, lucier@math.purdue.edu
Received: (qmail 22843 invoked by uid 61); 19 Mar 2002 20:19:37 -0000
Message-Id: <20020319201936.22840.qmail@sources.redhat.com>
Date: Tue, 19 Mar 2002 12:26:00 -0000
From: lucier@math.purdue.edu
Reply-To: lucier@math.purdue.edu
To: gcc-gnats@gcc.gnu.org
Cc: feeley@iro.umontreal.ca
X-Send-Pr-Version: gnatsweb-2.9.3 (1.1.1.1.2.31)
X-GNATS-Notify: feeley@iro.umontreal.ca
Subject: optimization/6007: cfg cleanup tremendous performance hog with -O1
X-SW-Source: 2002-03/txt/msg00729.txt.bz2
List-Id:

>Number: 6007
>Category: optimization
>Synopsis: cfg cleanup tremendous performance hog with -O1
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: unassigned
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Mar 19 12:26:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: B. Lucier
>Release: 3.1 20020318 (prerelease)
>Organization:
>Environment:
sparc-sun-solaris2.8 Solaris as/ld
>Description:
The input file is

http://www.math.purdue.edu/~lucier/GNATS/GNATS-4/all.i.gz

It takes about 16 hours and about 3 GB of memory to compile this
program on a 500 MHz UltraSPARC II. This is with a profiled version
of cc1.

There is no way that it should take this long with -O1. -O1 is for
local optimizations; it should be the optimization level one can use
when a routine is too large to use -O2.
This is definitely a regression from 2.95*.

banach-169% /home/c/lucier/local/gcc-3.1/lib/gcc-lib/sparc-sun-solaris2.8/3.1/cc1 -fpreprocessed all.i -mptr64 -mstack-bias -mno-v8plus -dumpbase all.c -m64 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -version -fPIC -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -o all.s
GNU CPP version 3.1 20020318 (prerelease) (cpplib) (sparc ELF)
GNU C version 3.1 20020318 (prerelease) (sparc-sun-solaris2.8)
        compiled by GNU C version 3.1 20020318 (prerelease).
options passed:  -fpreprocessed -mptr64 -mstack-bias -mno-v8plus -m64
 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -fPIC
 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing
options enabled:  -fdefer-pop -fomit-frame-pointer -fthread-jumps
 -fpeephole -ffunction-cse -fkeep-static-consts -freg-struct-return
 -fdelayed-branch -fgcse-lm -fgcse-sm -fschedule-insns2
 -fsched-interblock -fsched-spec -fbranch-count-reg -fPIC
 -fcprop-registers -fcommon -fgnu-linker -fargument-alias
 -fmerge-constants -fident -fguess-branch-probability -ftrapping-math
 -mepilogue -mptr64 -m64 -mstack-bias -mcpu=ultrasparc
 -mtune=ultrasparc
 ___H__20_all {GC 104370k -> 44446k} {GC 57858k -> 45642k} {GC 60148k -> 47950k} {GC 85130k -> 47937k} {GC 82606k -> 23072k} {GC 31452k -> 25858k} {GC 33844k -> 26890k} {GC 44252k -> 26464k} ___init_proc ____20_all
Execution times (seconds)
 garbage collection    :    12.87 ( 0%) usr   0.04 ( 0%) sys    17.00 ( 0%) wall
 cfg construction      :   341.31 ( 1%) usr 108.04 (47%) sys   449.00 ( 1%) wall
 cfg cleanup           : 65630.05 (98%) usr   3.19 ( 1%) sys 65995.00 (97%) wall
 life analysis         :   301.17 ( 0%) usr   0.02 ( 0%) sys   304.00 ( 0%) wall
 life info update      :     4.24 ( 0%) usr   0.00 ( 0%) sys     6.00 ( 0%) wall
 preprocessing         :     4.62 ( 0%) usr   4.01 ( 2%) sys     8.00 ( 0%) wall
 lexical analysis      :     5.38 ( 0%) usr   9.45 ( 4%) sys    16.00 ( 0%) wall
 parser                :    19.31 ( 0%) usr   5.38 ( 2%) sys    22.00 ( 0%) wall
 expand                :    10.77 ( 0%) usr   0.52 ( 0%) sys    11.00 ( 0%) wall
 varconst              :     2.87 ( 0%) usr   0.04 ( 0%) sys     3.00 ( 0%) wall
 integration           :     1.91 ( 0%) usr   0.08 ( 0%) sys     3.00 ( 0%) wall
 jump                  :     1.85 ( 0%) usr   0.03 ( 0%) sys     2.00 ( 0%) wall
 CSE                   :    26.05 ( 0%) usr   0.00 ( 0%) sys    25.00 ( 0%) wall
 loop analysis         :     0.15 ( 0%) usr   0.00 ( 0%) sys     0.00 ( 0%) wall
 flow analysis         :   435.58 ( 1%) usr  93.09 (40%) sys   530.00 ( 1%) wall
 combiner              :    37.52 ( 0%) usr   0.02 ( 0%) sys    37.00 ( 0%) wall
 if-conversion         :    35.11 ( 0%) usr   0.01 ( 0%) sys    36.00 ( 0%) wall
 local alloc           :    10.35 ( 0%) usr   0.00 ( 0%) sys    11.00 ( 0%) wall
 global alloc          :    85.55 ( 0%) usr   4.51 ( 2%) sys   134.00 ( 0%) wall
 reload CSE regs       :   220.85 ( 0%) usr   0.02 ( 0%) sys   223.00 ( 0%) wall
 flow 2                :    11.20 ( 0%) usr   0.04 ( 0%) sys    15.00 ( 0%) wall
 if-conversion 2       :     0.62 ( 0%) usr   0.03 ( 0%) sys     3.00 ( 0%) wall
 rename registers      :     6.03 ( 0%) usr   0.07 ( 0%) sys     9.00 ( 0%) wall
 scheduling 2          :    12.87 ( 0%) usr   0.00 ( 0%) sys    12.00 ( 0%) wall
 delay branch sched    :     8.40 ( 0%) usr   0.00 ( 0%) sys     8.00 ( 0%) wall
 shorten branches      :     0.82 ( 0%) usr   0.00 ( 0%) sys     3.00 ( 0%) wall
 final                 :     1.52 ( 0%) usr   0.03 ( 0%) sys     6.00 ( 0%) wall
 rest of compilation   :    11.21 ( 0%) usr   2.34 ( 1%) sys   188.00 ( 0%) wall
 TOTAL                 : 67240.23           230.97           68076.00

Here is the top of the profile. The whole gprof output can be found at:

http://www.math.purdue.edu/~lucier/GNATS/GNATS-4/all.i.gprof.gz

Flat profile:

Each sample counts as 0.01 seconds.
   %  cumulative     self                self    total
  time   seconds  seconds      calls  s/call   s/call  name
 37.31   1106.44  1106.44 2123667007    0.00     0.00  try_crossjump_to_edge
 11.27   1440.70   334.26                              internal_mcount
  6.85   1643.67   202.97     395788    0.00     0.00  cselib_invalidate_regno
  6.53   1837.39   193.72                              htab_traverse
  4.67   1975.95   138.56       4987    0.03     0.03  propagate_freq
  2.87   2061.08    85.13         29    2.94     2.94  find_unreachable_blocks
  2.50   2135.09    74.01         15    4.93     4.94  calc_idoms
  2.48   2208.53    73.44     468802    0.00     0.00  try_forward_edges
  2.46   2281.48    72.95  173160573    0.00     0.00  cached_make_edge
  2.41   2353.01    71.53  175996207    0.00     0.00  bitmap_operation
  2.14   2416.55    63.54          9    7.06    15.68  calculate_global_regs_live
  2.14   2480.01    63.46         15    4.23     4.23  calc_dfs_tree_nonrec
  1.90   2536.38    56.37  173039017    0.00     0.00  make_label_edge
  1.60   2583.90    47.52          3   15.84    31.63  flow_loops_find
  0.98   2613.04    29.14  173160566    0.00     0.00  free_edge
  0.97   2641.89    28.85          5    5.77     5.77  mark_dfs_back_edges
  0.70   2662.75    20.86          3    6.95    64.74  estimate_bb_frequencies
  0.60   2680.56    17.81                              __lshrdi3
  0.56   2697.23    16.67     266211    0.00     0.00  record_one_conflict
  0.55   2713.54    16.31       5058    0.00     0.00  flow_loop_exit_edges_find
  0.55   2729.73    16.19      64492    0.00     0.00  purge_dead_edges
  0.51   2744.83    15.10   43310765    0.00     0.00  remove_edge
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
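
For anyone wanting to check the "Execution times" figures mechanically, the following is a minimal Python sketch that parses per-pass rows in the format shown in the report and finds the pass with the largest user time. The function name `parse_time_report` and the regular expression are illustrative assumptions, not part of GCC or of this report; the four sample rows and the TOTAL usr value are copied from the report, and the pattern tolerates fused columns such as `sys65995.00`.

```python
import re

# One per-pass row: "name : U ( p%) usr S ( p%) sys W ( p%) wall".
# Whitespace between columns is optional because wide values can fuse
# with the preceding column label (e.g. "sys65995.00").
ROW = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*"
    r"(?P<usr>[\d.]+)\s*\(\s*\d+%\)\s*usr\s*"
    r"(?P<sys>[\d.]+)\s*\(\s*\d+%\)\s*sys\s*"
    r"(?P<wall>[\d.]+)\s*\(\s*\d+%\)\s*wall\s*$"
)


def parse_time_report(text):
    """Return {pass name: (usr, sys, wall)} for each per-pass row in text."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m["name"]] = (float(m["usr"]), float(m["sys"]), float(m["wall"]))
    return rows


# Four rows copied from the report (hypothetical subset chosen for brevity).
SAMPLE = """\
 cfg construction : 341.31 ( 1%) usr 108.04 (47%) sys 449.00 ( 1%) wall
 cfg cleanup :65630.05 (98%) usr 3.19 ( 1%) sys65995.00 (97%) wall
 life analysis : 301.17 ( 0%) usr 0.02 ( 0%) sys 304.00 ( 0%) wall
 flow analysis : 435.58 ( 1%) usr 93.09 (40%) sys 530.00 ( 1%) wall
"""
TOTAL_USR = 67240.23  # "TOTAL" usr column from the report

rows = parse_time_report(SAMPLE)
worst = max(rows, key=lambda name: rows[name][0])  # largest user time
share = rows[worst][0] / TOTAL_USR
print(f"{worst}: {rows[worst][0]:.2f} s usr, {share:.1%} of total usr")
```

On these rows the sketch singles out cfg cleanup, consistent with the 98% figure printed in the report.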