From: "Zack Weinberg"
To: Nick Burrett
Cc: Gabriel Dos Reis, Marc Espie, geoffk@apple.com, gcc@gcc.gnu.org
Subject: Re: gcc 3.5 integration branch proposal
Date: Tue, 20 Jan 2004 03:49:00 -0000
Message-ID: <871xpvp9d7.fsf@egil.codesourcery.com>
In-Reply-To: <400BB40B.4070101@dsvr.net> (Nick Burrett's message of "Mon, 19 Jan 2004 10:40:11 +0000")

Nick Burrett writes:

> There's no harm in that.
> I have a port of GCC 3.3.3 running on a
> 200MHz StrongARM that takes over 6 minutes to compile the following:
>
> #include <iostream>
>
> int main (void)
> {
>   std::cout << "Hello World" << std::endl;
>   return 0;
> }
>
> GCC 2.95.4 compiled the same application on the same hardware in
> around 20-30 seconds.

I compiled GCC 3.4-to-be with profiling instrumentation and ran it
against this test case.  The test takes 2.7 seconds on my roughly
two-year-old Athlon, which I think is far too slow; it should be
<0.01s on this hardware.  Without benefit of PCH or other such
cleverness.

Profiling results are interesting.  First, the times are nearly
identical at -O2 and -fsyntax-only.  This is natural, there really
isn't anything here to optimize, but it's worth noting.

Almost all the time is in the C++ front end.  Here's the top of the
flat profile (-fsyntax-only):

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  4.40      0.67     0.67      661     0.00     0.00  store_bindings
  2.95      1.12     0.45   372916     0.00     0.00  ggc_alloc
  2.95      1.57     0.45   129400     0.00     0.00  make_node
  2.50      1.95     0.38   193075     0.00     0.00  _cpp_lex_direct
  2.36      2.31     0.36    14689     0.00     0.00  grokdeclarator
  2.30      2.66     0.35   103752     0.00     0.00  memset
  2.23      3.00     0.34    96832     0.00     0.00  ht_lookup
  1.90      3.29     0.29   365671     0.00     0.00  htab_find_slot_with_hash
  1.90      3.58     0.29    89353     0.00     0.00  walk_tree
  1.90      3.87     0.29    81277     0.00     0.00  _int_malloc

store_bindings potentially does a tremendous amount of work: it (in
conjunction with its sole caller, maybe_push_to_top_level) temporarily
unwinds the current scope stack, which entails modifying the data
structure for every identifier declared in the program.  Since the
program declares some 8,000 identifiers, you can see why a function
that's called only 661 times ends up at the top of the profile.

I am not sure why ggc_alloc comes in second; checking is disabled so
it isn't doing tons and tons of memset() operations or anything.
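
To put a rough number on the store_bindings cost described above, here
is a toy model, not GCC's actual code.  The types Identifier and
SavedBinding and the functions save_and_clear, restore and simulate are
all hypothetical names invented for this sketch.  It also assumes every
identifier is already declared at each call, which overstates the real
work because declarations accumulate as parsing proceeds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an identifier's current binding; this is
// not GCC's real data structure.
struct Identifier {
    int binding = 0;  // nonzero means "bound in the current scope stack"
};

struct SavedBinding {
    std::size_t index;  // which identifier this entry belongs to
    int binding;        // the binding it had before the unwind
};

// Toy analogue of the unwind step: visit every identifier, remember its
// binding, and clear it.  `touched` counts identifier visits.
std::vector<SavedBinding> save_and_clear(std::vector<Identifier>& ids,
                                         std::size_t& touched) {
    std::vector<SavedBinding> saved;
    saved.reserve(ids.size());
    for (std::size_t i = 0; i < ids.size(); ++i) {
        saved.push_back({i, ids[i].binding});
        ids[i].binding = 0;
        ++touched;
    }
    return saved;
}

// Put every saved binding back where it came from.
void restore(std::vector<Identifier>& ids,
             const std::vector<SavedBinding>& saved) {
    for (const SavedBinding& s : saved) ids[s.index].binding = s.binding;
}

// Total identifier visits for `calls` unwind operations over `n`
// identifiers (assumption: all n identifiers exist from the start).
std::size_t simulate(std::size_t calls, std::size_t n) {
    std::vector<Identifier> ids(n);
    for (std::size_t i = 0; i < n; ++i) ids[i].binding = static_cast<int>(i) + 1;
    std::size_t touched = 0;
    for (std::size_t c = 0; c < calls; ++c) {
        std::vector<SavedBinding> saved = save_and_clear(ids, touched);
        restore(ids, saved);
        assert(ids.empty() || ids[0].binding == 1);  // round trip is lossless
    }
    return touched;
}
```

Under that assumption, simulate(661, 8000) counts 661 * 8000 =
5,288,000 identifier visits, which gives a feel for how a function
called only 661 times can lead the profile.
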
The time spent in make_node is, I suspect, largely due to the inlined
memset in there.  Memset itself is being called mostly by [x]calloc,
and most of *those* calls trace back to walk_tree_without_duplicates
and/or for_each_template_parm, via htab_create_alloc.  Those hash
tables are a kludge to prevent exponential time consumption in certain
algorithms; it would be nice if they weren't necessary.

I think there's room for some easy speedups here, and I think they
could get into 3.4 if developed promptly.  Let's see some patches.

zw
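
The point about hash tables guarding against exponential time can be
illustrated with a small sketch.  This is not GCC's walk_tree; Node,
walk_naive, walk_once and make_shared_chain are hypothetical names, and
std::unordered_set stands in for the hash table.  The assumption is a
"tree" whose subtrees are shared, so a naive walk reaches the same node
along many paths.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical tree node; shared children turn the "tree" into a DAG.
struct Node {
    std::vector<const Node*> kids;
};

// Naive walk: visits a shared subtree once per path that reaches it.
std::size_t walk_naive(const Node* n) {
    std::size_t visits = 1;
    for (const Node* k : n->kids) visits += walk_naive(k);
    return visits;
}

// Walk that records visited nodes in a hash set, so each node is
// visited at most once.
std::size_t walk_once(const Node* n, std::unordered_set<const Node*>& seen) {
    if (!seen.insert(n).second) return 0;  // already visited, skip subtree
    std::size_t visits = 1;
    for (const Node* k : n->kids) visits += walk_once(k, seen);
    return visits;
}

// Builds `depth` nodes where each node points at the next one twice.
std::vector<Node> make_shared_chain(std::size_t depth) {
    assert(depth > 0);  // callers start the walk at element 0
    std::vector<Node> nodes(depth);
    for (std::size_t i = 0; i + 1 < depth; ++i) {
        nodes[i].kids = {&nodes[i + 1], &nodes[i + 1]};
    }
    return nodes;
}
```

For a chain of depth d, walk_naive makes 2^d - 1 visits while walk_once
makes d, at the price of allocating and clearing a hash table per walk,
which is the overhead that shows up as memset and calloc time.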