From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 30950 invoked by alias); 24 Mar 2004 09:52:25 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 30854 invoked by alias); 24 Mar 2004 09:52:20 -0000 Date: Wed, 24 Mar 2004 09:52:00 -0000 Message-ID: <20040324095220.30853.qmail@sources.redhat.com> From: "paulthomas2 at wanadoo dot fr" To: gcc-bugs@gcc.gnu.org In-Reply-To: <20040312233606.14563.paulthomas2@wanadoo.fr> References: <20040312233606.14563.paulthomas2@wanadoo.fr> Reply-To: gcc-bugzilla@gcc.gnu.org Subject: [Bug c++/14563] octave built under Cygwin very slow X-Bugzilla-Reason: CC X-SW-Source: 2004-03/txt/msg02880.txt.bz2 List-Id: ------- Additional Comments From paulthomas2 at wanadoo dot fr 2004-03-24 09:52 ------- Subject: Re: octave built under Cygwin very slow bangerth at dealii dot org wrote: >------- Additional Comments From bangerth at dealii dot org 2004-03-14 20:33 ------- >One of the things you could do is to compile the octave version you >use with both 3.2.3 and 3.3.3, in each case with profiling information >(i.e. with -pg) and check the output to see whether there is one single >function that now takes significantly longer (and thus moved up in >the output of gprof, in the list of functions sorted by execution time). > >If there should be one such function, it might be worth to compile just this >one function (or file) with the other compiler, to verify that it indeed >is the problem. This way we could at least localize the problem a little >better. This would also be of great value to us, since we have no clue >about the octave functions, and it is hard for us to look at profiling >output without knowing what all these functions do etc. > >If you have identified the function that has the slowdown (assuming that it >is a single function), then one would rip it out of the program and place >it (and whatever else it needs) into a small file where main() simply calls >this function a number of times with dummy arguments. This way we would >have a simpler and much smaller testcase. > >W. > > > Ben Diedrich and I have done as you suggested. It has to be said that neither of us have tried profiling anything, let alone a code as large and as complicated as octave. I did the gcc-3.2 and Ben did the gcc-3.3. Our machines are nearly identical, both with Windows 2000 and with the same version of Cygwin. We ran a small test programme in octave that shows a factor of approximately 6 difference in execution time, when octave is built with gcc-3.2 (~1.8-2s) or gcc-3.3 (10-12s). When compiled and linked with -pg set, the execution time increased to 9s with gcc-3.2 (Ben, what is the corresponding figure for gcc-3.3?). Switching off the -O2 flags, increased the time by a further 3s but resulted in a much more detailed profile. In the enclosed, we increased the number of loops by a factor of 10, in order to get a reasonable resolution on the less frequently visited functions. So remember that the corresponding wall-clock time is 90s for cycle 2. I have added the most pertinent part of the graphical output. If you want to see the entirity of the gprof output, I can easily forward it to you for both builds. To first order, there is an enormous amount of unaccounted time but both builds are more or less identical in their time in octave functions. The only significant difference that I can see is the appearance in the gcc-3.3 version of [7] 22.3 5.62 0.00 _Unwind_SjLj_Register [7] [9] 12.4 3.11 0.00 _Unwind_SjLj_Unregister [9] which together take a significant amout of time. What are these calls and could they be the culprits? I will be spending a bit of time over the next few days to get a better feel for profiling and its relationship to the execution times without -pg set. Whilst the time spent in various routines does change in correct proportion to the content of the octave code, the absolute magnitudes are way out. I hope that this helps to give you a clue. Any advice that you can offer on the profiling would be gratefully received. Paul Thomas Profiles for octave-2.1.50 built using (i) gcc-3.2 and (ii) gcc-3.3 Paul Thomas and Ben Diedrich 03/23/04 Both run with test programme, entered from octave command line: a = cputime ; tot = 0 ; x = [1:1e6] ; for i = 1:1e6 ; tot = tot + x(i) ; end ; disp(cputime-a) The individual lines in the graphical output can be identified with features in the test programme and the times correspond well, in proportion, with timing tests done in octave. The absolute times are way out. granularity: each sample hit covers 4 byte(s) for 0.04% of 25.18 seconds index % time self children called name >>>>>Built with gcc-3.2 [3] 70.7 2.33 6.14 4+16000183 [3] 0.27 2.15 1000017 tree_index_expression::rvalue(int) [8] 0.29 1.06 3000033 tree_identifier::rvalue(int) [12] 0.24 0.58 1000015 tree_simple_assignment::rvalue() [17] 0.22 0.51 1000022 tree_statement::eval(bool, int, bool) [19] 0.19 0.47 1000009+4 tree_binary_expression::rvalue() [20] 0.28 0.24 1000003 tree_argument_list::convert_to_const_vector(octave_value const*) [23] 0.21 0.26 3000032 tree_identifier::rvalue() [27] 0.14 0.23 1000015 tree_simple_assignment::rvalue(int) [32] 0.17 0.13 1000006 tree_statement_list::eval(bool, int) [36] 0.09 0.17 1000005 make_value_list(tree_argument_list*, string_vector const&, octave_value const*) [40] 0.15 0.09 1000014 tree_index_expression::rvalue() [45] 0.02 0.19 1000000 tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&, bool&) [47] 0.06 0.06 1 tree_simple_for_command::eval() [64] 0.00 0.00 2 octave_user_function::do_multi_index_op(int, octave_value_list const&) [184] 0.00 0.00 3 octave_value::do_multi_index_op(int, octave_value_list const&) [208] 0.00 0.00 2 tree_parameter_list::convert_to_const_vector(tree_va_return_list*) [242] 0.00 0.00 2 tree_if_command::eval() [1079] 0.00 0.00 2 tree_if_command_list::eval() [1084] 0.00 0.00 2 tree_if_clause::eval() [1077] --- >>>>>Built with gcc-3.3 [3] 42.7 3.07 7.68 3+16000180 [3] 0.36 2.66 1000017 tree_index_expression::rvalue(int) [10] 0.36 1.27 3000032 tree_identifier::rvalue(int) [14] 0.41 0.88 1000015 tree_simple_assignment::rvalue() [17] 0.25 0.66 1000021 tree_statement::eval(bool, int, bool) [21] 0.13 0.64 1000009+4 tree_binary_expression::rvalue() [25] 0.26 0.31 1000015 tree_simple_assignment::rvalue(int) [29] 0.29 0.28 1000003 tree_argument_list::convert_to_const_vector(octave_value const*) [30] 0.27 0.29 3000032 tree_identifier::rvalue() [31] 0.30 0.13 1000005 tree_statement_list::eval(bool, int) [35] 0.17 0.17 1000005 make_value_list(tree_argument_list*, string_vector const&, octave_value const*) [44] 0.05 0.24 1000000 tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&, bool&) [49] 0.10 0.10 1000014 tree_index_expression::rvalue() [63] 0.12 0.06 1 tree_simple_for_command::eval() [69] 0.00 0.00 2 octave_user_function::do_multi_index_op(int, octave_value_list const&) [186] 0.00 0.00 2 tree_parameter_list::convert_to_const_vector(tree_va_return_list*) [233] 0.00 0.00 2 octave_value::do_multi_index_op(int, octave_value_list const&) [1058] 0.00 0.00 2 tree_if_command::eval() [1068] 0.00 0.00 2 tree_if_command_list::eval() [1074] 0.00 0.00 2 tree_if_clause::eval() [1066] >>>>Lines that do not appear in the version built with gcc-3.2 [7] 22.3 5.62 0.00 _Unwind_SjLj_Register [7] [9] 12.4 3.11 0.00 _Unwind_SjLj_Unregister [9] [11] 10.2 2.56 0.00 operator new(unsigned int) [11] -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14563