From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 17068 invoked by alias); 22 Apr 2002 21:32:24 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 17030 invoked from network); 22 Apr 2002 21:32:22 -0000
Received: from unknown (HELO Angel.zoy.org) (12.236.86.18)
  by sources.redhat.com with SMTP; 22 Apr 2002 21:32:22 -0000
Received: by Angel.zoy.org (Postfix, from userid 1000)
  id 5111EB89B; Mon, 22 Apr 2002 14:32:22 -0700 (PDT)
Date: Mon, 22 Apr 2002 14:33:00 -0000
From: Michel LESPINASSE
To: gcc list
Subject: GCC performance regression - its memset !
Message-ID: <20020422213222.GA21429@zoy.org>
References: <20020421005718.GA16378@zoy.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20020421005718.GA16378@zoy.org>
User-Agent: Mutt/1.3.28i
X-SW-Source: 2002-04/txt/msg01115.txt.bz2

OK, so I worked more to find the cause of the slowdown, and I figured
out it's all because of memset(). This function seems to be about
twice as slow as in 2.95, and also for some reason the time spent in
memset does not show up in gprof.

Here is a test case:

--------------------------- foo.c ------------------------------
#include <string.h>

short table[64];

void bar (void);

int main (void)
{
	int i;

	bar ();

	for (i = 0; i < 100000000; i++)
		memset (table + 1, 0, 63 * sizeof(short));
	return 0;
}
----------------------------- end of foo.c ------------------------

----------------------------- bar.c -------------------------------
void bar (void)
{
}
----------------------------- end of bar.c ------------------------

# gcc-2.95 -g -O3 -p foo.c bar.c
# time ./a.out
./a.out  5.75s user 0.00s system 100% cpu 5.739 total
# gprof -bp ./a.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.00      5.74     5.74                             main
  0.00      5.74     0.00        1     0.00     0.00  bar

# gcc-3.1 -g -O3 -p foo.c bar.c
# time ./a.out
./a.out  10.78s user 0.00s system 101% cpu 10.634 total
# gprof -bp ./a.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.00      0.62     0.62                             main
  0.00      0.62     0.00        1     0.00     0.00  bar

The gcc-3.1 snapshot is about twice as slow as 2.95 on that test case
(10.78s vs. 5.75s user time), and for some reason the gprof output is
bogus: it reports only 0.62 seconds and does not account for the time
spent in memset, while with 2.95 it was not bogus.

I did not know my code spent that much time in memset; I'll see what I
can do about it.

Hope this helps,

--
Michel "Walken" LESPINASSE
Is this the best that god can do ? Then I'm not impressed.
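
Since the gprof numbers above do not account for the memset time under
gcc-3.1, one way to cross-check is to time the loop from inside the
program with clock(). What follows is only a sketch and not part of the
original test case: the function name time_memset_loop, the
MEMSET_BENCH_MAIN macro, and the empty asm barrier (a GNU extension, so
GCC or a compatible compiler is assumed) are all made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

short table[64];

/* Runs the memset from foo.c `iterations` times and returns the CPU
   seconds consumed, measured with clock() instead of gprof sampling.
   Hypothetical helper, not part of the original test case. */
double time_memset_loop(long iterations)
{
    clock_t start;
    long i;

    assert(iterations >= 0);

    start = clock();
    for (i = 0; i < iterations; i++) {
        memset(table + 1, 0, 63 * sizeof(short));
        /* Assumption: GNU-style inline asm is available. The empty
           statement with a "memory" clobber keeps the optimizer from
           merging or dropping the repeated stores. */
        __asm__ volatile("" : : : "memory");
    }

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

#ifdef MEMSET_BENCH_MAIN
/* Optional driver, enabled with -DMEMSET_BENCH_MAIN. */
int main(void)
{
    printf("memset loop: %.2f s CPU\n", time_memset_loop(100000000L));
    return 0;
}
#endif
```

Built with something like "gcc -O3 -DMEMSET_BENCH_MAIN bench.c" (bench.c
being a placeholder file name), this prints the CPU time for the same
100000000 iterations, which can then be compared across compiler
versions without relying on the profiler.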