From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 3094 invoked by alias); 8 Feb 2006 04:02:18 -0000
Received: (qmail 3084 invoked by uid 48); 8 Feb 2006 04:02:15 -0000
Date: Wed, 08 Feb 2006 04:02:00 -0000
Message-ID: <20060208040215.3083.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug debug/25468] [3.4 Regression] -g makes g++ loop forever
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "wilson at gcc dot gnu dot org"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-02/txt/msg00725.txt.bz2
List-Id:

------- Comment #4 from wilson at gcc dot gnu dot org  2006-02-08 04:02 -------
The problem here is ASM_OUTPUT_ASCII.

The testcase is a big file with a lot of functions that have long function names. It is 4MB of C++ code, producing a 28MB .s file in about 15 seconds. If I add -Q and pipe the output to a file, I get 123MB, which means we have a lot of strings just for the function names.

Adding -g, it takes about 20 minutes to compile the file, producing a .s file of about 108MB. Of the extra 80MB much of it is going through ASM_OUTPUT_ASCII.

If you look at ASM_OUTPUT_ASCII, you see that the code uses putc to print one character at a time while checking for characters that need to be quoted. It also throws in a lot of extra fprintf calls just to make this even slower. So we are trying to feed maybe 70MB through putc, and obviously, that is going to take a lot of time.

If I hack in a definition of ASM_OUTPUT_ASCII that just uses 3 fputs calls, then it compiles in about 16 seconds. This isn't quite right, as we are missing quoting for special characters, but this clearly shows that the problem is here in ASM_OUTPUT_ASCII.

There are some complicating factors here.
Some assemblers can't handle long lines in the .ascii directive, so we deliberately break up the output into lines less than 80 characters. This can perhaps be dropped if we know we are using GNU as. We could maybe document the allowed line lengths for other assemblers instead of assuming it is 80 characters.

ASM_OUTPUT_ASCII is defined in a lot of different files in subtly different ways, and they all have this problem.

If we can assume that function names never contain special characters that need quoting, then we can perhaps solve the problem by not using ASM_OUTPUT_ASCII for this. We could have a simpler and faster routine that doesn't worry about the need to emit quoting characters. This should be true for C. I'm not sure about C++ and the other languages. This is probably a risky assumption to make.

ASM_OUTPUT_ASCII could be faster if it scanned forward for the next special character that needs quoting, then emitted everything up to that character via puts (or whatever), modulo the allowed line length.


-- 

wilson at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu dot
                   |                            |org
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2006-02-08 04:02:15
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25468
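
The scan-forward idea from the last paragraph of the comment could look roughly like the sketch below. It is not GCC's ASM_OUTPUT_ASCII or any proposed patch. The names output_ascii_fast, plain_run_length and MAX_CHUNK are hypothetical, the set of characters treated as special (double quote, backslash, anything non-printable) is an assumption, and so is the 60-character payload limit, chosen only to keep each line under 80 characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cap on payload characters per .ascii line (assumption;
   the comment only says lines are kept under 80 characters).  */
#define MAX_CHUNK 60

/* Nonzero if C can be copied into a quoted assembler string unchanged.
   The choice of special characters here is an assumption.  */
static int
is_plain (unsigned char c)
{
  return c != '"' && c != '\\' && isprint (c);
}

/* Length of the leading run of S (at most LEN bytes) needing no quoting.  */
static size_t
plain_run_length (const char *s, size_t len)
{
  size_t n = 0;
  while (n < len && is_plain ((unsigned char) s[n]))
    n++;
  return n;
}

/* Emit LEN bytes of S as .ascii directives.  Each plain run goes out in
   one fwrite call; only special characters are handled individually.  */
static void
output_ascii_fast (FILE *f, const char *s, size_t len)
{
  size_t col = 0;   /* payload characters written on the current line */
  int open = 0;     /* nonzero while a quoted string is open */

  assert (f != NULL);
  while (len > 0)
    {
      if (!open)
        {
          fputs ("\t.ascii\t\"", f);
          open = 1;
          col = 0;
        }

      size_t run = plain_run_length (s, len);
      if (run > 0)
        {
          /* Clip the run to the room left on this line.  */
          if (run > MAX_CHUNK - col)
            run = MAX_CHUNK - col;
          fwrite (s, 1, run, f);
          col += run;
          s += run;
          len -= run;
        }
      else
        {
          unsigned char c = (unsigned char) *s;
          if (c == '"' || c == '\\')
            {
              putc ('\\', f);
              putc (c, f);
              col += 2;
            }
          else
            {
              /* Three-digit octal escape for non-printable bytes.  */
              fprintf (f, "\\%03o", (unsigned int) c);
              col += 4;
            }
          s++;
          len--;
        }

      if (col >= MAX_CHUNK)
        {
          fputs ("\"\n", f);
          open = 0;
        }
    }
  if (open)
    fputs ("\"\n", f);
}
```

A caller would use it as output_ascii_fast (asm_out_file, name, strlen (name)), where asm_out_file stands for whatever FILE * the assembly is being written to. For a long mangled name with no special characters, the cost is one fwrite per output line, not one putc per character.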