public inbox for gcc-bugs@sourceware.org
* [Bug preprocessor/17798] New: cpp memory leak with undefined symbols
@ 2004-10-02 22:24 rth at gcc dot gnu dot org
  2004-10-02 22:38 ` [Bug preprocessor/17798] [3.4/4.0 Regression] " pinskia at gcc dot gnu dot org
                   ` (8 more replies)
  0 siblings, 9 replies; 11+ messages in thread
From: rth at gcc dot gnu dot org @ 2004-10-02 22:24 UTC (permalink / raw)
  To: gcc-bugs

The following generator program

#include <stdio.h>
#include <stdlib.h>

int main()
{
  int i;
  setvbuf (stdout, malloc(32768), _IOFBF, 32768);

  for (i = 0; i < 10000000; ++i)
    printf ("#ifdef M%d\nchar c%d[] = M%d;\n#endif\n", i, i, i);

  return 0;
}

builds a 483MB file which contains nothing but #ifdefs against undefined
symbols.  The preprocessed file should contain nothing but cpp line notes.

With gcc 3.2, cpp0 has peak memory usage of 930MB.
With gcc 3.4 and 4.0, cc1 has peak memory usage of 2010MB.

While this does strike me as a bit silly, apparently there are users trying
this sort of thing.

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=68634

-- 
           Summary: cpp memory leak with undefined symbols
           Product: gcc
           Version: 3.4.3
            Status: UNCONFIRMED
          Keywords: memory-hog
          Severity: normal
          Priority: P2
         Component: preprocessor
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rth at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
@ 2004-10-02 22:38 ` pinskia at gcc dot gnu dot org
  2004-10-03 17:56 ` pinskia at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-02 22:38 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-10-02 22:38 -------
I think this is GC related.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|cpp memory leak with        |[3.4/4.0 Regression] cpp
                   |undefined symbols           |memory leak with undefined
                   |                            |symbols


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
  2004-10-02 22:38 ` [Bug preprocessor/17798] [3.4/4.0 Regression] " pinskia at gcc dot gnu dot org
@ 2004-10-03 17:56 ` pinskia at gcc dot gnu dot org
  2004-10-07  0:33 ` stuartm at connecttech dot com
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-03 17:56 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |3.4.3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
  2004-10-02 22:38 ` [Bug preprocessor/17798] [3.4/4.0 Regression] " pinskia at gcc dot gnu dot org
  2004-10-03 17:56 ` pinskia at gcc dot gnu dot org
@ 2004-10-07  0:33 ` stuartm at connecttech dot com
  2004-10-07  0:39 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: stuartm at connecttech dot com @ 2004-10-07  0:33 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From stuartm at connecttech dot com  2004-10-07 00:33 -------
I was the original reporter. I had generated a file of this format in a
programmatic search to determine all the gcc predefined macros. I was looping
through all the possible macro names. The problem was that my search kept
failing because the compiler crashed.

Since it looked like a memory leak to me, I reported it.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2004-10-07  0:33 ` stuartm at connecttech dot com
@ 2004-10-07  0:39 ` pinskia at gcc dot gnu dot org
  2004-11-01  0:45 ` mmitchel at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-07  0:39 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-10-07 00:39 -------
A better way is to do "~/local/bin/gcc -x c /dev/null -dD -o - -E" and then look for "#define".

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2004-10-07  0:39 ` pinskia at gcc dot gnu dot org
@ 2004-11-01  0:45 ` mmitchel at gcc dot gnu dot org
  2004-12-14  5:28 ` pinskia at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2004-11-01  0:45 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From mmitchel at gcc dot gnu dot org  2004-11-01 00:45 -------
Postponed until GCC 3.4.4.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|3.4.3                       |3.4.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] cpp memory leak with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2004-11-01  0:45 ` mmitchel at gcc dot gnu dot org
@ 2004-12-14  5:28 ` pinskia at gcc dot gnu dot org
  2004-12-14  5:53 ` [Bug preprocessor/17798] [3.4/4.0 Regression] high cpp memory usage " pinskia at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-14  5:28 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-14 05:28 -------
Confirmed.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2004-12-14 05:28:08
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* [Bug preprocessor/17798] [3.4/4.0 Regression] high cpp memory usage with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2004-12-14  5:28 ` pinskia at gcc dot gnu dot org
@ 2004-12-14  5:53 ` pinskia at gcc dot gnu dot org
  2004-12-14 13:56   ` Neil Booth
  2004-12-14 13:57 ` neil at daikokuya dot co dot uk
  2005-02-09 14:05 ` mmitchel at gcc dot gnu dot org
  8 siblings, 1 reply; 11+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-12-14  5:53 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-14 05:53 -------
The first thing is that read_file_guts mallocs the whole file, which seems wrong.  That accounts for
500M.
The next problem is that we keep every identifier we parsed even though we don't need it.
3014 calls for 12,273,008 bytes: thread_a000a1ec | 0x0 | _dyld_start | _start | main | toplev_main |
do_compile | compile_file | c_common_parse_file | c_parse_file | yyparse | yylex | _yylex | c_lex |
c_lex_with_flags | cpp_get_token | _cpp_lex_token | _cpp_handle_directive | do_ifdef | lex_macro_node |
_cpp_lex_token | _cpp_lex_direct | lex_identifier | ht_lookup_with_hash | _obstack_newchunk |
xmalloc | malloc | malloc_zone_malloc

And this is where the problem comes from.
No, there is no leak: we keep a reference to all of these identifiers, but it seems like we should not.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[3.4/4.0 Regression] cpp    |[3.4/4.0 Regression] high
                   |memory leak with undefined  |cpp memory usage with
                   |symbols                     |undefined symbols


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798



* Re: [Bug preprocessor/17798] [3.4/4.0 Regression] high cpp memory usage with undefined symbols
  2004-12-14  5:53 ` [Bug preprocessor/17798] [3.4/4.0 Regression] high cpp memory usage " pinskia at gcc dot gnu dot org
@ 2004-12-14 13:56   ` Neil Booth
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Booth @ 2004-12-14 13:56 UTC (permalink / raw)
  To: pinskia at gcc dot gnu dot org; +Cc: gcc-bugs

pinskia at gcc dot gnu dot org wrote:-

> 
> ------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-14 05:53 -------
> The first thing is that read_file_guts mallocs the whole file, which seems wrong.  That accounts for
> 500M.
> The next problem is that we keep every identifier we parsed even though we don't need it.
> 3014 calls for 12,273,008 bytes: thread_a000a1ec | 0x0 | _dyld_start | _start | main | toplev_main |
> do_compile | compile_file | c_common_parse_file | c_parse_file | yyparse | yylex | _yylex | c_lex |
> c_lex_with_flags | cpp_get_token | _cpp_lex_token | _cpp_handle_directive | do_ifdef | lex_macro_node |
> _cpp_lex_token | _cpp_lex_direct | lex_identifier | ht_lookup_with_hash | _obstack_newchunk |
> xmalloc | malloc | malloc_zone_malloc
>
> And this is where the problem comes from.
> No, there is no leak: we keep a reference to all of these identifiers, but it seems like we should not.

Not doing either of these involves a major rework of cpplib FWIW.

I happen to think it would be beneficial, but I also think that the
whole approach CPP takes needs rethinking.

Neil.



* [Bug preprocessor/17798] [3.4/4.0 Regression] high cpp memory usage with undefined symbols
  2004-10-02 22:24 [Bug preprocessor/17798] New: cpp memory leak with undefined symbols rth at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2004-12-14 13:57 ` neil at daikokuya dot co dot uk
@ 2005-02-09 14:05 ` mmitchel at gcc dot gnu dot org
  8 siblings, 0 replies; 11+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-02-09 14:05 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From mmitchel at gcc dot gnu dot org  2005-02-09 08:13 -------
This is nowhere near release-critical; it's an intentional extreme corner case.

As for the facts noted in the audit trail (i.e., that we lex the whole file up
front, and that we keep all identifiers around the entire time), those are very
sound strategies for most programs.

I've removed the target milestone, and closed as WONTFIX.  If someone chooses to
reopen this, please do not reset the target milestone.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX
   Target Milestone|3.4.4                       |---


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17798


