From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10329 invoked by alias); 23 Jan 2014 09:13:48 -0000 Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-bugs-owner@gcc.gnu.org Received: (qmail 10273 invoked by uid 55); 23 Jan 2014 09:13:43 -0000 From: "dodji at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug preprocessor/58580] [4.8 Regression] preprocessor goes OOM with warning for zero literals Date: Thu, 23 Jan 2014 09:13:00 -0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: preprocessor X-Bugzilla-Version: 4.8.1 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: dodji at gcc dot gnu.org X-Bugzilla-Status: REOPENED X-Bugzilla-Priority: P2 X-Bugzilla-Assigned-To: dodji at gcc dot gnu.org X-Bugzilla-Target-Milestone: 4.8.3 X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-SW-Source: 2014-01/txt/msg02419.txt.bz2 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58580 --- Comment #10 from Dodji Seketeli --- Author: dodji Date: Thu Jan 23 09:13:08 2014 New Revision: 206957 URL: http://gcc.gnu.org/viewcvs?rev=206957&root=gcc&view=rev Log: PR preprocessor/58580 - preprocessor goes OOM with warning for zero literals In this problem report, the compiler is fed a (bogus) translation unit in which some literals contain bytes whose value is zero. The preprocessor detects that and proceeds to emit diagnostics for that king of bogus literals. But then when the diagnostics machinery re-reads the input file again to display the bogus literals with a caret, it attempts to calculate the length of each of the lines it got using fgets. The line length calculation is done using strlen. But that doesn't work well when the content of the line can have several zero bytes. The result is that the read_line never sees the end of the line because strlen repeatedly reports that the line ends before the end-of-line character; so read_line thinks its buffer for reading the line is too small; it thus increases the buffer, leading to a huge memory consumption and disaster. Here is what this patch does. location_get_source_line is modified to return the length of a source line that can now contain bytes with zero value. diagnostic_show_locus() is then modified to consider that a line can have characters of value zero, and so just shows a white space when instructed to display one of these characters. Additionally location_get_source_line is modified to avoid re-reading each and every line from the beginning of the file until it reaches the line number N that it is instructed to get; this was leading to annoying quadratic behaviour when reading adjacent lines near the end of (big) files. So a cache is now associated to the file opened in text mode. When the content of the file is read, that content is stashed in the file cache. That file cache is searched for line delimiters. A number of line positions are saved in the cache and a number of file caches are kept in memory. That way when location_get_source_line is asked to read line N + 1, it just has to start reading from line N that it has already read. libcpp/ChangeLog: * include/line-map.h (linemap_get_file_highest_location): Declare new function. * line-map.c (linemap_get_file_highest_location): Define it. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter. (void diagnostics_file_cache_fini): Declare new function. * input.c (struct fcache): New type. (fcache_tab_size, fcache_buffer_size, fcache_line_record_size): New static constants. (diagnostic_file_cache_init, total_lines_num) (lookup_file_in_cache_tab, evicted_cache_tab_entry) (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab) (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data) (get_next_line, read_next_line, goto_next_line, read_line_num): New static function definitions. (diagnostic_file_cache_fini): New function. (location_get_source_line): Take an additional output line_len parameter. Re-write using lookup_or_add_file_to_cache_tab and read_line_num. * diagnostic.c (diagnostic_finish): Call diagnostic_file_cache_fini. (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. * Makefile.in: Add vec.o to OBJS-libcommon gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. Signed-off-by: Dodji Seketeli Added: trunk/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c Modified: trunk/gcc/ChangeLog trunk/gcc/Makefile.in trunk/gcc/diagnostic.c trunk/gcc/diagnostic.h trunk/gcc/input.c trunk/gcc/input.h trunk/gcc/testsuite/ChangeLog trunk/libcpp/ChangeLog trunk/libcpp/include/line-map.h trunk/libcpp/line-map.c