public inbox for gcc-bugs@sourceware.org
From: "dodji at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug preprocessor/58580] [4.8 Regression] preprocessor goes OOM with warning for zero literals
Date: Thu, 23 Jan 2014 09:13:00 -0000
Message-ID: <bug-58580-4-S8oiL1fg0o@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-58580-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58580

--- Comment #10 from Dodji Seketeli <dodji at gcc dot gnu.org> ---
Author: dodji
Date: Thu Jan 23 09:13:08 2014
New Revision: 206957

URL: http://gcc.gnu.org/viewcvs?rev=206957&root=gcc&view=rev
Log:
PR preprocessor/58580 - preprocessor goes OOM with warning for zero literals

In this problem report, the compiler is fed a (bogus) translation unit
in which some literals contain bytes whose value is zero.  The
preprocessor detects that and proceeds to emit diagnostics for that
kind of bogus literal.  But then when the diagnostics machinery
re-reads the input file to display the bogus literals with a caret,
it attempts to calculate the length of each of the lines it got using
fgets.  The line length calculation is done using strlen.  But that
doesn't work when the content of the line can contain zero bytes.
The result is that read_line never sees the end of the line because
strlen repeatedly reports that the line ends before the end-of-line
character; so read_line thinks its buffer for reading the line is too
small; it thus keeps growing the buffer, leading to huge memory
consumption and disaster.
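
To illustrate the failure mode, here is a minimal sketch (not the
code from this patch).  The helper line_length is hypothetical; it
only shows that memchr, unlike strlen, is not fooled by zero bytes
when looking for the end of a line held in a buffer of known size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: return the number of bytes in
   the first line stored in BUF (SIZE bytes long), counting the
   terminating '\n' if there is one.  memchr scans up to SIZE bytes
   and does not stop at zero bytes, which strlen would do.  */
static size_t
line_length (const char *buf, size_t size)
{
  assert (buf != NULL);
  const char *nl = (const char *) memchr (buf, '\n', size);
  return nl ? (size_t) (nl - buf) + 1 : size;
}
```

For the four bytes 'a', 0, 'b', '\n', strlen reports 1 whereas
line_length reports 4, so a reader relying on strlen would never
notice that the newline is already in its buffer.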

Here is what this patch does.

location_get_source_line is modified to return the length of a source
line that can now contain bytes with zero value.
diagnostic_show_locus() is then modified to consider that a line can
have characters of value zero, and so just shows a white space when
instructed to display one of these characters.
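
The display rule can be sketched as follows.  This is an illustrative
stand-alone helper under an assumed name (render_line), not the body
of diagnostic_show_locus: it walks the line by its explicit length
and substitutes a space for every zero byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy LEN bytes of LINE into OUT, writing a
   space wherever LINE holds a zero byte, then NUL-terminate OUT.
   OUT must have room for LEN + 1 bytes.  */
static void
render_line (const char *line, size_t len, char *out)
{
  assert (line != NULL && out != NULL);
  for (size_t i = 0; i < len; i++)
    out[i] = line[i] == '\0' ? ' ' : line[i];
  out[len] = '\0';
}
```

Because the loop is bounded by the length rather than by the first
zero byte, columns after the zero byte keep their position, which is
what a caret pointing into the line needs.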

Additionally location_get_source_line is modified to avoid re-reading
each and every line from the beginning of the file until it reaches
the line number N that it is instructed to get; this was leading to
annoying quadratic behaviour when reading adjacent lines near the end
of (big) files.  So a cache is now associated with the file opened in
text mode.  When the content of the file is read, that content is
stashed in the file cache.  That file cache is searched for line
delimiters.  A number of line positions are saved in the cache and a
number of file caches are kept in memory.  That way when
location_get_source_line is asked to read line N + 1, it just has to
start reading from line N that it has already read.
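
The idea of remembering line positions can be sketched like this.  It
is a simplified illustration, not struct fcache from input.c: the
type line_cache, its functions, and the MAX_RECORDS cap are all
assumptions, and the file content is taken to be already in memory
instead of being read incrementally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECORDS 64	/* Hypothetical cap on recorded lines.  */

/* Hypothetical, simplified per-file cache: the file content plus the
   start offset of every line found so far.  */
struct line_cache
{
  const char *data;
  size_t size;
  size_t line_start[MAX_RECORDS];	/* Offset of line I + 1.  */
  size_t nlines;			/* Number of recorded lines.  */
};

static void
line_cache_init (struct line_cache *c, const char *data, size_t size)
{
  c->data = data;
  c->size = size;
  c->line_start[0] = 0;
  c->nlines = size > 0 ? 1 : 0;
}

/* Look up line N (1-based).  On success, store its start in *LINE
   and its length without the newline in *LEN, and return 1.  Return
   0 if the content has no such line.  The scan for new lines resumes
   from the last recorded line, not from the start of the content.  */
static int
line_cache_get (struct line_cache *c, size_t n,
		const char **line, size_t *len)
{
  assert (c != NULL && n >= 1 && n <= MAX_RECORDS);
  while (c->nlines < n)
    {
      if (c->nlines == 0)
	return 0;
      size_t last = c->line_start[c->nlines - 1];
      const char *nl
	= (const char *) memchr (c->data + last, '\n', c->size - last);
      if (nl == NULL)
	return 0;
      size_t next = (size_t) (nl - c->data) + 1;
      if (next >= c->size)
	return 0;
      c->line_start[c->nlines++] = next;
    }
  size_t start = c->line_start[n - 1];
  const char *nl
    = (const char *) memchr (c->data + start, '\n', c->size - start);
  *line = c->data + start;
  *len = nl ? (size_t) (nl - *line) : c->size - start;
  return 1;
}
```

Asking for line N + 1 right after line N then costs one scan of a
single line.  The sketch records every line up to its cap, whereas
the log above only says that a number of positions are kept.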

libcpp/ChangeLog:

    * include/line-map.h (linemap_get_file_highest_location): Declare
    new function.
    * line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

    * input.h (location_get_source_line): Take an additional line_size
    parameter.
    (void diagnostics_file_cache_fini): Declare new function.
    * input.c (struct fcache): New type.
    (fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
    New static constants.
    (diagnostic_file_cache_init, total_lines_num)
    (lookup_file_in_cache_tab, evicted_cache_tab_entry)
    (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
    (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
    (get_next_line, read_next_line, goto_next_line, read_line_num):
    New static function definitions.
    (diagnostic_file_cache_fini): New function.
    (location_get_source_line): Take an additional output line_len
    parameter.  Re-write using lookup_or_add_file_to_cache_tab and
    read_line_num.
    * diagnostic.c (diagnostic_finish): Call
    diagnostic_file_cache_fini.
    (adjust_line): Take an additional input parameter for the length
    of the line, rather than calculating it with strlen.
    (diagnostic_show_locus): Adjust the use of
    location_get_source_line and adjust_line with respect to their new
    signature.  While displaying a line now, do not stop at the first
    null byte.  Rather, display the zero byte as a space and keep
    going until we reach the size of the line.
    * Makefile.in: Add vec.o to OBJS-libcommon.

gcc/testsuite/ChangeLog:

    * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

Signed-off-by: Dodji Seketeli <dodji@seketeli.org>

Added:
    trunk/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/diagnostic.c
    trunk/gcc/diagnostic.h
    trunk/gcc/input.c
    trunk/gcc/input.h
    trunk/gcc/testsuite/ChangeLog
    trunk/libcpp/ChangeLog
    trunk/libcpp/include/line-map.h
    trunk/libcpp/line-map.c

