public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Dodji Seketeli <dodji@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>,
	       "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Date: Mon, 04 Nov 2013 11:52:00 -0000	[thread overview]
Message-ID: <87r4awtmnx.fsf@redhat.com> (raw)
In-Reply-To: <20131031173649.GW27813@tucnak.zalov.cz> (Jakub Jelinek's message	of "Thu, 31 Oct 2013 18:36:49 +0100")

Jakub Jelinek <jakub@redhat.com> writes:

> I think even as a fallback is the patch too expensive.
> I'd say best would be to write some getline API compatible function
> and just use it, using fread on say fixed size buffer.

OK, thanks for the insight.  I have just used the getline_fallback
function you proposed, slightly amended to use the memory allocation
routines commonly used in gcc and renamed into get_line, with a
hopefully complete comment explaining where this function comes from
etc.

[...]

> A slight complication is what to do on mingw/cygwin and other DOS or
> Mac style line ending environments, no idea what fgets exactly does
> there.

Actually, I think that even fgets just deals with '\n'.  The reason,
from what I gathered being that on windows, we fopen the input file in
text mode; and in that mode, the \r\n is transformed into just \n.

Apparently OSX is compatible with '\n' too.  Someone corrects me if I am
saying non-sense here.

So the patch below is what I am bootstrapping at the moment.

OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk?

> BTW, we probably want to do something with the speed of the caret
> diagnostics too, right now it opens the file again for each single line
> to be printed in caret diagnostics and reads all lines until the right one,
> so imagine how fast is printing of many warnings on almost adjacent lines
> near the end of many megabytes long file.
> Perhaps we could remember the last file we've opened for caret diagnostics,
> don't fclose the file right away but only if a new request is for a
> different file, perhaps keep some vector of line start offsets (say starting
> byte of every 100th line or similar) and also remember the last read line
> offset, so if a new request is for the same file, but higher line than last,
> we can just keep getlineing, and if it is smaller line than last, we look up
> the offset of the line / 100, fseek to it and just getline only modulo 100
> lines.  Maybe we should keep not just one, but 2 or 4 opened files as cache
> (again, with the starting line offsets vectors).

I like this idea.  I'll try and work on it.

And now the patch.

Cheers.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter by reference.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function to
	compute the size of the line returned by fgets, rather than using
	strlen.  Ensure that the buffer is initially zeroed; ensure that
	when growing the buffer too.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 ++--
 gcc/input.c                                        | 104 ++++++++++++++++++---
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 103 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..0ca7081 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..be60039 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,44 +87,120 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
+/* This function reads a line that might contain zero byte value.  The
+   function returns the number of bytes read.  Note that this function
+   has been adapted from the a combination of geline() and
+   _IO_getdelim() from the GNU C library.  It's been duplicated here
+   because the getline() function is not present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
+{
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
+
+  if (*lineptr == NULL || *n == 0)
+    {
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
+      if (*lineptr == NULL)
+	return -1;
+    }
+
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
+
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf) + 1;
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
+	{
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  if (new_lineptr == NULL)
+	    return -1;
+	  *lineptr = new_lineptr;
+	  *n = needed;
+	}
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      if (t != NULL)
+	break;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
+    }
+  (*lineptr)[cur_len] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  */
 static const char *
-read_line (FILE *file)
+read_line (FILE *file, int& line_length)
 {
   static char *string;
-  static size_t string_len;
+  static size_t string_len, cur_len;
   size_t pos = 0;
   char *ptr;
 
   if (!string_len)
     {
       string_len = 200;
-      string = XNEWVEC (char, string_len);
+      string = XCNEWVEC (char, string_len);
     }
+  else
+    memset (string, 0, string_len);
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  ptr = string;
+  cur_len = string_len;
+  while (size_t len = get_line (&ptr, &cur_len, file))
     {
-      size_t len = strlen (string + pos);
-
-      if (string[pos + len - 1] == '\n')
+      if (ptr[len - 1] == '\n')
 	{
-	  string[pos + len - 1] = 0;
+	  ptr[len - 1] = 0;
+	  line_length = len;
 	  return string;
 	}
       pos += len;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
-    }
-      
+      ptr = string + pos;
+      cur_len = string_len - pos;
+     }
+
+  line_length = pos ? string_len : 0;
   return pos ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN contains the actual length of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int& line_len)
 {
   const char *buffer;
   int lines = 1;
@@ -132,7 +208,7 @@ location_get_source_line (expanded_location xloc)
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
+  while ((buffer = read_line (stream, line_len)) && lines < xloc.line)
     lines++;
 
   fclose (stream);
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..79b3a10 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int& line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
		Dodji

  reply	other threads:[~2013-11-04 11:46 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-31 14:48 Bernd Edlinger
2013-10-31 15:06 ` Jakub Jelinek
2013-10-31 15:19   ` Dodji Seketeli
2013-10-31 18:26     ` Jakub Jelinek
2013-11-04 11:52       ` Dodji Seketeli [this message]
2013-11-04 11:59         ` Jakub Jelinek
2013-11-04 15:42           ` Dodji Seketeli
2013-11-05  0:10             ` Bernd Edlinger
2013-11-05  9:50               ` Dodji Seketeli
2013-11-05 11:19                 ` Bernd Edlinger
2013-11-05 11:43                   ` Dodji Seketeli
2013-11-06 22:27                 ` Bernd Edlinger
2013-11-04 12:06         ` Bernd Edlinger
2013-11-04 12:15           ` Jakub Jelinek
2013-11-04 12:32             ` Bernd Edlinger
2013-11-04 15:21           ` Dodji Seketeli
2013-11-11 10:49       ` Dodji Seketeli
2013-11-11 14:35         ` Jakub Jelinek
2013-11-11 17:13           ` Dodji Seketeli
2013-11-12 16:42             ` Dodji Seketeli
2013-11-13  5:10               ` Bernd Edlinger
2013-11-13  9:40                 ` Dodji Seketeli
2013-11-13  9:43                   ` Bernd Edlinger
2013-11-13  9:49                     ` Dodji Seketeli
2013-11-13  9:49                     ` Dodji Seketeli
2013-11-13  9:51               ` Jakub Jelinek
2013-11-14 15:12                 ` Dodji Seketeli
2013-12-09 20:11                   ` Tom Tromey
2014-01-21 12:28                   ` Bernd Edlinger
2014-01-22  8:16                     ` Dodji Seketeli
2014-01-23 17:12                       ` Jakub Jelinek
2014-01-24  2:58                         ` Bernd Edlinger
2014-01-24  7:53                         ` Dodji Seketeli
2014-01-24 15:05                       ` Markus Trippelsdorf
2014-01-24 15:41                         ` Dodji Seketeli
2014-01-24 15:44                           ` Jakub Jelinek
2014-01-24 16:09                             ` Dodji Seketeli
2014-01-24 16:13                               ` Jakub Jelinek
2014-01-24 23:02                               ` Markus Trippelsdorf
2014-01-24 23:20                                 ` Markus Trippelsdorf
2014-01-28 13:20                                   ` Dodji Seketeli
2014-01-28 13:23                                     ` Dodji Seketeli
2014-01-28 18:40                                       ` H.J. Lu
2014-01-29 11:28                                         ` Dodji Seketeli
  -- strict thread matches above, loose matches on Subject: below --
2013-10-31 13:45 Dodji Seketeli
2013-10-31 17:30 ` Manuel López-Ibáñez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r4awtmnx.fsf@redhat.com \
    --to=dodji@redhat.com \
    --cc=bernd.edlinger@hotmail.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).