public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
@ 2013-10-31 14:48 Bernd Edlinger
  2013-10-31 15:06 ` Jakub Jelinek
  0 siblings, 1 reply; 46+ messages in thread
From: Bernd Edlinger @ 2013-10-31 14:48 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches, Jakub Jelinek

Hi,

if you want to read zero-chars, why don't you simply use fgetc,
optionally replacing '\0' with ' ' in read_line?

Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 14:48 [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals Bernd Edlinger
@ 2013-10-31 15:06 ` Jakub Jelinek
  2013-10-31 15:19   ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2013-10-31 15:06 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Dodji Seketeli, gcc-patches

On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
> if you want to read zero-chars, why don't you simply use fgetc,
> optionally replacing '\0' with ' ' in read_line?

Because it is too slow?

getline(3) would be much better for this purpose, though of course
it is a GNU extension in glibc and so we'd need some fallback, which
very well could be the fgetc or something similar.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 15:06 ` Jakub Jelinek
@ 2013-10-31 15:19   ` Dodji Seketeli
  2013-10-31 18:26     ` Jakub Jelinek
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-10-31 15:19 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
>> if you want to read zero-chars, why don't you simply use fgetc,
>> optionally replacing '\0' with ' ' in read_line?
>
> Because it is too slow?
>
> getline(3) would be much better for this purpose, though of course
> it is a GNU extension in glibc and so we'd need some fallback, which
> very well could be the fgetc or something similar.

So would getline (+ the current patch as a fallback) be acceptable?

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 15:19   ` Dodji Seketeli
@ 2013-10-31 18:26     ` Jakub Jelinek
  2013-11-04 11:52       ` Dodji Seketeli
  2013-11-11 10:49       ` Dodji Seketeli
  0 siblings, 2 replies; 46+ messages in thread
From: Jakub Jelinek @ 2013-10-31 18:26 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: Bernd Edlinger, gcc-patches

On Thu, Oct 31, 2013 at 04:00:01PM +0100, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> 
> > On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
> >> if you want to read zero-chars, why don't you simply use fgetc,
> >> optionally replacing '\0' with ' ' in read_line?
> >
> > Because it is too slow?
> >
> > getline(3) would be much better for this purpose, though of course
> > it is a GNU extension in glibc and so we'd need some fallback, which
> > very well could be the fgetc or something similar.
> 
> So would getline (+ the current patch as a fallback) be acceptable?

I think even as a fallback is the patch too expensive.
I'd say best would be to write some getline API compatible function
and just use it, using fread on say fixed size buffer (4KB or similar),
then for the number of characters returned by fread that were stored
into that buffer look for the line terminator there and allocate/copy
to the dynamically allocated buffer.  A slight complication is what to do
on mingw/cygwin and other DOS or Mac style line ending environments,
no idea what fgets exactly does there.  But, ignoring the DOS/Mac style line
endings, it would be roughly (partially from glibc iogetdelim.c).

ssize_t
getline_fallback (char **lineptr, size_t *n, FILE *fp)
{
  ssize_t cur_len = 0, len;
  char buf[16384];

  if (lineptr == NULL || n == NULL)
    return -1;

  if (*lineptr == NULL || *n == 0)
    {
      *n = 120;
      *lineptr = (char *) malloc (*n);
      if (*lineptr == NULL)
	return -1;
    }

  len = fread (buf, 1, sizeof buf, fp);
  if (ferror (fp))
    return -1;

  for (;;)
    {
      size_t needed;
      char *t = memchr (buf, '\n', len);
      if (t != NULL)
	len = (t - buf) + 1;
      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
	return -1;
      needed = cur_len + len + 1;
      if (needed > *n)
	{
	  char *new_lineptr;
	  if (needed < 2 * *n)
	    needed = 2 * *n;
	  new_lineptr = realloc (*lineptr, needed);
	  if (new_lineptr == NULL)
	    return -1;
	  *lineptr = new_lineptr;
	  *n = needed;
	}
      memcpy (*lineptr + cur_len, buf, len);
      cur_len += len;
      if (t != NULL)
	break;
      len = fread (buf, 1, sizeof buf, fp);
      if (ferror (fp))
	return -1;
      if (len == 0)
	break;
    }
  (*lineptr)[cur_len] = '\0';
  return cur_len;
}

For the DOS/Mac style line endings, you probably want to look at what
exactly does libcpp do with them.

BTW, we probably want to do something with the speed of the caret
diagnostics too, right now it opens the file again for each single line
to be printed in caret diagnostics and reads all lines until the right one,
so imagine how fast is printing of many warnings on almost adjacent lines
near the end of many megabytes long file.
Perhaps we could remember the last file we've opened for caret diagnostics,
don't fclose the file right away but only if a new request is for a
different file, perhaps keep some vector of line start offsets (say starting
byte of every 100th line or similar) and also remember the last read line
offset, so if a new request is for the same file, but higher line than last,
we can just keep getlineing, and if it is smaller line than last, we look up
the offset of the line / 100, fseek to it and just getline only modulo 100
lines.  Maybe we should keep not just one, but 2 or 4 opened files as cache
(again, with the starting line offsets vectors).

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 18:26     ` Jakub Jelinek
@ 2013-11-04 11:52       ` Dodji Seketeli
  2013-11-04 11:59         ` Jakub Jelinek
  2013-11-04 12:06         ` Bernd Edlinger
  2013-11-11 10:49       ` Dodji Seketeli
  1 sibling, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-04 11:52 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

> I think even as a fallback is the patch too expensive.
> I'd say best would be to write some getline API compatible function
> and just use it, using fread on say fixed size buffer.

OK, thanks for the insight.  I have just used the getline_fallback
function you proposed, slightly amended to use the memory allocation
routines commonly used in gcc and renamed into get_line, with a
hopefully complete comment explaining where this function comes from
etc.

[...]

> A slight complication is what to do on mingw/cygwin and other DOS or
> Mac style line ending environments, no idea what fgets exactly does
> there.

Actually, I think that even fgets just deals with '\n'.  The reason,
from what I gathered being that on windows, we fopen the input file in
text mode; and in that mode, the \r\n is transformed into just \n.

Apparently OSX is compatible with '\n' too.  Someone corrects me if I am
saying non-sense here.

So the patch below is what I am bootstrapping at the moment.

OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk?

> BTW, we probably want to do something with the speed of the caret
> diagnostics too, right now it opens the file again for each single line
> to be printed in caret diagnostics and reads all lines until the right one,
> so imagine how fast is printing of many warnings on almost adjacent lines
> near the end of many megabytes long file.
> Perhaps we could remember the last file we've opened for caret diagnostics,
> don't fclose the file right away but only if a new request is for a
> different file, perhaps keep some vector of line start offsets (say starting
> byte of every 100th line or similar) and also remember the last read line
> offset, so if a new request is for the same file, but higher line than last,
> we can just keep getlineing, and if it is smaller line than last, we look up
> the offset of the line / 100, fseek to it and just getline only modulo 100
> lines.  Maybe we should keep not just one, but 2 or 4 opened files as cache
> (again, with the starting line offsets vectors).

I like this idea.  I'll try and work on it.

And now the patch.

Cheers.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter by reference.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function to
	compute the size of the line returned by fgets, rather than using
	strlen.  Ensure that the buffer is initially zeroed; ensure that
	when growing the buffer too.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 ++--
 gcc/input.c                                        | 104 ++++++++++++++++++---
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 103 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..0ca7081 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..be60039 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,44 +87,120 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
+/* This function reads a line that might contain zero byte value.  The
+   function returns the number of bytes read.  Note that this function
+   has been adapted from the a combination of geline() and
+   _IO_getdelim() from the GNU C library.  It's been duplicated here
+   because the getline() function is not present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
+{
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
+
+  if (*lineptr == NULL || *n == 0)
+    {
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
+      if (*lineptr == NULL)
+	return -1;
+    }
+
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
+
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf) + 1;
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
+	{
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  if (new_lineptr == NULL)
+	    return -1;
+	  *lineptr = new_lineptr;
+	  *n = needed;
+	}
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      if (t != NULL)
+	break;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
+    }
+  (*lineptr)[cur_len] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  */
 static const char *
-read_line (FILE *file)
+read_line (FILE *file, int& line_length)
 {
   static char *string;
-  static size_t string_len;
+  static size_t string_len, cur_len;
   size_t pos = 0;
   char *ptr;
 
   if (!string_len)
     {
       string_len = 200;
-      string = XNEWVEC (char, string_len);
+      string = XCNEWVEC (char, string_len);
     }
+  else
+    memset (string, 0, string_len);
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  ptr = string;
+  cur_len = string_len;
+  while (size_t len = get_line (&ptr, &cur_len, file))
     {
-      size_t len = strlen (string + pos);
-
-      if (string[pos + len - 1] == '\n')
+      if (ptr[len - 1] == '\n')
 	{
-	  string[pos + len - 1] = 0;
+	  ptr[len - 1] = 0;
+	  line_length = len;
 	  return string;
 	}
       pos += len;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
-    }
-      
+      ptr = string + pos;
+      cur_len = string_len - pos;
+     }
+
+  line_length = pos ? string_len : 0;
   return pos ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN contains the actual length of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int& line_len)
 {
   const char *buffer;
   int lines = 1;
@@ -132,7 +208,7 @@ location_get_source_line (expanded_location xloc)
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
+  while ((buffer = read_line (stream, line_len)) && lines < xloc.line)
     lines++;
 
   fclose (stream);
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..79b3a10 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int& line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 11:52       ` Dodji Seketeli
@ 2013-11-04 11:59         ` Jakub Jelinek
  2013-11-04 15:42           ` Dodji Seketeli
  2013-11-04 12:06         ` Bernd Edlinger
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2013-11-04 11:59 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: Bernd Edlinger, gcc-patches

On Mon, Nov 04, 2013 at 12:46:10PM +0100, Dodji Seketeli wrote:
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
>     MAX_WIDTH by some margin, then adjust the start of the line such
>     that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
>     margin is either 10 characters or the difference between the column
> -   and the length of the line, whatever is smaller.  */
> +   and the length of the line, whatever is smaller.  The length of
> +   LINE is given by LINE_WIDTH.  */
>  static const char *
> -adjust_line (const char *line, int max_width, int *column_p)
> +adjust_line (const char *line, int line_width,
> +	     int max_width, int *column_p)

Eventually, I think using int for sizes is just a ticking bomb, what if
somebody uses > 2GB long lines?  Surely, on 32-bit hosts we are unlikely to
handle it, but why couldn't 64-bit host handle it?  Column info maybe bogus
in there, sure, but at least we should not crash or overflow buffers on it
;).  Anyway, not something needed to be fixed right now, but in the future
it would be nicer to use size_t and/or ssize_t here.

>  {
>    int right_margin = 10;
> -  int line_width = strlen (line);
>    int column = *column_p;
>  
>    right_margin = MIN (line_width - column, right_margin);
> @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
>  		       const diagnostic_info *diagnostic)
>  {
>    const char *line;
> +  int line_width;
>    char *buffer;
>    expanded_location s;
>    int max_width;
> @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
>  
>    context->last_location = diagnostic->location;
>    s = expand_location_to_spelling_point (diagnostic->location);
> -  line = location_get_source_line (s);
> +  line = location_get_source_line (s, line_width);

I think richi didn't like C++ reference arguments to be used that way (and
perhaps guidelines don't either), because it isn't immediately obvious
that line_width is modified by the call.  Can you change it to a pointer
argument instead and pass &line_width?
> +      *lineptr = XNEWVEC (char, *n);
> +      if (*lineptr == NULL)
> +	return -1;

XNEWVEC or XRESIZEVEC will never return NULL though, so it doesn't have
to be tested.  Though, the question is if that is what we want, caret
diagnostics should be optional, if we can't print it, we just won't.
So perhaps using malloc/realloc here would be better?

>  
>  const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> +			  int& line_len)

Ditto.

Otherwise, LGTM.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 11:52       ` Dodji Seketeli
  2013-11-04 11:59         ` Jakub Jelinek
@ 2013-11-04 12:06         ` Bernd Edlinger
  2013-11-04 12:15           ` Jakub Jelinek
  2013-11-04 15:21           ` Dodji Seketeli
  1 sibling, 2 replies; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-04 12:06 UTC (permalink / raw)
  To: Dodji Seketeli, Jakub Jelinek; +Cc: gcc-patches

Hi,


I see another "read_line" at gcov.c, which seems to be a copy.

Maybe this should be changed too?

What do you think?

Bernd.

On Mon, 4 Nov 2013 12:46:10, Dodji Seketeli wrote:
>
> Jakub Jelinek <jakub@redhat.com> writes:
>
>> I think even as a fallback is the patch too expensive.
>> I'd say best would be to write some getline API compatible function
>> and just use it, using fread on say fixed size buffer.
>
> OK, thanks for the insight. I have just used the getline_fallback
> function you proposed, slightly amended to use the memory allocation
> routines commonly used in gcc and renamed into get_line, with a
> hopefully complete comment explaining where this function comes from
> etc.
>
> [...]
>
>> A slight complication is what to do on mingw/cygwin and other DOS or
>> Mac style line ending environments, no idea what fgets exactly does
>> there.
>
> Actually, I think that even fgets just deals with '\n'. The reason,
> from what I gathered being that on windows, we fopen the input file in
> text mode; and in that mode, the \r\n is transformed into just \n.
>
> Apparently OSX is compatible with '\n' too. Someone corrects me if I am
> saying non-sense here.
>
> So the patch below is what I am bootstrapping at the moment.
>
> OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk?
>
>> BTW, we probably want to do something with the speed of the caret
>> diagnostics too, right now it opens the file again for each single line
>> to be printed in caret diagnostics and reads all lines until the right one,
>> so imagine how fast is printing of many warnings on almost adjacent lines
>> near the end of many megabytes long file.
>> Perhaps we could remember the last file we've opened for caret diagnostics,
>> don't fclose the file right away but only if a new request is for a
>> different file, perhaps keep some vector of line start offsets (say starting
>> byte of every 100th line or similar) and also remember the last read line
>> offset, so if a new request is for the same file, but higher line than last,
>> we can just keep getlineing, and if it is smaller line than last, we look up
>> the offset of the line / 100, fseek to it and just getline only modulo 100
>> lines. Maybe we should keep not just one, but 2 or 4 opened files as cache
>> (again, with the starting line offsets vectors).
>
> I like this idea. I'll try and work on it.
>
> And now the patch.
>
> Cheers.
>
> gcc/ChangeLog:
>
> * input.h (location_get_source_line): Take an additional line_size
> parameter by reference.
> * input.c (get_line): New static function definition.
> (read_line): Take an additional line_length output parameter to be
> set to the size of the line. Use the new get_line function to
> compute the size of the line returned by fgets, rather than using
> strlen. Ensure that the buffer is initially zeroed; ensure that
> when growing the buffer too.
> (location_get_source_line): Take an additional output line_len
> parameter. Update the use of read_line to pass it the line_len
> parameter.
> * diagnostic.c (adjust_line): Take an additional input parameter
> for the length of the line, rather than calculating it with
> strlen.
> (diagnostic_show_locus): Adjust the use of
> location_get_source_line and adjust_line with respect to their new
> signature. While displaying a line now, do not stop at the first
> null byte. Rather, display the zero byte as a space and keep
> going until we reach the size of the line.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
> ---
> gcc/diagnostic.c | 17 ++--
> gcc/input.c | 104 ++++++++++++++++++---
> gcc/input.h | 3 +-
> .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes
> 4 files changed, 103 insertions(+), 21 deletions(-)
> create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
>
> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index 36094a1..0ca7081 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
> MAX_WIDTH by some margin, then adjust the start of the line such
> that the COLUMN is smaller than MAX_WIDTH minus the margin. The
> margin is either 10 characters or the difference between the column
> - and the length of the line, whatever is smaller. */
> + and the length of the line, whatever is smaller. The length of
> + LINE is given by LINE_WIDTH. */
> static const char *
> -adjust_line (const char *line, int max_width, int *column_p)
> +adjust_line (const char *line, int line_width,
> + int max_width, int *column_p)
> {
> int right_margin = 10;
> - int line_width = strlen (line);
> int column = *column_p;
>
> right_margin = MIN (line_width - column, right_margin);
> @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
> const diagnostic_info *diagnostic)
> {
> const char *line;
> + int line_width;
> char *buffer;
> expanded_location s;
> int max_width;
> @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
>
> context->last_location = diagnostic->location;
> s = expand_location_to_spelling_point (diagnostic->location);
> - line = location_get_source_line (s);
> + line = location_get_source_line (s, line_width);
> if (line == NULL)
> return;
>
> max_width = context->caret_max_width;
> - line = adjust_line (line, max_width, &(s.column));
> + line = adjust_line (line, line_width, max_width, &(s.column));
>
> pp_newline (context->printer);
> saved_prefix = pp_get_prefix (context->printer);
> pp_set_prefix (context->printer, NULL);
> pp_space (context->printer);
> - while (max_width> 0 && *line != '\0')
> + while (max_width> 0 && line_width> 0)
> {
> char c = *line == '\t' ? ' ' : *line;
> + if (c == '\0')
> + c = ' ';
> pp_character (context->printer, c);
> max_width--;
> + line_width--;
> line++;
> }
> pp_newline (context->printer);
> diff --git a/gcc/input.c b/gcc/input.c
> index a141a92..be60039 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -87,44 +87,120 @@ expand_location_1 (source_location loc,
> return xloc;
> }
>
> -/* Reads one line from file into a static buffer. */
> +/* This function reads a line that might contain zero byte value. The
> + function returns the number of bytes read. Note that this function
> + has been adapted from the a combination of geline() and
> + _IO_getdelim() from the GNU C library. It's been duplicated here
> + because the getline() function is not present on all platforms.
> +
> + LINEPTR points to a buffer that is to contain the line read.
> +
> + N points to the size of the the LINEPTR buffer.
> +
> + FP points to the file to consider. */
> +
> +static ssize_t
> +get_line (char **lineptr, size_t *n, FILE *fp)
> +{
> + ssize_t cur_len = 0, len;
> + char buf[16384];
> +
> + if (lineptr == NULL || n == NULL)
> + return -1;
> +
> + if (*lineptr == NULL || *n == 0)
> + {
> + *n = 120;
> + *lineptr = XNEWVEC (char, *n);
> + if (*lineptr == NULL)
> + return -1;
> + }
> +
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> +
> + for (;;)
> + {
> + size_t needed;
> + char *t = (char*) memchr (buf, '\n', len);
> + if (t != NULL) len = (t - buf) + 1;
> + if (__builtin_expect (len>= SSIZE_MAX - cur_len, 0))
> + return -1;
> + needed = cur_len + len + 1;
> + if (needed> *n)
> + {
> + char *new_lineptr;
> + if (needed < 2 * *n)
> + needed = 2 * *n;
> + new_lineptr = XRESIZEVEC (char, *lineptr, needed);
> + if (new_lineptr == NULL)
> + return -1;
> + *lineptr = new_lineptr;
> + *n = needed;
> + }
> + memcpy (*lineptr + cur_len, buf, len);
> + cur_len += len;
> + if (t != NULL)
> + break;
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> + if (len == 0)
> + break;
> + }
> + (*lineptr)[cur_len] = '\0';
> + return cur_len;
> +}
> +
> +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set
> + by this function to the length of the returned line. Note that the
> + returned line can contain several zero bytes. */
> static const char *
> -read_line (FILE *file)
> +read_line (FILE *file, int& line_length)
> {
> static char *string;
> - static size_t string_len;
> + static size_t string_len, cur_len;
> size_t pos = 0;
> char *ptr;
>
> if (!string_len)
> {
> string_len = 200;
> - string = XNEWVEC (char, string_len);
> + string = XCNEWVEC (char, string_len);
> }
> + else
> + memset (string, 0, string_len);
>
> - while ((ptr = fgets (string + pos, string_len - pos, file)))
> + ptr = string;
> + cur_len = string_len;
> + while (size_t len = get_line (&ptr, &cur_len, file))
> {
> - size_t len = strlen (string + pos);
> -
> - if (string[pos + len - 1] == '\n')
> + if (ptr[len - 1] == '\n')
> {
> - string[pos + len - 1] = 0;
> + ptr[len - 1] = 0;
> + line_length = len;
> return string;
> }
> pos += len;
> string = XRESIZEVEC (char, string, string_len * 2);
> string_len *= 2;
> - }
> -
> + ptr = string + pos;
> + cur_len = string_len - pos;
> + }
> +
> + line_length = pos ? string_len : 0;
> return pos ? string : NULL;
> }
>
> /* Return the physical source line that corresponds to xloc in a
> buffer that is statically allocated. The newline is replaced by
> - the null character. */
> + the null character. Note that the line can contain several null
> + characters, so LINE_LEN contains the actual length of the line. */
>
> const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> + int& line_len)
> {
> const char *buffer;
> int lines = 1;
> @@ -132,7 +208,7 @@ location_get_source_line (expanded_location xloc)
> if (!stream)
> return NULL;
>
> - while ((buffer = read_line (stream)) && lines < xloc.line)
> + while ((buffer = read_line (stream, line_len)) && lines < xloc.line)
> lines++;
>
> fclose (stream);
> diff --git a/gcc/input.h b/gcc/input.h
> index 8fdc7b2..79b3a10 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
> < RESERVED_LOCATION_COUNT) ? 1 : -1];
>
> extern expanded_location expand_location (source_location);
> -extern const char *location_get_source_line (expanded_location xloc);
> +extern const char *location_get_source_line (expanded_location xloc,
> + int& line_size);
> extern expanded_location expand_location_to_spelling_point (source_location);
> extern source_location expansion_point_location_if_in_system_header (source_location);
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
> GIT binary patch
> literal 240
> zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
> UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi
>
> literal 0
> HcmV?d00001
>
> --
> Dodji 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 12:06         ` Bernd Edlinger
@ 2013-11-04 12:15           ` Jakub Jelinek
  2013-11-04 12:32             ` Bernd Edlinger
  2013-11-04 15:21           ` Dodji Seketeli
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2013-11-04 12:15 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Dodji Seketeli, gcc-patches

On Mon, Nov 04, 2013 at 12:59:49PM +0100, Bernd Edlinger wrote:
> I see another "read_line" at gcov.c, which seems to be a copy.

Copy of what?  gcov.c read_line hardly can be allowed to fail because out of
mem unlike this one for caret diagnostics.
Though, surely, this one could be somewhat adjusted so that it really
doesn't use a temporary buffer but reads directly into the initially
malloced, then realloced, buffer.  But, if we want it to eventually switch
to caching the caret diagnostics, it won't be possible/desirable anymore.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 12:15           ` Jakub Jelinek
@ 2013-11-04 12:32             ` Bernd Edlinger
  0 siblings, 0 replies; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-04 12:32 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Dodji Seketeli, gcc-patches

>
> On Mon, Nov 04, 2013 at 12:59:49PM +0100, Bernd Edlinger wrote:
>> I see another "read_line" at gcov.c, which seems to be a copy.
>
> Copy of what? gcov.c read_line hardly can be allowed to fail because out of
> mem unlike this one for caret diagnostics.
> Though, surely, this one could be somewhat adjusted so that it really
> doesn't use a temporary buffer but reads directly into the initially
> malloced, then realloced, buffer. But, if we want it to eventually switch
> to caching the caret diagnostics, it won't be possible/desirable anymore.
>
> Jakub

gcov.c and input.c currently both have a static function "read_line"
they are currently 100% in sync. Both _can_ fail, if the file gets
deleted or modified while the function executes.

If gcov.c crashes in that event, I'd call it a bug.

Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 12:06         ` Bernd Edlinger
  2013-11-04 12:15           ` Jakub Jelinek
@ 2013-11-04 15:21           ` Dodji Seketeli
  1 sibling, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-04 15:21 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Jakub Jelinek, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:


> I see another "read_line" at gcov.c, which seems to be a copy.
>
> Maybe this should be changed too?

I have seen it as well.

I'd rather have the patch be reviewed and everthing, and only then
propose to share the implementation with the gcov module.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 11:59         ` Jakub Jelinek
@ 2013-11-04 15:42           ` Dodji Seketeli
  2013-11-05  0:10             ` Bernd Edlinger
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-04 15:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:

[...]

> Eventually, I think using int for sizes is just a ticking bomb, what if
> somebody uses > 2GB long lines?  Surely, on 32-bit hosts we are unlikely to
> handle it, but why couldn't 64-bit host handle it?  Column info maybe bogus
> in there, sure, but at least we should not crash or overflow buffers on it
> ;).  Anyway, not something needed to be fixed right now, but in the future
> it would be nicer to use size_t and/or ssize_t here.

Yes.  I initially tried to use size_t but found that I'd need to modify
several other places to shutdown many warning because these places where
using int :-(.  So I felt that would be a battle for later.

But I am adding this to my TODO.  I'll send a patch later that
changes this to size_t then, and adjusts the other places that need it
as well.

[...]

>>    context->last_location = diagnostic->location;
>>    s = expand_location_to_spelling_point (diagnostic->location);
>> -  line = location_get_source_line (s);
>> +  line = location_get_source_line (s, line_width);
>
> I think richi didn't like C++ reference arguments to be used that way (and
> perhaps guidelines don't either), because it isn't immediately obvious
> that line_width is modified by the call.  Can you change it to a pointer
> argument instead and pass &line_width?

Sure.  I have done the change in the patch below.  Sorry for this
reflex.  I tend to use pointers like these only in places where we can
allow them to be NULL.

> XNEWVEC or XRESIZEVEC will never return NULL though, so it doesn't have
> to be tested.  Though, the question is if that is what we want, caret
> diagnostics should be optional, if we can't print it, we just won't.

Hmmh.  This particular bug was noticed because of the explicit OOM
message displayed by XNEWVEC/XRESIZEVEC; otherwise, I bet this could
have just felt through the crack for a little longer.  So I'd say let's
just use XNEWVEC/XRESIZEVEC and remove the test, as you first
suggested.  The caret diagnostics functionality as a whole can be
disabled with -fno-diagnostic-show-caret.


[...]

> Otherwise, LGTM.

Thanks.

So here is the patch that bootstraps.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter by reference.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function to
	compute the size of the line returned by fgets, rather than using
	strlen.  Ensure that the buffer is initially zeroed; ensure that
	when growing the buffer too.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 ++--
 gcc/input.c                                        | 100 ++++++++++++++++++---
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 99 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..0ca7081 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..2ee7882 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,44 +87,116 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
+/* This function reads a line that might contain bytes whose value is
+   zero.  It returns the number of bytes read.  Note that this
+   function has been adapted from getline() and _IO_getdelim() GNU C
+   library functions.  It's been duplicated here because the getline()
+   function is not necessarily present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
+{
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
+
+  if (*lineptr == NULL || *n == 0)
+    {
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
+    }
+
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
+
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf) + 1;
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
+	{
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  *lineptr = new_lineptr;
+	  *n = needed;
+	}
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      if (t != NULL)
+	break;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
+    }
+  (*lineptr)[cur_len] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  */
 static const char *
-read_line (FILE *file)
+read_line (FILE *file, int *line_length)
 {
   static char *string;
-  static size_t string_len;
+  static size_t string_len, cur_len;
   size_t pos = 0;
   char *ptr;
 
   if (!string_len)
     {
       string_len = 200;
-      string = XNEWVEC (char, string_len);
+      string = XCNEWVEC (char, string_len);
     }
+  else
+    memset (string, 0, string_len);
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  ptr = string;
+  cur_len = string_len;
+  while (size_t len = get_line (&ptr, &cur_len, file))
     {
-      size_t len = strlen (string + pos);
-
-      if (string[pos + len - 1] == '\n')
+      if (ptr[len - 1] == '\n')
 	{
-	  string[pos + len - 1] = 0;
+	  ptr[len - 1] = 0;
+	  *line_length = len;
 	  return string;
 	}
       pos += len;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
-    }
-      
+      ptr = string + pos;
+      cur_len = string_len - pos;
+     }
+
+  *line_length = pos ? string_len : 0;
   return pos ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN contains the actual length of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int& line_len)
 {
   const char *buffer;
   int lines = 1;
@@ -132,7 +204,7 @@ location_get_source_line (expanded_location xloc)
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
+  while ((buffer = read_line (stream, &line_len)) && lines < xloc.line)
     lines++;
 
   fclose (stream);
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..79b3a10 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int& line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-04 15:42           ` Dodji Seketeli
@ 2013-11-05  0:10             ` Bernd Edlinger
  2013-11-05  9:50               ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-05  0:10 UTC (permalink / raw)
  To: Dodji Seketeli, Jakub Jelinek; +Cc: gcc-patches

Hi,

On Mon, 4 Nov 2013 16:40:38, Dodji Seketeli wrote:
> +static ssize_t
> +get_line (char **lineptr, size_t *n, FILE *fp)
> +{
> + ssize_t cur_len = 0, len;
> + char buf[16384];
> +
> + if (lineptr == NULL || n == NULL)
> + return -1;
> +
> + if (*lineptr == NULL || *n == 0)
> + {
> + *n = 120;
> + *lineptr = XNEWVEC (char, *n);
> + }
> +
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> +
> + for (;;)
> + {
> + size_t needed;
> + char *t = (char*) memchr (buf, '\n', len);
> + if (t != NULL) len = (t - buf) + 1;
> + if (__builtin_expect (len>= SSIZE_MAX - cur_len, 0))
> + return -1;
> + needed = cur_len + len + 1;
> + if (needed> *n)
> + {
> + char *new_lineptr;
> + if (needed < 2 * *n)
> + needed = 2 * *n;
> + new_lineptr = XRESIZEVEC (char, *lineptr, needed);
> + *lineptr = new_lineptr;
> + *n = needed;
> + }
> + memcpy (*lineptr + cur_len, buf, len);
> + cur_len += len;
> + if (t != NULL)
> + break;
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> + if (len == 0)
> + break;
> + }
> + (*lineptr)[cur_len] = '\0';
> + return cur_len;
> +}
> +
> +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set
> + by this function to the length of the returned line. Note that the
> + returned line can contain several zero bytes. */
> static const char *
> -read_line (FILE *file)
> +read_line (FILE *file, int *line_length)
> {
> static char *string;
> - static size_t string_len;
> + static size_t string_len, cur_len;
> size_t pos = 0;
> char *ptr;
>
> if (!string_len)
> {
> string_len = 200;
> - string = XNEWVEC (char, string_len);
> + string = XCNEWVEC (char, string_len);
> }
> + else
> + memset (string, 0, string_len);

Is this memset still necessary?

If the previous invocation of read_line already had read
some characters of the following line, how is that information
recovered? How is it detected if another file is to be read this time?

>
> - while ((ptr = fgets (string + pos, string_len - pos, file)))
> + ptr = string;
> + cur_len = string_len;
> + while (size_t len = get_line (&ptr, &cur_len, file))
> {
> - size_t len = strlen (string + pos);
> -
> - if (string[pos + len - 1] == '\n')
> + if (ptr[len - 1] == '\n')
> {
> - string[pos + len - 1] = 0;
> + ptr[len - 1] = 0;
> + *line_length = len;
> return string;
> }
> pos += len;
> string = XRESIZEVEC (char, string, string_len * 2);
> string_len *= 2;
> - }
> -
> + ptr = string + pos;

If "ptr" is passed to get_line it will try to reallocate it,
which must fail, right?

Maybe, this line of code is unreachable?

Who is responsible for reallocating "string" get_line or read_line?

> + cur_len = string_len - pos;
> + }
> +
> + *line_length = pos ? string_len : 0;
> return pos ? string : NULL;
> }
>
> /* Return the physical source line that corresponds to xloc in a
> buffer that is statically allocated. The newline is replaced by
> - the null character. */
> + the null character. Note that the line can contain several null
> + characters, so LINE_LEN contains the actual length of the line. */
>
> const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> + int& line_len)
> {
> const char *buffer;
> int lines = 1;
> @@ -132,7 +204,7 @@ location_get_source_line (expanded_location xloc)
> if (!stream)
> return NULL;
>
> - while ((buffer = read_line (stream)) && lines < xloc.line)
> + while ((buffer = read_line (stream, &line_len)) && lines < xloc.line)
> lines++;
>
> fclose (stream);


Regards
Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-05  0:10             ` Bernd Edlinger
@ 2013-11-05  9:50               ` Dodji Seketeli
  2013-11-05 11:19                 ` Bernd Edlinger
  2013-11-06 22:27                 ` Bernd Edlinger
  0 siblings, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-05  9:50 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, Manuel López-Ibáñez, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

[...]

>> if (!string_len)
>> {
>> string_len = 200;
>> - string = XNEWVEC (char, string_len);
>> + string = XCNEWVEC (char, string_len);
>> }
>> + else
>> + memset (string, 0, string_len);
>
> Is this memset still necessary?

Of course not ...

[...]

> If "ptr" is passed to get_line it will try to reallocate it,
> which must fail, right?
>
> Maybe, this line of code is unreachable?
>
> Who is responsible for reallocating "string" get_line or read_line?

Correct, these are real concerns.


I am wondering what I was thinking.  Actually, I think read_line should
almost just call get_line now.  Like what is done in the new version of
the patch below; basically if there is a line to return, read_line just
gets it (the static buffer containing the line) from get_line and
returns it, otherwise the static buffer containing the last read line is
left untouched and read_line returns a NULL constant.


I guess this resolves the valid concern that you raised below:

    > If the previous invocation of read_line already had read some
    > characters of the following line, how is that information
    > recovered? How is it detected if another file is to be read this
    > time?

Thank you very much for this thorough review.

Here is the updated patch that I am bootstrapping:

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function do the
	actual line reading.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 ++--
 gcc/input.c                                        | 111 ++++++++++++++++-----
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 97 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..e0c5d9d 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..9526d88 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,53 +87,110 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* This function reads a line that might contain bytes whose value is
+   zero.  It returns the number of bytes read.  The 'end-of-line'
+   character found at the end of the line is not contained in the
+   returned buffer.  Note that this function has been adapted from
+   getline() and _IO_getdelim() GNU C library functions.  It's been
+   duplicated here because the getline() function is not necessarily
+   present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
 
-  if (!string_len)
+  if (*lineptr == NULL || *n == 0)
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
     }
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
-    {
-      size_t len = strlen (string + pos);
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
 
-      if (string[pos + len - 1] == '\n')
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf);
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  *lineptr = new_lineptr;
+	  *n = needed;
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      if (t != NULL)
+	break;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
     }
-      
-  return pos ? string : NULL;
+
+  if (cur_len)
+    (*lineptr)[cur_len] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  Also note that the
+   returned string is allocated in static storage that is going to be
+   re-used by subsequent invocations of read_line.  */
+static const char *
+read_line (FILE *file, int *line_length)
+{
+  static char *string;
+  static size_t string_len;
+
+  *line_length = get_line (&string, &string_len, file);
+  return *line_length ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
+  const char *buffer = NULL, *ptr;
+  int lines = 0, len = 0;
   FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  while ((ptr = read_line (stream, &len)) && lines < xloc.line)
+    {
+      buffer = ptr;
+      lines++;
+      if (line_len)
+	*line_len = len;
+    }
 
   fclose (stream);
   return buffer;
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..128e28c 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-05  9:50               ` Dodji Seketeli
@ 2013-11-05 11:19                 ` Bernd Edlinger
  2013-11-05 11:43                   ` Dodji Seketeli
  2013-11-06 22:27                 ` Bernd Edlinger
  1 sibling, 1 reply; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-05 11:19 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, Manuel López-Ibáñez, gcc-patches

Hi,

you're welcome.
Just one more thought on the design.

If you want to have at least a chance to survive something like:


dd if=/dev/zero of=test.c bs=10240 count=10000000

gcc -Wall test.c


Then you should change the implementation of read_line to
_not_ returning something like 100GB of zeros.

IMHO it would be nice to limit lines returned to 10.000 bytes,
maybe add "..." or "<line too long>" if the limit is reached.
Just skip over-sized bytes until the newline is consumed, to make
the line numbers consistent.

And maybe it would make the life of read_line's callers lots easier
if the zero-chars are silently replaced with spaces in the returned
line buffer.

That would allow to keep the current interface, and somehow
reduce the complexity of this patch.

What do you think?

Regards
Bernd.



On Tue, 5 Nov 2013 10:41:19, Dodji Seketeli wrote:
>
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>
> [...]
>
>>> if (!string_len)
>>> {
>>> string_len = 200;
>>> - string = XNEWVEC (char, string_len);
>>> + string = XCNEWVEC (char, string_len);
>>> }
>>> + else
>>> + memset (string, 0, string_len);
>>
>> Is this memset still necessary?
>
> Of course not ...
>
> [...]
>
>> If "ptr" is passed to get_line it will try to reallocate it,
>> which must fail, right?
>>
>> Maybe, this line of code is unreachable?
>>
>> Who is responsible for reallocating "string" get_line or read_line?
>
> Correct, these are real concerns.
>
>
> I am wondering what I was thinking. Actually, I think read_line should
> almost just call get_line now. Like what is done in the new version of
> the patch below; basically if there is a line to return, read_line just
> gets it (the static buffer containing the line) from get_line and
> returns it, otherwise the static buffer containing the last read line is
> left untouched and read_line returns a NULL constant.
>
>
> I guess this resolves the valid concern that you raised below:
>
>> If the previous invocation of read_line already had read some
>> characters of the following line, how is that information
>> recovered? How is it detected if another file is to be read this
>> time?
>
> Thank you very much for this thorough review.
>
> Here is the updated patch that I am bootstrapping:
>
> gcc/ChangeLog:
>
> * input.h (location_get_source_line): Take an additional line_size
> parameter.
> * input.c (get_line): New static function definition.
> (read_line): Take an additional line_length output parameter to be
> set to the size of the line. Use the new get_line function do the
> actual line reading.
> (location_get_source_line): Take an additional output line_len
> parameter. Update the use of read_line to pass it the line_len
> parameter.
> * diagnostic.c (adjust_line): Take an additional input parameter
> for the length of the line, rather than calculating it with
> strlen.
> (diagnostic_show_locus): Adjust the use of
> location_get_source_line and adjust_line with respect to their new
> signature. While displaying a line now, do not stop at the first
> null byte. Rather, display the zero byte as a space and keep
> going until we reach the size of the line.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
> ---
> gcc/diagnostic.c | 17 ++--
> gcc/input.c | 111 ++++++++++++++++-----
> gcc/input.h | 3 +-
> .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes
> 4 files changed, 97 insertions(+), 34 deletions(-)
> create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
>
> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index 36094a1..e0c5d9d 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
> MAX_WIDTH by some margin, then adjust the start of the line such
> that the COLUMN is smaller than MAX_WIDTH minus the margin. The
> margin is either 10 characters or the difference between the column
> - and the length of the line, whatever is smaller. */
> + and the length of the line, whatever is smaller. The length of
> + LINE is given by LINE_WIDTH. */
> static const char *
> -adjust_line (const char *line, int max_width, int *column_p)
> +adjust_line (const char *line, int line_width,
> + int max_width, int *column_p)
> {
> int right_margin = 10;
> - int line_width = strlen (line);
> int column = *column_p;
>
> right_margin = MIN (line_width - column, right_margin);
> @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
> const diagnostic_info *diagnostic)
> {
> const char *line;
> + int line_width;
> char *buffer;
> expanded_location s;
> int max_width;
> @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
>
> context->last_location = diagnostic->location;
> s = expand_location_to_spelling_point (diagnostic->location);
> - line = location_get_source_line (s);
> + line = location_get_source_line (s, &line_width);
> if (line == NULL)
> return;
>
> max_width = context->caret_max_width;
> - line = adjust_line (line, max_width, &(s.column));
> + line = adjust_line (line, line_width, max_width, &(s.column));
>
> pp_newline (context->printer);
> saved_prefix = pp_get_prefix (context->printer);
> pp_set_prefix (context->printer, NULL);
> pp_space (context->printer);
> - while (max_width> 0 && *line != '\0')
> + while (max_width> 0 && line_width> 0)
> {
> char c = *line == '\t' ? ' ' : *line;
> + if (c == '\0')
> + c = ' ';
> pp_character (context->printer, c);
> max_width--;
> + line_width--;
> line++;
> }
> pp_newline (context->printer);
> diff --git a/gcc/input.c b/gcc/input.c
> index a141a92..9526d88 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -87,53 +87,110 @@ expand_location_1 (source_location loc,
> return xloc;
> }
>
> -/* Reads one line from file into a static buffer. */
> -static const char *
> -read_line (FILE *file)
> +/* This function reads a line that might contain bytes whose value is
> + zero. It returns the number of bytes read. The 'end-of-line'
> + character found at the end of the line is not contained in the
> + returned buffer. Note that this function has been adapted from
> + getline() and _IO_getdelim() GNU C library functions. It's been
> + duplicated here because the getline() function is not necessarily
> + present on all platforms.
> +
> + LINEPTR points to a buffer that is to contain the line read.
> +
> + N points to the size of the the LINEPTR buffer.
> +
> + FP points to the file to consider. */
> +
> +static ssize_t
> +get_line (char **lineptr, size_t *n, FILE *fp)
> {
> - static char *string;
> - static size_t string_len;
> - size_t pos = 0;
> - char *ptr;
> + ssize_t cur_len = 0, len;
> + char buf[16384];
> +
> + if (lineptr == NULL || n == NULL)
> + return -1;
>
> - if (!string_len)
> + if (*lineptr == NULL || *n == 0)
> {
> - string_len = 200;
> - string = XNEWVEC (char, string_len);
> + *n = 120;
> + *lineptr = XNEWVEC (char, *n);
> }
>
> - while ((ptr = fgets (string + pos, string_len - pos, file)))
> - {
> - size_t len = strlen (string + pos);
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
>
> - if (string[pos + len - 1] == '\n')
> + for (;;)
> + {
> + size_t needed;
> + char *t = (char*) memchr (buf, '\n', len);
> + if (t != NULL) len = (t - buf);
> + if (__builtin_expect (len>= SSIZE_MAX - cur_len, 0))
> + return -1;
> + needed = cur_len + len + 1;
> + if (needed> *n)
> {
> - string[pos + len - 1] = 0;
> - return string;
> + char *new_lineptr;
> + if (needed < 2 * *n)
> + needed = 2 * *n;
> + new_lineptr = XRESIZEVEC (char, *lineptr, needed);
> + *lineptr = new_lineptr;
> + *n = needed;
> }
> - pos += len;
> - string = XRESIZEVEC (char, string, string_len * 2);
> - string_len *= 2;
> + memcpy (*lineptr + cur_len, buf, len);
> + cur_len += len;
> + if (t != NULL)
> + break;
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> + if (len == 0)
> + break;
> }
> -
> - return pos ? string : NULL;
> +
> + if (cur_len)
> + (*lineptr)[cur_len] = '\0';
> + return cur_len;
> +}
> +
> +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set
> + by this function to the length of the returned line. Note that the
> + returned line can contain several zero bytes. Also note that the
> + returned string is allocated in static storage that is going to be
> + re-used by subsequent invocations of read_line. */
> +static const char *
> +read_line (FILE *file, int *line_length)
> +{
> + static char *string;
> + static size_t string_len;
> +
> + *line_length = get_line (&string, &string_len, file);
> + return *line_length ? string : NULL;
> }
>
> /* Return the physical source line that corresponds to xloc in a
> buffer that is statically allocated. The newline is replaced by
> - the null character. */
> + the null character. Note that the line can contain several null
> + characters, so LINE_LEN, if non-null, points to the actual length
> + of the line. */
>
> const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> + int *line_len)
> {
> - const char *buffer;
> - int lines = 1;
> + const char *buffer = NULL, *ptr;
> + int lines = 0, len = 0;
> FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
> if (!stream)
> return NULL;
>
> - while ((buffer = read_line (stream)) && lines < xloc.line)
> - lines++;
> + while ((ptr = read_line (stream, &len)) && lines < xloc.line)
> + {
> + buffer = ptr;
> + lines++;
> + if (line_len)
> + *line_len = len;
> + }
>
> fclose (stream);
> return buffer;
> diff --git a/gcc/input.h b/gcc/input.h
> index 8fdc7b2..128e28c 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
> < RESERVED_LOCATION_COUNT) ? 1 : -1];
>
> extern expanded_location expand_location (source_location);
> -extern const char *location_get_source_line (expanded_location xloc);
> +extern const char *location_get_source_line (expanded_location xloc,
> + int *line_size);
> extern expanded_location expand_location_to_spelling_point (source_location);
> extern source_location expansion_point_location_if_in_system_header (source_location);
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
> GIT binary patch
> literal 240
> zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
> UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi
>
> literal 0
> HcmV?d00001
>
> --
> Dodji 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-05 11:19                 ` Bernd Edlinger
@ 2013-11-05 11:43                   ` Dodji Seketeli
  0 siblings, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-05 11:43 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, Manuel López-Ibáñez, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> If you want to have at least a chance to survive something like:
>
>
> dd if=/dev/zero of=test.c bs=10240 count=10000000
>
> gcc -Wall test.c
>
>
> Then you should change the implementation of read_line to
> _not_ returning something like 100GB of zeros.

I'd say that in that case, we'd rather just die in an OOM condition and
be done with it.  Otherwise, If fear that read_line might become too
slow; you'd have to detect that the content is just zeros, for instance.

> IMHO it would be nice to limit lines returned to 10.000 bytes,
> maybe add "..." or "<line too long>" if the limit is reached.

In general, setting a limit for pathological cases like this is a good
idea, I think.  But that seems a bit ouf of the scope of this particular
bug fix; we'd need to e.g, define a new command line argument to extend
that limit if need be, for instance.  If people really feel strongly
about this I can propose a later patch to set a limit in get_line and
define a command like argument that would override that parameter.

> And maybe it would make the life of read_line's callers lots easier
> if the zero-chars are silently replaced with spaces in the returned
> line buffer.

As speed seemed to be a concern (even if, in my opinion, we are dealing
with diagnostics that are being emitted when the compilation has been
halted anyway, so we shouldn't be too concerned, unless we are talking
about pathological cases), I think that read_line should be fast by
default.  If a particular caller doesn't want to see the zeros (and thus
is ready to pay the speed price) then it can replace the zeros with
white space.  Otherwise, let's have read_line be as fast as possible.

Also keep in mind that in subsequent patches, read_line might be re-used
by e.g, gcov in nominal contexts where we don't have zeros in the middle
of the line.  In that case, speed can be a concern.

Thanks for the helpful thoughts.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-05  9:50               ` Dodji Seketeli
  2013-11-05 11:19                 ` Bernd Edlinger
@ 2013-11-06 22:27                 ` Bernd Edlinger
  1 sibling, 0 replies; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-06 22:27 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, Manuel López-Ibáñez, gcc-patches

Sorry Dodji,

I still do not see how this is supposed to work:

If the previous invocation of get_line already had read some
characters of the following line(s), how is that information
recovered?

I see it is copied behind lineptr[cur_len].
But when get_line is re-entered, cur_len is set to zero again.
and all that contents up to 16K are forgotten. Right?

If an empty line of just a new-line is read, the return value
of get_line is 0, and string is "". But the return value of
read_line is NULL in that case. Now the function
location_get_source_line will leave the while loop.
But there may be more lines, propably not just empty ones?

How did you test your patch?


Regards
Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 18:26     ` Jakub Jelinek
  2013-11-04 11:52       ` Dodji Seketeli
@ 2013-11-11 10:49       ` Dodji Seketeli
  2013-11-11 14:35         ` Jakub Jelinek
  1 sibling, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-11 10:49 UTC (permalink / raw)
  To: GCC Patches
  Cc: Jakub Jelinek, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

Hello,

As it appeared that concerns about the speed of
location_get_source_line were as present as the need of just fixing
this bug, I have conflated the two concerns in a new attempt below,
trying to address the points you guys have raised during the previous
reviews.

The patch below introduces a cache for the data read from the file we
want to emit caret diagnostic for.  In that cache it stashes the bytes
read from the file as well as a number of positions of line delimiters
that we encountered while reading the file.  It keeps a number of the
last file caches in memory in case location_get_source_line is later
asked for lines from the same file.

To avoid exploding the memory consumption, the number line delimiter
position saved is fixed (100).  So if a file is smaller than 100 lines
all of its line positions can be saved.  That is, if
location_get_source_line is first asked to return line 20, all the
position of the lines encountered since the beginning of the file --
up to line 20 -- are going to be saved in the cache.  Next time, if
location_get_source_line is asked to return line 10, as the position
of the beginning/end of line 10 is saved in the cache, returning that
line is fast.  If it's asked to return line 25, it will have to start
reading from line 20, not from the beginning of the file.

If the file is bigger than 100, then the patch just saves 100 line
positions.  To evenly spread the line position saved, it needs to know
the total number lines of the file.  Luckily we can usually get this
information from the line map subsystem (from libcpp).  The patch thus
adds a new entry point in the line map
(linemap_get_file_highest_location) that gives the greatest
source_location seen for a given file and uses that to decide what
line position to save in the cache.

The speed gain I have seen is variable, depending on the size (in
number of quasi adjacent lines) of the diagnostics, but on some
pathological cases I have seen, it can divide the time spend
displaying the diagnostics by two ore more.  I had to add hackery in
the code to measure this, unfortunately :-(

The patch doesn't try to reuse the same infrastructure for gcov for
now.  I am letting that for later now when I have more time.

Bootstrapped on x86_64-unknown-linux-gnu against trunk.

PS: To ease the review (especially for Tom Tromey who I am CC-ing
because of the new entry point in the line map sub-system) I am
attaching the cover letter of the patch itself that does the analysis
of the initial bug.  Sorry to the other addressees of this message for
the redundancy.

Thanks.


---------------------------------------->8<-------------------------------
In this problem report, the compiler is fed a (bogus) translation unit
in which some literals contain bytes whose value is zero.  The
preprocessor detects that and proceeds to emit diagnostics for that
king of bogus literals.  But then when the diagnostics machinery
re-reads the input file again to display the bogus literals with a
caret, it attempts to calculate the length of each of the lines it got
using fgets.  The line length calculation is done using strlen.  But
that doesn't work well when the content of the line can have several
zero bytes.  The result is that the read_line never sees the end of
the line because strlen repeatedly reports that the line ends before
the end-of-line character; so read_line thinks its buffer for reading
the line is too small; it thus increases the buffer, leading to a huge
memory consumption and disaster.

Here is what this patch does.

location_get_source_line is modified to return the length of a source
line that can now contain bytes with zero value.
diagnostic_show_locus() is then modified to consider that a line can
have characters of value zero, and so just shows a white space when
instructed to display one of these characters.

Additionally location_get_source_line is modified to avoid re-reading
each and every line from the beginning of the file until it reaches
the line number N that it is instructed to get; this was leading to
annoying quadratic behaviour when reading adjacent lines near the end
of (big) files.  So a cache is now associated to the file opened in
text mode.  When the content of the file is read, that content is
stashed in the file cache.  That file cache is searched for line
delimiters.  A number of line positions are saved in the cache and a
number of file caches are kept in memory.  That way when
location_get_source_line is asked to read line N + 1, it just has to
start reading from line N that it has already read.
---------------------------------------->8<-------------------------------

And now the real patch.

libcpp/ChangeLog:

	* include/line-map.h (linemap_get_file_highest_location): Declare
	new function.
	* line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	(void diagnostics_file_cache_fini): Declare new function.
	* input.c (struct fcache): New type.
	(fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
	New static constants.
	(diagnostic_file_cache_init, lookup_file_in_cache_tab)
	(add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
	(needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
	(get_next_line, read_next_line, goto_next_line, read_line_num):
	New static function definitions.
	(diagnostic_file_cache_fini): New function.
	(location_get_source_line): Take an additional output line_len
	parameter.  Re-write using lookup_or_add_file_to_cache_tab and
	read_line_num.
	* diagnostic.c (diagnostic_finish): Call
	diagnostic_file_cache_fini.
	(adjust_line): Take an additional input parameter for the length
	of the line, rather than calculating it with strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.
	* Makefile.in: Add vec.o to OBJS-libcommon

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/Makefile.in                                    |   2 +-
 gcc/diagnostic.c                                   |  19 +-
 gcc/diagnostic.h                                   |   1 +
 gcc/input.c                                        | 549 +++++++++++++++++++--
 gcc/input.h                                        |   5 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 libcpp/include/line-map.h                          |   8 +
 libcpp/line-map.c                                  |  40 ++
 8 files changed, 585 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 49285e5..50c2482 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1469,7 +1469,7 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o vec.o input.o version.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..6c83f03 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context)
 		     progname);
       pp_newline_and_flush (context->printer);
     }
+
+  diagnostic_file_cache_fini ();
 }
 
 /* Initialize DIAGNOSTIC, where the message MSG has already been
@@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index cb38d37..3f30e06 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -291,6 +291,7 @@ void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
 void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
 void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
 
+void diagnostic_file_cache_fini (void);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..06a0f35 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -22,6 +22,75 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "intl.h"
 #include "input.h"
+#include "vec.h"
+
+/* This is a cache used by get_next_line to store the content of a
+   file to be searched for file lines.  */
+struct fcache
+{
+  /* These are information used to store a line boundary.  */
+  struct line_info
+  {
+    /* The line number.  It starts from 1.  */
+    size_t line_num;
+
+    /* The position (byte count) of the beginning of the line,
+       relative to the file data pointer.  This starts at zero.  */
+    size_t start_pos;
+
+    /* The position (byte count) the the last byte of the line.  This
+       normally points to the '\n' character, or to one byte after the
+       last byte of the file, if the file doesn't contain a '\n'
+       character.  */
+    size_t end_pos;
+
+    line_info (size_t l, size_t s, size_t e)
+      : line_num (l), start_pos (s), end_pos (e)
+    {}
+
+    line_info ()
+      :line_num (0), start_pos (0), end_pos (0)
+    {}
+  };
+
+  const char *file_path;
+
+  FILE *fp;
+
+  /* This points to the content of the file that we've read so
+     far.  */
+  char *data;
+
+  /*  The size of the DATA array above.*/
+  size_t size;
+
+  /* The number of bytes read from the underlying file so far.  This
+     must be less (or equal) than SIZE above.  */
+  size_t nb_read;
+
+  /* The index of the beginning of the current line.  */
+  size_t line_start_idx;
+
+  /* The number of the previous line read.  This starts at 1.  Zero
+     means we've read no line so far.  */
+  size_t line_num;
+
+  /* This is the total number of lines of the current file.  At the
+     moment, we try to get this information from the line map.  */
+  size_t total_lines;
+
+  /* This is a record of the beginning and end of the lines we've seen
+     while reading the file.  This is useful to avoid walking the data
+     from the beginning when we are asked to read a line that is
+     before LINE_START_IDX above.  Note that the maximum size of this
+     record is fcache_line_record_size, so that the memory consumption
+     doesn't explode.  We thus scale total_lines down to
+     fcache_line_record_size.  */
+  vec<line_info, va_heap> line_record;
+
+  fcache ();
+  ~fcache ();
+};
 
 /* Current position in real source file.  */
 
@@ -29,6 +98,11 @@ location_t input_location;
 
 struct line_maps *line_table;
 
+static fcache *fcache_tab;
+static const size_t fcache_tab_size = 16;
+static const size_t fcache_buffer_size = 4 * 1024;
+static const size_t fcache_line_record_size = 100;
+
 /* Expand the source location LOC into a human readable location.  If
    LOC resolves to a builtin location, the file name of the readable
    location is set to the string "<built-in>". If EXPANSION_POINT_P is
@@ -87,56 +161,469 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* Initialize the set of cache used for files accessed by caret
+   diagnostic.  */
+
+static void
+diagnostic_file_cache_init (void)
+{
+  if (fcache_tab == NULL)
+    fcache_tab = new fcache[fcache_tab_size];
+}
+
+/* Free the ressources used by the set of cache used for files accessed
+   by caret diagnostic.  */
+
+void
+diagnostic_file_cache_fini (void)
+{
+  if (fcache_tab)
+    {
+      delete [] (fcache_tab);
+      fcache_tab = NULL;
+    }
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  Return the found cached file, or NULL if no
+   cached file was found.  */
+
+static fcache*
+lookup_file_in_cache_tab (const char *file_path)
+{
+  diagnostic_file_cache_init ();
+
+  for (unsigned i = 0; i < fcache_tab_size; ++i)
+    if (fcache_tab[i].file_path && !strcmp (fcache_tab[i].file_path, file_path))
+      return &fcache_tab[i];
+
+  return NULL;
+}
+
+/* Create the cache used for the content of a given file to be
+   accessed by caret diagnostic.  This cache is added to an array of
+   cache and can be retrieved by lookup_file_in_cache_tab.  This
+   function returns the created cache.  Note that only the last
+   fcache_tab_size files are cached.  */
+
+static fcache*
+add_file_to_cache_tab (const char *file_path)
+{
+  static size_t idx;
+  fcache *r;
+
+  FILE *fp = fopen (file_path, "r");
+  if (ferror (fp))
+    {
+      fclose (fp);
+      return NULL;
+    }
+
+  r = &fcache_tab[idx];
+  r->file_path = file_path;
+  if (r->fp)
+    fclose (r->fp);
+  r->fp = fp;
+  r->nb_read = 0;
+  r->line_start_idx = 0;
+  r->line_num = 0;
+  r->line_record.truncate (0);
+
+  source_location l = 0;
+  if (linemap_get_file_highest_location (line_table, file_path, &l))
+    {
+      gcc_assert (l >= RESERVED_LOCATION_COUNT);
+      expanded_location xloc = expand_location (l);
+      r->total_lines = xloc.line;
+    }
+
+  /* Increment idx, modulo fcache_tab_size.  */
+  ++idx;
+  idx = idx % fcache_tab_size;
+
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  If no cached file was found, create a new cache
+   for this file, add it to the array of cached file and return
+   it.  */
+
+static fcache*
+lookup_or_add_file_to_cache_tab (const char *file_path)
+{
+  fcache * r = lookup_file_in_cache_tab (file_path);
+  if (r ==  NULL)
+    r = add_file_to_cache_tab (file_path);
+  return r;
+}
+
+/* Default constructor for a cache of file used by caret
+   diagnostic.  */
+
+fcache::fcache ()
+: file_path (NULL), fp (NULL), data (0),
+  size (0), nb_read (0), line_start_idx (0), line_num (0),
+  total_lines (0)
+{
+  line_record.create (0);
+}
+
+/* Destructor for a cache of file used by caret diagnostic.  */
+
+fcache::~fcache ()
+{
+  if (fp)
+    {
+      fclose (fp);
+      fp = NULL;
+    }
+  if (data)
+    {
+      XDELETEVEC (data);
+      data = 0;
+    }
+  line_record.release ();
+}
+
+/* Returns TRUE iff the cache would need to be filled with data coming
+   from the file.  That is, either the cache is empty or full or the
+   current line is empty.  Note that if the cache is full, it would
+   need to be extended and filled again.  */
+
+static bool
+needs_read (fcache *c)
+{
+  return (c->nb_read == 0
+	  || c->nb_read == c->size
+	  || (c->line_start_idx >= c->nb_read - 1));
+}
+
+/*  Return TRUE iff the cache is full and thus needs to be
+    extended.  */
+
+static bool
+needs_grow (fcache *c)
+{
+  return c->nb_read == c->size;
+}
+
+/* Grow the cache if it needs to be extended.  */
+
+static void
+maybe_grow (fcache *c)
+{
+  if (!needs_grow (c))
+    return;
+
+  size_t size = c->size == 0 ? fcache_buffer_size : c->size * 2;
+  c->data = XRESIZEVEC (char, c->data, size + 1);
+  c->size = size;
+}
+
+/*  Read more data into the cache.  Extends the cache if need be.
+    Returns TRUE iff new data could be read.  */
+
+static bool
+read_data (fcache *c)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  if (feof (c->fp) || ferror (c->fp))
+    return false;
+
+  maybe_grow (c);
+
+  char * from = c->data + c->nb_read;
+  size_t to_read = c->size - c->nb_read;
+  if (ferror (c->fp))
+    return false;
+  size_t nb_read = fread (from, 1, to_read, c->fp);
+
+  if (ferror (c->fp))
+    return false;
 
-  if (!string_len)
+  c->nb_read += nb_read;
+  return !!nb_read;
+}
+
+/* Read new data iff the cache needs to be filled with more data
+   coming from the file FP.  Return TRUE iff the cache was filled with
+   mode data.  */
+
+static bool
+maybe_read_data (fcache *c)
+{
+  if (!needs_read (c))
+    return false;
+  return read_data (c);
+}
+
+/* Read a new line from file FP, using C as a cache for the data
+   coming from the file.  Upon successful completion, *LINE is set to
+   the beginning of the line found.  Space for that line has been
+   allocated in the cache thus *LINE has the same life time as C.
+   This function returns the length of the line, including the
+   terminal '\n' character.  Note that subsequent calls to
+   get_next_line return the next lines of the file and might overwrite
+   the content of *LINE.  */
+
+static ssize_t
+get_next_line (fcache *c, char **line)
+{
+  /* Fill the cache with data to process.  */
+  maybe_read_data (c);
+
+  size_t remaining_size = c->nb_read - c->line_start_idx;
+  if (remaining_size == 0)
+    /* There is no more data to process.  */
+    return 0;
+
+  char *line_start = c->data + c->line_start_idx;
+
+  char *next_line_start = NULL;
+  size_t line_len = 0;
+  char *line_end = (char *) memchr (line_start, '\n', remaining_size);
+  if (line_end == NULL)
+    {
+      /* We haven't found the end-of-line delimiter in the cache.
+	 Fill the cache with more data from the file and look for the
+	 '\n'.  */
+      while (maybe_read_data (c))
+	{
+	  line_start = c->data + c->line_start_idx;
+	  remaining_size = c->nb_read - c->line_start_idx;
+	  line_end = (char *) memchr (line_start, '\n', remaining_size);
+	  if (line_end != NULL)
+	    {
+	      next_line_start = line_end + 1;
+	      line_len = line_end - line_start + 1;
+	      break;
+	    }
+	}
+      if (line_end == NULL)
+	{
+	  /* We've loadded all the file into the cache and still no
+	     '\n'.  Let's say the line ends up at the byte after the
+	     last byte of the file.  */
+	  line_end = c->data + c->nb_read;
+	  line_len = c->nb_read - c->line_start_idx;
+	}
+    }
+  else
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      next_line_start = line_end + 1;
+      line_len = line_end - line_start + 1;;
     }
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  if (ferror (c->fp))
+    return -1;
+
+  /* At this point, we've found the end of the of line.  It either
+     points to the '\n' or to one byte after the last byte of the
+     file.  */
+  gcc_assert (line_end != NULL);
+
+  if (c->line_start_idx < c->nb_read)
+    *line = line_start;
+
+  gcc_assert (line_len > 0);
+
+  ++c->line_num;
+
+  /* Now update our line record so that re-reading lines from the
+     before c->line_start_idx is faster.  */
+  if (c->line_record.length () < fcache_line_record_size)
     {
-      size_t len = strlen (string + pos);
+      /* If the the file lines fits in the line record, we just record
+	 all its lines ...*/
+      if (c->total_lines <= fcache_line_record_size
+	  && c->line_num > c->line_record.length ())
+	c->line_record.safe_push (fcache::line_info (c->line_num,
+						 c->line_start_idx,
+						 line_end - c->data));
+      else if (c->total_lines > fcache_line_record_size)
+	{
+	  /* ... otherwise, we just scale total_lines down to
+	     (fcache_line_record_size lines.  */
+	  size_t n = (c->line_num * fcache_line_record_size) / c->total_lines;
+	  if (c->line_record.length () == 0
+	      || n >= c->line_record.length ())
+	    c->line_record.safe_push (fcache::line_info (c->line_num,
+						     c->line_start_idx,
+						     line_end - c->data));
+	}
+    }
+
+  /* Update c->line_start_idx so that it points to the next line to be
+     read.  */
+  if (next_line_start)
+    c->line_start_idx = next_line_start - c->data;
+  else
+    /* We didn't find any terminal '\n'.  Let's consider that the end
+       of line is the end of the data in the cache.  The next
+       invocation of get_next_line will either read more data from the
+       underlying file or return false early because we've reached the
+       end of the file.  */
+    c->line_start_idx = c->nb_read;
+
+  return line_len;
+}
+
+/* Reads the next line from FILE into *LINE.  If *LINE is too small
+   (or NULL) it is allocated (or extended) to have enough space to
+   containe the line.  *LINE_LENGTH must contain the size of the
+   initial*LINE buffer.  It's then updated by this function to the
+   actual length of the returned line.  Note that the returned line
+   can contain several zero bytes.  Also note that the returned string
+   is allocated in static storage that is going to be re-used by
+   subsequent invocations of read_line.  */
+
+static bool
+read_next_line (fcache *cache, char ** line, ssize_t *line_len)
+{
+  char *l = NULL;
+  ssize_t len = get_next_line (cache, &l);
+
+  if (len > 0)
+    {
+      if (*line == NULL)
+	{
+	  *line = XNEWVEC (char, len);
+	  *line_len = len;
+	}
+      else
+	if (*line_len < len)
+	  *line = XRESIZEVEC (char, *line, len);
+
+      memmove (*line, l, len);
+      (*line)[len - 1] = '\0';
+      *line_len = --len;
+      return true;
+    }
+
+  return false;
+}
 
-      if (string[pos + len - 1] == '\n')
+/* Consume the next bytes coming from the cache (or from its
+   underlying file if there are remaining unread bytes in the file)
+   until we reach the next end-of-line (or end-of-file).  There is no
+   copying from the cache involved.  Return TRUE upon successful
+   completion.  */
+
+static bool
+goto_next_line (fcache *cache)
+{
+  char *l = NULL;
+  ssize_t len = get_next_line (cache, &l);
+  return (len > 0 );
+}
+
+/* Read an arbitrary line number LINE_NUM from the file cached in C.
+   The line is copied into *LINE.  *LINE_LEN must have been set to the
+   length of *LINE.  If *LINE is too small (or NULL) it's extended (or
+   allocated) and *LINE_LEN is adjusted accordingly.  *LINE ends up
+   with a terminal zero byte and can contain additional zero bytes.
+   This function returns bool if a line was read.  */
+
+static bool
+read_line_num (fcache *c, size_t line_num,
+	       char ** line, ssize_t *line_len)
+{
+  gcc_assert (line_num > 0);
+
+  if (line_num <= c->line_num)
+    {
+      /* We've been asked to read lines that are before c->line_num.
+	 So lets use our line record (if it's not empty) to try to
+	 avoid re-reading the file from the beginning again.  */
+
+      if (c->line_record.is_empty ())
+	{
+	  c->line_start_idx = 0;
+	  c->line_num = 0;
+	}
+      else
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  fcache::line_info *i = NULL;
+	  if (c->total_lines <= fcache_line_record_size)
+	    {
+	      /*  Every line we've read has its start/end recorded
+		  here.  So this is going to be fast.  */
+	      gcc_assert (line_num <= c->total_lines);
+	      i = &c->line_record[line_num - 1];
+	      gcc_assert (i->line_num == line_num);
+	    }
+	  else
+	    {
+	      /*  So the file had more lines than our line record
+		  size.  Thus the number of lines we've recorded has
+		  been scaled down to fcache_line_reacord_size.  Let's
+		  pick the start/end of the recorded line that is
+		  closest to line_num.  */
+	      size_t n = line_num * fcache_line_record_size / c->total_lines;
+	      if (n < c->line_record.length ())
+		{
+		  i = &c->line_record[n];
+		  gcc_assert (i->line_num <= line_num);
+		}
+	    }
+
+	  if (i && i->line_num == line_num)
+	    {
+	      /* We have the start/end of the line.  Let's just copy
+		 it again and we are done.  */
+	      ssize_t len = i->end_pos - i->start_pos + 1;
+	      if (*line_len < len)
+		*line = XRESIZEVEC (char, *line, len);
+	      memmove (*line, c->data + i->start_pos, len);
+	      (*line)[len - 1] = '\0';
+	      *line_len = --len;
+	      return true;
+	    }
+
+	  if (i)
+	    {
+	      c->line_start_idx = i->start_pos;
+	      c->line_num = i->line_num - 1;
+	    }
+	  else
+	    {
+	      c->line_start_idx = 0;
+	      c->line_num = 0;
+	    }
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
     }
-      
-  return pos ? string : NULL;
+
+  /*  Let's walk from line c->line_num up to line_num - 1, without
+      copying any line.  */
+  while (c->line_num < line_num - 1)
+    if (!goto_next_line (c))
+      return false;
+
+  /* The line we want is the next one.  Let's read and copy it back to
+     the caller.  */
+  return read_next_line (c, line, line_len);
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
-  FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
-  if (!stream)
-    return NULL;
+  static char *buffer;
+  static ssize_t len;
+
+  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  bool read = read_line_num (c, xloc.line, &buffer, &len);
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  if (read && line_len)
+    *line_len = len;
 
-  fclose (stream);
-  return buffer;
+  return read ? buffer : NULL;
 }
 
 /* Expand the source location LOC into a human readable location.  If
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..c82023f 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
@@ -65,4 +66,6 @@ extern location_t input_location;
 
 void dump_line_table_statistics (void);
 
+void diagnostics_file_cache_fini (void);
+
 #endif
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index a0d6da1..3504fd6 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -756,6 +756,14 @@ struct linemap_stats
   long duplicated_macro_maps_locations_size;
 };
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+bool linemap_get_file_highest_location (struct line_maps * set,
+					const char *file_name,
+					source_location*LOC);
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 void linemap_get_statistics (struct line_maps *, struct linemap_stats *);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 2ad7ad2..3c0f74d 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1502,6 +1502,46 @@ linemap_dump_location (struct line_maps *set,
 	   path, from, l, c, s, (void*)map, e, loc, location);
 }
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+
+bool
+linemap_get_file_highest_location (struct line_maps *set,
+				   const char *file_name,
+				   source_location *loc)
+{
+  /* If the set is empty or no ordinary map has been created then
+     there is no file to look for ...  */
+  if (set == NULL || set->info_ordinary.used == 0)
+    return false;
+
+  /* Now look for the last ordinary map created for FILE_NAME.  */
+  int i;
+  for (i = set->info_ordinary.used - 1; i >= 0; --i)
+    {
+      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
+      if (fname && !strcmp (fname, file_name))
+	break;
+    }
+
+  if (i < 0)
+    return false;
+
+  /* The highest location for a given map is either the starting
+     location of the next map minus one, or -- if the map is the
+     latest one -- the highest location of the set.  */
+  source_location result;
+  if (i == (int) set->info_ordinary.used - 1)
+    result = set->highest_location;
+  else
+    result = set->info_ordinary.maps[i].start_location - 1;
+
+  *loc = result;
+  return true;
+}
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 
-- 
                        Dodji


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-11 10:49       ` Dodji Seketeli
@ 2013-11-11 14:35         ` Jakub Jelinek
  2013-11-11 17:13           ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2013-11-11 14:35 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

On Mon, Nov 11, 2013 at 11:19:21AM +0100, Dodji Seketeli wrote:
>  .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
>  libcpp/include/line-map.h                          |   8 +
>  libcpp/line-map.c                                  |  40 ++
>  8 files changed, 585 insertions(+), 39 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 49285e5..50c2482 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1469,7 +1469,7 @@ OBJS = \
>  
>  # Objects in libcommon.a, potentially used by all host binaries and with
>  # no target dependencies.
> -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
> +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o vec.o input.o version.o

Too long line?

> +      if (c == '\0')
> +	c = ' ';
>        pp_character (context->printer, c);

Does that match how libcpp counts the embedded '\0' character in column
computation?

> +    /* The position (byte count) the the last byte of the line.  This
> +       normally points to the '\n' character, or to one byte after the
> +       last byte of the file, if the file doesn't contain a '\n'
> +       character.  */
> +    size_t end_pos;

Does it really help to note this?  You can always just walk the line from
start_pos looking for '\n' or end of file.

> +static fcache*
> +add_file_to_cache_tab (const char *file_path)
> +{
> +  static size_t idx;
> +  fcache *r;
> +
> +  FILE *fp = fopen (file_path, "r");
> +  if (ferror (fp))
> +    {
> +      fclose (fp);
> +      return NULL;
> +    }
> +
> +  r = &fcache_tab[idx];

Wouldn't it be better to use some LRU algorithm to determine which
file to kick out of the cache?  Have some tick counter of cache lookups (or
creation) and assign the tick counter to the just created resp. used
cache entry, and remove the one with the smallest counter?
> +  fcache * r = lookup_file_in_cache_tab (file_path);
> +  if (r ==  NULL)

Formatting (no space after *, extra space after ==).

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-11 14:35         ` Jakub Jelinek
@ 2013-11-11 17:13           ` Dodji Seketeli
  2013-11-12 16:42             ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-11 17:13 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

Jakub Jelinek <jakub@redhat.com> writes:

>> -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
>> +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o vec.o input.o version.o
>
> Too long line?

Fixed in my local copy of the patch, thanks.

>
>> +      if (c == '\0')
>> +	c = ' ';
>>        pp_character (context->printer, c);
>
> Does that match how libcpp counts the embedded '\0' character in column
> computation?

Yes, I think so.  _cpp_lex_direct in libcpp/lex.c considers '\0' just
like a white space and so the column number is incremented when it's
encountered.

>> +    /* The position (byte count) the the last byte of the line.  This
>> +       normally points to the '\n' character, or to one byte after the
>> +       last byte of the file, if the file doesn't contain a '\n'
>> +       character.  */
>> +    size_t end_pos;
>
> Does it really help to note this?  You can always just walk the line from
> start_pos looking for '\n' or end of file.

Yes you are right, it's not strictly necessary.  But with that end_pos,
copying a line is even faster; no need of walking.  I thought the goal
was to avoid re-doing the work we've already done, as much as possible.

>
>> +static fcache*
>> +add_file_to_cache_tab (const char *file_path)
>> +{
>> +  static size_t idx;
>> +  fcache *r;
>> +
>> +  FILE *fp = fopen (file_path, "r");
>> +  if (ferror (fp))
>> +    {
>> +      fclose (fp);
>> +      return NULL;
>> +    }
>> +
>> +  r = &fcache_tab[idx];
>
> Wouldn't it be better to use some LRU algorithm to determine which
> file to kick out of the cache?  Have some tick counter of cache lookups (or
> creation) and assign the tick counter to the just created resp. used
> cache entry, and remove the one with the smallest counter?

Hehe, the LRU idea occurred to me too, but I dismissed the idea as
something probably over-engineered.  But now that you are mentioning it
I guess I should give it a try ;-) I'll post a patch about that later
then.

>> +  fcache * r = lookup_file_in_cache_tab (file_path);
>> +  if (r ==  NULL)
>
> Formatting (no space after *, extra space after ==).

Fixed in my local copy.  Thanks.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-11 17:13           ` Dodji Seketeli
@ 2013-11-12 16:42             ` Dodji Seketeli
  2013-11-13  5:10               ` Bernd Edlinger
  2013-11-13  9:51               ` Jakub Jelinek
  0 siblings, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-12 16:42 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

Hello,

Below is the updated patch amended to take your previous comments in
account.

In add_file_to_cache_tab the evicted cache array entry is the one that
was less used.

Incidentally I also fixed some thinkos and issued that I have seen in
the previous patch.

Bootstrapped on x86_64-unknown-linux-gnu against trunk.

libcpp/ChangeLog:

	* include/line-map.h (linemap_get_file_highest_location): Declare
	new function.
	* line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	(void diagnostics_file_cache_fini): Declare new function.
	* input.c (struct fcache): New type.
	(fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
	New static constants.
	(diagnostic_file_cache_init, total_lines_num)
	(lookup_file_in_cache_tab, evicted_cache_tab_entry)
	(add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
	(needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
	(get_next_line, read_next_line, goto_next_line, read_line_num):
	New static function definitions.
	(diagnostic_file_cache_fini): New function.
	(location_get_source_line): Take an additional output line_len
	parameter.  Re-write using lookup_or_add_file_to_cache_tab and
	read_line_num.
	* diagnostic.c (diagnostic_finish): Call
	diagnostic_file_cache_fini.
	(adjust_line): Take an additional input parameter for the length
	of the line, rather than calculating it with strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.
	* Makefile.in: Add vec.o to OBJS-libcommon

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/Makefile.in                                    |   3 +-
 gcc/diagnostic.c                                   |  19 +-
 gcc/diagnostic.h                                   |   1 +
 gcc/input.c                                        | 637 ++++++++++++++++++++-
 gcc/input.h                                        |   5 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 libcpp/include/line-map.h                          |   8 +
 libcpp/line-map.c                                  |  40 ++
 8 files changed, 674 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 49285e5..9fe9060 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1469,7 +1469,8 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \
+	vec.o  input.o version.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..6c83f03 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context)
 		     progname);
       pp_newline_and_flush (context->printer);
     }
+
+  diagnostic_file_cache_fini ();
 }
 
 /* Initialize DIAGNOSTIC, where the message MSG has already been
@@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index cb38d37..3f30e06 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -291,6 +291,7 @@ void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
 void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
 void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
 
+void diagnostic_file_cache_fini (void);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..5f67d40 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -22,6 +22,86 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "intl.h"
 #include "input.h"
+#include "vec.h"
+
+/* This is a cache used by get_next_line to store the content of a
+   file to be searched for file lines.  */
+struct fcache
+{
+  /* These are information used to store a line boundary.  */
+  struct line_info
+  {
+    /* The line number.  It starts from 1.  */
+    size_t line_num;
+
+    /* The position (byte count) of the beginning of the line,
+       relative to the file data pointer.  This starts at zero.  */
+    size_t start_pos;
+
+    /* The position (byte count) of the last byte of the line.  This
+       normally points to the '\n' character, or to one byte after the
+       last byte of the file, if the file doesn't contain a '\n'
+       character.  */
+    size_t end_pos;
+
+    line_info (size_t l, size_t s, size_t e)
+      : line_num (l), start_pos (s), end_pos (e)
+    {}
+
+    line_info ()
+      :line_num (0), start_pos (0), end_pos (0)
+    {}
+  };
+
+  /* The number of time this file has been accessed.  This is used
+     to designate which file cache to evict from the cache
+     array.  */
+  unsigned use_count;
+
+  const char *file_path;
+
+  FILE *fp;
+
+  /* This points to the content of the file that we've read so
+     far.  */
+  char *data;
+
+  /*  The size of the DATA array above.*/
+  size_t size;
+
+  /* The number of bytes read from the underlying file so far.  This
+     must be less (or equal) than SIZE above.  */
+  size_t nb_read;
+
+  /* The index of the beginning of the current line.  */
+  size_t line_start_idx;
+
+  /* The number of the previous line read.  This starts at 1.  Zero
+     means we've read no line so far.  */
+  size_t line_num;
+
+  /* This is the total number of lines of the current file.  At the
+     moment, we try to get this information from the line map
+     subsystem.  Note that this is just a hint.  When using the C++
+     front-end, this hint is correct because the input file is then
+     completely tokenized before parsing starts; so the line map knows
+     the number of lines before compilation really starts.  For e.g,
+     the C front-end, it can happen that we start emitting diagnostics
+     before the line map has seen the end of the file.  */
+  size_t total_lines;
+
+  /* This is a record of the beginning and end of the lines we've seen
+     while reading the file.  This is useful to avoid walking the data
+     from the beginning when we are asked to read a line that is
+     before LINE_START_IDX above.  Note that the maximum size of this
+     record is fcache_line_record_size, so that the memory consumption
+     doesn't explode.  We thus scale total_lines down to
+     fcache_line_record_size.  */
+  vec<line_info, va_heap> line_record;
+
+  fcache ();
+  ~fcache ();
+};
 
 /* Current position in real source file.  */
 
@@ -29,6 +109,11 @@ location_t input_location;
 
 struct line_maps *line_table;
 
+static fcache *fcache_tab;
+static const size_t fcache_tab_size = 16;
+static const size_t fcache_buffer_size = 4 * 1024;
+static const size_t fcache_line_record_size = 100;
+
 /* Expand the source location LOC into a human readable location.  If
    LOC resolves to a builtin location, the file name of the readable
    location is set to the string "<built-in>". If EXPANSION_POINT_P is
@@ -87,56 +172,546 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* Initialize the set of cache used for files accessed by caret
+   diagnostic.  */
+
+static void
+diagnostic_file_cache_init (void)
+{
+  if (fcache_tab == NULL)
+    fcache_tab = new fcache[fcache_tab_size];
+}
+
+/* Free the ressources used by the set of cache used for files accessed
+   by caret diagnostic.  */
+
+void
+diagnostic_file_cache_fini (void)
+{
+  if (fcache_tab)
+    {
+      delete [] (fcache_tab);
+      fcache_tab = NULL;
+    }
+}
+
+/* Return the total lines number that have been read so far by the
+   line map (in the preprocessor) so far.  For languages like C++ that
+   entirely preprocess the input file before starting to parse, this
+   equals the actual number of lines of the file.  */
+
+static size_t
+total_lines_num (const char *file_path)
+{
+  size_t r = 0;
+  source_location l = 0;
+  if (linemap_get_file_highest_location (line_table, file_path, &l))
+    {
+      gcc_assert (l >= RESERVED_LOCATION_COUNT);
+      expanded_location xloc = expand_location (l);
+      r = xloc.line;
+    }
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  Return the found cached file, or NULL if no
+   cached file was found.  */
+
+static fcache*
+lookup_file_in_cache_tab (const char *file_path)
+{
+  if (file_path == NULL)
+    return NULL;
+
+  diagnostic_file_cache_init ();
+
+  /* This will contain the found cached file.  */
+  fcache *r = NULL;
+  for (unsigned i = 0; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      if (c->file_path && !strcmp (c->file_path, file_path))
+	{
+	  ++c->use_count;
+	  r = c;
+	}
+    }
+
+  if (r)
+    ++r->use_count;
+
+  return r;
+}
+
+/* Return the file cache that has been less used, recently, or the
+   first empty one.  If HIGHEST_USE_COUNT is non-null,
+   *HIGHEST_USE_COUNT is set to the highest use count of the entries
+   in the cache table.  */
+
+static fcache*
+evicted_cache_tab_entry (unsigned *highest_use_count)
+{
+  diagnostic_file_cache_init ();
+
+  fcache *to_evict = &fcache_tab[0];
+  unsigned huc = to_evict->use_count;
+  for (unsigned i = 1; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      bool c_is_empty = (c->file_path == NULL);
+
+      if (c->use_count < to_evict->use_count
+	  || (to_evict->file_path && c_is_empty))
+	/* We evict C because it's either an entry with a lower use
+	   count or one that is empty.  */
+	to_evict = c;
+
+      if (huc < c->use_count)
+	huc = c->use_count;
+
+      if (c_is_empty)
+	/* We've reached the end of the cache; subsequent elements are
+	   all empty.  */
+	break;
+    }
+
+  if (highest_use_count)
+    *highest_use_count = huc;
+
+  return to_evict;
+}
+
+/* Create the cache used for the content of a given file to be
+   accessed by caret diagnostic.  This cache is added to an array of
+   cache and can be retrieved by lookup_file_in_cache_tab.  This
+   function returns the created cache.  Note that only the last
+   fcache_tab_size files are cached.  */
+
+static fcache*
+add_file_to_cache_tab (const char *file_path)
+{
+
+  FILE *fp = fopen (file_path, "r");
+  if (ferror (fp))
+    {
+      fclose (fp);
+      return NULL;
+    }
+
+  unsigned highest_use_count = 0;
+  fcache *r = evicted_cache_tab_entry (&highest_use_count);
+  r->file_path = file_path;
+  if (r->fp)
+    fclose (r->fp);
+  r->fp = fp;
+  r->nb_read = 0;
+  r->line_start_idx = 0;
+  r->line_num = 0;
+  r->line_record.truncate (0);
+  /* Ensure that this cache entry doesn't get evicted next time
+     add_file_to_cache_tab is called.  */
+  r->use_count = ++highest_use_count;
+  r->total_lines = total_lines_num (file_path);
+
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  If no cached file was found, create a new cache
+   for this file, add it to the array of cached file and return
+   it.  */
+
+static fcache*
+lookup_or_add_file_to_cache_tab (const char *file_path)
+{
+  fcache *r = lookup_file_in_cache_tab (file_path);
+  if (r == NULL)
+    r = add_file_to_cache_tab (file_path);
+  return r;
+}
+
+/* Default constructor for a cache of file used by caret
+   diagnostic.  */
+
+fcache::fcache ()
+: use_count (0), file_path (NULL), fp (NULL), data (0),
+  size (0), nb_read (0), line_start_idx (0), line_num (0),
+  total_lines (0)
+{
+  line_record.create (0);
+}
+
+/* Destructor for a cache of file used by caret diagnostic.  */
+
+fcache::~fcache ()
+{
+  if (fp)
+    {
+      fclose (fp);
+      fp = NULL;
+    }
+  if (data)
+    {
+      XDELETEVEC (data);
+      data = 0;
+    }
+  line_record.release ();
+}
+
+/* Returns TRUE iff the cache would need to be filled with data coming
+   from the file.  That is, either the cache is empty or full or the
+   current line is empty.  Note that if the cache is full, it would
+   need to be extended and filled again.  */
+
+static bool
+needs_read (fcache *c)
+{
+  return (c->nb_read == 0
+	  || c->nb_read == c->size
+	  || (c->line_start_idx >= c->nb_read - 1));
+}
+
+/*  Return TRUE iff the cache is full and thus needs to be
+    extended.  */
+
+static bool
+needs_grow (fcache *c)
+{
+  return c->nb_read == c->size;
+}
+
+/* Grow the cache if it needs to be extended.  */
+
+static void
+maybe_grow (fcache *c)
+{
+  if (!needs_grow (c))
+    return;
+
+  size_t size = c->size == 0 ? fcache_buffer_size : c->size * 2;
+  c->data = XRESIZEVEC (char, c->data, size + 1);
+  c->size = size;
+}
+
+/*  Read more data into the cache.  Extends the cache if need be.
+    Returns TRUE iff new data could be read.  */
+
+static bool
+read_data (fcache *c)
+{
+  if (feof (c->fp) || ferror (c->fp))
+    return false;
+
+  maybe_grow (c);
+
+  char * from = c->data + c->nb_read;
+  size_t to_read = c->size - c->nb_read;
+  size_t nb_read = fread (from, 1, to_read, c->fp);
+
+  if (ferror (c->fp))
+    return false;
+
+  c->nb_read += nb_read;
+  return !!nb_read;
+}
+
+/* Read new data iff the cache needs to be filled with more data
+   coming from the file FP.  Return TRUE iff the cache was filled with
+   mode data.  */
+
+static bool
+maybe_read_data (fcache *c)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  if (!needs_read (c))
+    return false;
+  return read_data (c);
+}
+
+/* Read a new line from file FP, using C as a cache for the data
+   coming from the file.  Upon successful completion, *LINE is set to
+   the beginning of the line found.  Space for that line has been
+   allocated in the cache thus *LINE has the same life time as C.
+   This function returns the length of the line, including the
+   terminal '\n' character.  Note that subsequent calls to
+   get_next_line return the next lines of the file and might overwrite
+   the content of *LINE.  */
+
+static ssize_t
+get_next_line (fcache *c, char **line)
+{
+  /* Fill the cache with data to process.  */
+  maybe_read_data (c);
+
+  size_t remaining_size = c->nb_read - c->line_start_idx;
+  if (remaining_size == 0)
+    /* There is no more data to process.  */
+    return 0;
+
+  char *line_start = c->data + c->line_start_idx;
 
-  if (!string_len)
+  char *next_line_start = NULL;
+  size_t line_len = 0;
+  char *line_end = (char *) memchr (line_start, '\n', remaining_size);
+  if (line_end == NULL)
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      /* We haven't found the end-of-line delimiter in the cache.
+	 Fill the cache with more data from the file and look for the
+	 '\n'.  */
+      while (maybe_read_data (c))
+	{
+	  line_start = c->data + c->line_start_idx;
+	  remaining_size = c->nb_read - c->line_start_idx;
+	  line_end = (char *) memchr (line_start, '\n', remaining_size);
+	  if (line_end != NULL)
+	    {
+	      next_line_start = line_end + 1;
+	      line_len = line_end - line_start + 1;
+	      break;
+	    }
+	}
+      if (line_end == NULL)
+	{
+	  /* We've loadded all the file into the cache and still no
+	     '\n'.  Let's say the line ends up at the byte after the
+	     last byte of the file.  */
+	  line_end = c->data + c->nb_read;
+	  line_len = c->nb_read - c->line_start_idx;
+	}
     }
+  else
+    {
+      next_line_start = line_end + 1;
+      line_len = line_end - line_start + 1;;
+    }
+
+  if (ferror (c->fp))
+    return -1;
+
+  /* At this point, we've found the end of the of line.  It either
+     points to the '\n' or to one byte after the last byte of the
+     file.  */
+  gcc_assert (line_end != NULL);
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  if (c->line_start_idx < c->nb_read)
+    *line = line_start;
+
+  gcc_assert (line_len > 0);
+
+  ++c->line_num;
+
+  /* Before we update our line record, make sure the hint about the
+     total number of lines of the file is correct.  If it's not, then
+     we give up recording line boundaries from now on.  */
+  bool update_line_record = true;
+  if (c->line_num > c->total_lines)
+    update_line_record = false;
+
+    /* Now update our line record so that re-reading lines from the
+     before c->line_start_idx is faster.  */
+  if (update_line_record
+      && c->line_record.length () < fcache_line_record_size)
     {
-      size_t len = strlen (string + pos);
+      /* If the file lines fits in the line record, we just record all
+	 its lines ...*/
+      if (c->total_lines <= fcache_line_record_size
+	  && c->line_num > c->line_record.length ())
+	c->line_record.safe_push (fcache::line_info (c->line_num,
+						 c->line_start_idx,
+						 line_end - c->data));
+      else if (c->total_lines > fcache_line_record_size)
+	{
+	  /* ... otherwise, we just scale total_lines down to
+	     (fcache_line_record_size lines.  */
+	  size_t n = (c->line_num * fcache_line_record_size) / c->total_lines;
+	  if (c->line_record.length () == 0
+	      || n >= c->line_record.length ())
+	    c->line_record.safe_push (fcache::line_info (c->line_num,
+						     c->line_start_idx,
+						     line_end - c->data));
+	}
+    }
+
+  /* Update c->line_start_idx so that it points to the next line to be
+     read.  */
+  if (next_line_start)
+    c->line_start_idx = next_line_start - c->data;
+  else
+    /* We didn't find any terminal '\n'.  Let's consider that the end
+       of line is the end of the data in the cache.  The next
+       invocation of get_next_line will either read more data from the
+       underlying file or return false early because we've reached the
+       end of the file.  */
+    c->line_start_idx = c->nb_read;
+
+  return line_len;
+}
 
-      if (string[pos + len - 1] == '\n')
+/* Reads the next line from FILE into *LINE.  If *LINE is too small
+   (or NULL) it is allocated (or extended) to have enough space to
+   containe the line.  *LINE_LENGTH must contain the size of the
+   initial*LINE buffer.  It's then updated by this function to the
+   actual length of the returned line.  Note that the returned line
+   can contain several zero bytes.  Also note that the returned string
+   is allocated in static storage that is going to be re-used by
+   subsequent invocations of read_line.  */
+
+static bool
+read_next_line (fcache *cache, char ** line, ssize_t *line_len)
+{
+  char *l = NULL;
+  ssize_t len = get_next_line (cache, &l);
+
+  if (len > 0)
+    {
+      if (*line == NULL)
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  *line = XNEWVEC (char, len);
+	  *line_len = len;
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
+      else
+	if (*line_len < len)
+	  *line = XRESIZEVEC (char, *line, len);
+
+      memmove (*line, l, len);
+      (*line)[len - 1] = '\0';
+      *line_len = --len;
+      return true;
     }
-      
-  return pos ? string : NULL;
+
+  return false;
+}
+
+/* Consume the next bytes coming from the cache (or from its
+   underlying file if there are remaining unread bytes in the file)
+   until we reach the next end-of-line (or end-of-file).  There is no
+   copying from the cache involved.  Return TRUE upon successful
+   completion.  */
+
+static bool
+goto_next_line (fcache *cache)
+{
+  char *l = NULL;
+  ssize_t len = get_next_line (cache, &l);
+  return (len > 0 );
+}
+
+/* Read an arbitrary line number LINE_NUM from the file cached in C.
+   The line is copied into *LINE.  *LINE_LEN must have been set to the
+   length of *LINE.  If *LINE is too small (or NULL) it's extended (or
+   allocated) and *LINE_LEN is adjusted accordingly.  *LINE ends up
+   with a terminal zero byte and can contain additional zero bytes.
+   This function returns bool if a line was read.  */
+
+static bool
+read_line_num (fcache *c, size_t line_num,
+	       char ** line, ssize_t *line_len)
+{
+  gcc_assert (line_num > 0);
+
+  if (line_num <= c->line_num)
+    {
+      /* We've been asked to read lines that are before c->line_num.
+	 So lets use our line record (if it's not empty) to try to
+	 avoid re-reading the file from the beginning again.  */
+
+      if (c->line_record.is_empty ())
+	{
+	  c->line_start_idx = 0;
+	  c->line_num = 0;
+	}
+      else
+	{
+	  fcache::line_info *i = NULL;
+	  if (c->total_lines <= fcache_line_record_size)
+	    {
+	      /* In languages where the input file is not totally
+		 preprocessed up front, the c->total_lines hint
+		 can be smaller than the number of lines of the
+		 file.  In that case, only the first
+		 c->total_lines have been recorded.
+
+		 Otherwise, the first c->total_lines we've read have
+		 their start/end recorded here.  */
+	      i = (line_num <= c->total_lines)
+		? &c->line_record[line_num - 1]
+		: &c->line_record[c->total_lines - 1];
+	      gcc_assert (i->line_num <= line_num);
+	    }
+	  else
+	    {
+	      /*  So the file had more lines than our line record
+		  size.  Thus the number of lines we've recorded has
+		  been scaled down to fcache_line_reacord_size.  Let's
+		  pick the start/end of the recorded line that is
+		  closest to line_num.  */
+	      size_t n = (line_num <= c->total_lines)
+		? line_num * fcache_line_record_size / c->total_lines
+		: c ->line_record.length () - 1;
+	      if (n < c->line_record.length ())
+		{
+		  i = &c->line_record[n];
+		  gcc_assert (i->line_num <= line_num);
+		}
+	    }
+
+	  if (i && i->line_num == line_num)
+	    {
+	      /* We have the start/end of the line.  Let's just copy
+		 it again and we are done.  */
+	      ssize_t len = i->end_pos - i->start_pos + 1;
+	      if (*line_len < len)
+		*line = XRESIZEVEC (char, *line, len);
+	      memmove (*line, c->data + i->start_pos, len);
+	      (*line)[len - 1] = '\0';
+	      *line_len = --len;
+	      return true;
+	    }
+
+	  if (i)
+	    {
+	      c->line_start_idx = i->start_pos;
+	      c->line_num = i->line_num - 1;
+	    }
+	  else
+	    {
+	      c->line_start_idx = 0;
+	      c->line_num = 0;
+	    }
+	}
+    }
+
+  /*  Let's walk from line c->line_num up to line_num - 1, without
+      copying any line.  */
+  while (c->line_num < line_num - 1)
+    if (!goto_next_line (c))
+      return false;
+
+  /* The line we want is the next one.  Let's read and copy it back to
+     the caller.  */
+  return read_next_line (c, line, line_len);
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
-  FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
-  if (!stream)
-    return NULL;
+  static char *buffer;
+  static ssize_t len;
+
+  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  bool read = read_line_num (c, xloc.line, &buffer, &len);
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  if (read && line_len)
+    *line_len = len;
 
-  fclose (stream);
-  return buffer;
+  return read ? buffer : NULL;
 }
 
 /* Expand the source location LOC into a human readable location.  If
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..c82023f 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
@@ -65,4 +66,6 @@ extern location_t input_location;
 
 void dump_line_table_statistics (void);
 
+void diagnostics_file_cache_fini (void);
+
 #endif
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index a0d6da1..3504fd6 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -756,6 +756,14 @@ struct linemap_stats
   long duplicated_macro_maps_locations_size;
 };
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+bool linemap_get_file_highest_location (struct line_maps * set,
+					const char *file_name,
+					source_location*LOC);
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 void linemap_get_statistics (struct line_maps *, struct linemap_stats *);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 2ad7ad2..98db486 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1502,6 +1502,46 @@ linemap_dump_location (struct line_maps *set,
 	   path, from, l, c, s, (void*)map, e, loc, location);
 }
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+
+bool
+linemap_get_file_highest_location (struct line_maps *set,
+				   const char *file_name,
+				   source_location *loc)
+{
+  /* If the set is empty or no ordinary map has been created then
+     there is no file to look for ...  */
+  if (set == NULL || set->info_ordinary.used == 0)
+    return false;
+
+  /* Now look for the last ordinary map created for FILE_NAME.  */
+  int i;
+  for (i = set->info_ordinary.used - 1; i >= 0; --i)
+    {
+      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
+      if (fname && !strcmp (fname, file_name))
+	break;
+    }
+
+  if (i < 0)
+    return false;
+
+  /* The highest location for a given map is either the starting
+     location of the next map minus one, or -- if the map is the
+     latest one -- the highest location of the set.  */
+  source_location result;
+  if (i == (int) set->info_ordinary.used - 1)
+    result = set->highest_location;
+  else
+    result = set->info_ordinary.maps[i + 1].start_location - 1;
+
+  *loc = result;
+  return true;
+}
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-12 16:42             ` Dodji Seketeli
@ 2013-11-13  5:10               ` Bernd Edlinger
  2013-11-13  9:40                 ` Dodji Seketeli
  2013-11-13  9:51               ` Jakub Jelinek
  1 sibling, 1 reply; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-13  5:10 UTC (permalink / raw)
  To: Dodji Seketeli, Jakub Jelinek
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez

Hi,

On Tue, 12 Nov 2013 16:33:41, Dodji Seketeli wrote:
>
> +/* Reads the next line from FILE into *LINE. If *LINE is too small
> + (or NULL) it is allocated (or extended) to have enough space to
> + containe the line. *LINE_LENGTH must contain the size of the
> + initial*LINE buffer. It's then updated by this function to the
> + actual length of the returned line. Note that the returned line
> + can contain several zero bytes. Also note that the returned string
> + is allocated in static storage that is going to be re-used by
> + subsequent invocations of read_line. */
> +
> +static bool
> +read_next_line (fcache *cache, char ** line, ssize_t *line_len)
> +{
> + char *l = NULL;
> + ssize_t len = get_next_line (cache, &l);
> +
> + if (len> 0)
> + {
> + if (*line == NULL)
> {
> - string[pos + len - 1] = 0;
> - return string;
> + *line = XNEWVEC (char, len);
> + *line_len = len;
> }
> - pos += len;
> - string = XRESIZEVEC (char, string, string_len * 2);
> - string_len *= 2;
> + else
> + if (*line_len < len)
> + *line = XRESIZEVEC (char, *line, len);
> +
> + memmove (*line, l, len);
> + (*line)[len - 1] = '\0';
> + *line_len = --len;

Generally, I would prefer to use memcpy,
if it is clear that the memory does not overlap.

You copy one char too much and set it to zero?

Using -- on a value that goes out of scope looks
awkward IMHO.

Bernd.

> + return true;
> }
> -
> - return pos ? string : NULL;
> +
> + return false;
> +} 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-13  5:10               ` Bernd Edlinger
@ 2013-11-13  9:40                 ` Dodji Seketeli
  2013-11-13  9:43                   ` Bernd Edlinger
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-13  9:40 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:


>> + memmove (*line, l, len);
>> + (*line)[len - 1] = '\0';
>> + *line_len = --len;
>
> Generally, I would prefer to use memcpy,
> if it is clear that the memory does not overlap.

I don't mind.  I'll change that in my local copy.  Thanks.

> You copy one char too much and set it to zero?

It's not one char too much.  That char is the terminal '\n' in most
cases.

> Using -- on a value that goes out of scope looks
> awkward IMHO.

I don't understand this sentence.  What do you mean by "Using -- on a
value that goes out of scope"?

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-13  9:40                 ` Dodji Seketeli
@ 2013-11-13  9:43                   ` Bernd Edlinger
  2013-11-13  9:49                     ` Dodji Seketeli
  2013-11-13  9:49                     ` Dodji Seketeli
  0 siblings, 2 replies; 46+ messages in thread
From: Bernd Edlinger @ 2013-11-13  9:43 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

>
>>> + memmove (*line, l, len);
>>> + (*line)[len - 1] = '\0';
>>> + *line_len = --len;
>>
>> Generally, I would prefer to use memcpy,
>> if it is clear that the memory does not overlap.
>
> I don't mind. I'll change that in my local copy. Thanks.
>
>> You copy one char too much and set it to zero?
>
> It's not one char too much. That char is the terminal '\n' in most
> cases.
>

and what is it if there is no terminal '\n' ?

>> Using -- on a value that goes out of scope looks
>> awkward IMHO.
>
> I don't understand this sentence. What do you mean by "Using -- on a
> value that goes out of scope"?
>

I meant the operator --  in  *line_len = --len;

Maybe, You could also avoid the copying completely, if you just hand out
a pointer to the line buffer as const char*, and use the length instead of the
nul-char as end delimiter ?

Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-13  9:43                   ` Bernd Edlinger
  2013-11-13  9:49                     ` Dodji Seketeli
@ 2013-11-13  9:49                     ` Dodji Seketeli
  1 sibling, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-13  9:49 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Sorry, I missed one question in the previous email.

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> and what is it if there is no terminal '\n' ?

In that case it's that the entire file is made of one line.  For that
case get_next_line has allocated enough space for one
byte-passed-the-end of the file, so that there is no buffer overflow
here.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-13  9:43                   ` Bernd Edlinger
@ 2013-11-13  9:49                     ` Dodji Seketeli
  2013-11-13  9:49                     ` Dodji Seketeli
  1 sibling, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-13  9:49 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

>>> Using -- on a value that goes out of scope looks
>>> awkward IMHO.
>>
>> I don't understand this sentence. What do you mean by "Using -- on a
>> value that goes out of scope"?
>>
>
> I meant the operator --  in  *line_len = --len;

Sorry, I don't see how that is an issue.  This looks like a classical
way of passing an output parameter to me.

> Maybe, You could also avoid the copying completely, if you just hand out
> a pointer to the line buffer as const char*, and use the length instead of the
> nul-char as end delimiter ?

I thought about avoiding the copying of course.  But the issue with that
is that that ties the lifetime of the returned line to the time between
two invocations of read_next_line.  IOW, you'd have to use the line
"quickly" before calling read_next_line again.  Actually that
non-copying API that you are talking about exists in the patch; it's
get_next_line.  And you see that it's what we use when we want to avoid
the copying, e.g, in goto_next_line.  But when we want to give the
"final" user the string, I believe that copying is less surprising.  And
from what I could see from the tests I have done, the copying doesn't
make the thing slower than without the patch.  So I'd like to keep this
unless folks have very strong feeling about it.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-12 16:42             ` Dodji Seketeli
  2013-11-13  5:10               ` Bernd Edlinger
@ 2013-11-13  9:51               ` Jakub Jelinek
  2013-11-14 15:12                 ` Dodji Seketeli
  1 sibling, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2013-11-13  9:51 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote:
> +
> +      memmove (*line, l, len);
> +      (*line)[len - 1] = '\0';
> +      *line_len = --len;

Shouldn't this be testing that len > 0 && (*line)[len - 1] == '\n'
first before you decide to overwrite it and decrement len?
Though in that case there would be no '\0' termination of the string
for files not ending in a new-line.  So, either get_next_line should
append '\n' to the buffer, or you should have there space for that, or
you can't rely on zero termination of the string and need to use just
the length.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-13  9:51               ` Jakub Jelinek
@ 2013-11-14 15:12                 ` Dodji Seketeli
  2013-12-09 20:11                   ` Tom Tromey
  2014-01-21 12:28                   ` Bernd Edlinger
  0 siblings, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2013-11-14 15:12 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez,
	Bernd Edlinger

Jakub Jelinek <jakub@redhat.com> writes:

> On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote:
>> +
>> +      memmove (*line, l, len);
>> +      (*line)[len - 1] = '\0';
>> +      *line_len = --len;
>
> Shouldn't this be testing that len > 0 && (*line)[len - 1] == '\n'
> first before you decide to overwrite it and decrement len?

That code above is in a if (len > 0) block.  So checking that condition
again is not necessary.  Also, I think we don't need to test there is a
terminal '\n' at the end because get_next_line always return the line
content followed either by a '\n' or by a "junk byte" that is right
after the last byte of the file -- in case we reach end of file w/o
seeing a '\n'.

> Though in that case there would be no '\0' termination of the string
> for files not ending in a new-line.  So, either get_next_line should
> append '\n' to the buffer, or you should have there space for that, or
> you can't rely on zero termination of the string and need to use just
> the length.

OK, I am settling for doing away with the '\0' altogether.

The patch below makes get_next_line always point to the last character
of the line before the '\n' when it is present.  So '\n' is never
counted int the string.  I guess that's less confusing to people.

Tested on x86_64-unknown-linux-gnu against trunk.

libcpp/ChangeLog:

	* include/line-map.h (linemap_get_file_highest_location): Declare
	new function.
	* line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	(void diagnostics_file_cache_fini): Declare new function.
	* input.c (struct fcache): New type.
	(fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
	New static constants.
	(diagnostic_file_cache_init, total_lines_num)
	(lookup_file_in_cache_tab, evicted_cache_tab_entry)
	(add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
	(needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
	(get_next_line, read_next_line, goto_next_line, read_line_num):
	New static function definitions.
	(diagnostic_file_cache_fini): New function.
	(location_get_source_line): Take an additional output line_len
	parameter.  Re-write using lookup_or_add_file_to_cache_tab and
	read_line_num.
	* diagnostic.c (diagnostic_finish): Call
	diagnostic_file_cache_fini.
	(adjust_line): Take an additional input parameter for the length
	of the line, rather than calculating it with strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.
	* Makefile.in: Add vec.o to OBJS-libcommon

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/Makefile.in                                    |   3 +-
 gcc/diagnostic.c                                   |  19 +-
 gcc/diagnostic.h                                   |   1 +
 gcc/input.c                                        | 633 ++++++++++++++++++++-
 gcc/input.h                                        |   5 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 libcpp/include/line-map.h                          |   8 +
 libcpp/line-map.c                                  |  40 ++
 8 files changed, 670 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 49285e5..9fe9060 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1469,7 +1469,8 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \
+	vec.o  input.o version.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..6c83f03 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context)
 		     progname);
       pp_newline_and_flush (context->printer);
     }
+
+  diagnostic_file_cache_fini ();
 }
 
 /* Initialize DIAGNOSTIC, where the message MSG has already been
@@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index cb38d37..3f30e06 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -291,6 +291,7 @@ void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
 void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
 void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
 
+void diagnostic_file_cache_fini (void);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..1a5d014 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -22,6 +22,86 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "intl.h"
 #include "input.h"
+#include "vec.h"
+
+/* This is a cache used by get_next_line to store the content of a
+   file to be searched for file lines.  */
+struct fcache
+{
+  /* These are information used to store a line boundary.  */
+  struct line_info
+  {
+    /* The line number.  It starts from 1.  */
+    size_t line_num;
+
+    /* The position (byte count) of the beginning of the line,
+       relative to the file data pointer.  This starts at zero.  */
+    size_t start_pos;
+
+    /* The position (byte count) of the last byte of the line.  This
+       normally points to the '\n' character, or to one byte after the
+       last byte of the file, if the file doesn't contain a '\n'
+       character.  */
+    size_t end_pos;
+
+    line_info (size_t l, size_t s, size_t e)
+      : line_num (l), start_pos (s), end_pos (e)
+    {}
+
+    line_info ()
+      :line_num (0), start_pos (0), end_pos (0)
+    {}
+  };
+
+  /* The number of time this file has been accessed.  This is used
+     to designate which file cache to evict from the cache
+     array.  */
+  unsigned use_count;
+
+  const char *file_path;
+
+  FILE *fp;
+
+  /* This points to the content of the file that we've read so
+     far.  */
+  char *data;
+
+  /*  The size of the DATA array above.*/
+  size_t size;
+
+  /* The number of bytes read from the underlying file so far.  This
+     must be less (or equal) than SIZE above.  */
+  size_t nb_read;
+
+  /* The index of the beginning of the current line.  */
+  size_t line_start_idx;
+
+  /* The number of the previous line read.  This starts at 1.  Zero
+     means we've read no line so far.  */
+  size_t line_num;
+
+  /* This is the total number of lines of the current file.  At the
+     moment, we try to get this information from the line map
+     subsystem.  Note that this is just a hint.  When using the C++
+     front-end, this hint is correct because the input file is then
+     completely tokenized before parsing starts; so the line map knows
+     the number of lines before compilation really starts.  For e.g,
+     the C front-end, it can happen that we start emitting diagnostics
+     before the line map has seen the end of the file.  */
+  size_t total_lines;
+
+  /* This is a record of the beginning and end of the lines we've seen
+     while reading the file.  This is useful to avoid walking the data
+     from the beginning when we are asked to read a line that is
+     before LINE_START_IDX above.  Note that the maximum size of this
+     record is fcache_line_record_size, so that the memory consumption
+     doesn't explode.  We thus scale total_lines down to
+     fcache_line_record_size.  */
+  vec<line_info, va_heap> line_record;
+
+  fcache ();
+  ~fcache ();
+};
 
 /* Current position in real source file.  */
 
@@ -29,6 +109,11 @@ location_t input_location;
 
 struct line_maps *line_table;
 
+static fcache *fcache_tab;
+static const size_t fcache_tab_size = 16;
+static const size_t fcache_buffer_size = 4 * 1024;
+static const size_t fcache_line_record_size = 100;
+
 /* Expand the source location LOC into a human readable location.  If
    LOC resolves to a builtin location, the file name of the readable
    location is set to the string "<built-in>". If EXPANSION_POINT_P is
@@ -87,56 +172,542 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* Initialize the set of cache used for files accessed by caret
+   diagnostic.  */
+
+static void
+diagnostic_file_cache_init (void)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  if (fcache_tab == NULL)
+    fcache_tab = new fcache[fcache_tab_size];
+}
 
-  if (!string_len)
+/* Free the ressources used by the set of cache used for files accessed
+   by caret diagnostic.  */
+
+void
+diagnostic_file_cache_fini (void)
+{
+  if (fcache_tab)
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      delete [] (fcache_tab);
+      fcache_tab = NULL;
     }
+}
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+/* Return the total lines number that have been read so far by the
+   line map (in the preprocessor) so far.  For languages like C++ that
+   entirely preprocess the input file before starting to parse, this
+   equals the actual number of lines of the file.  */
+
+static size_t
+total_lines_num (const char *file_path)
+{
+  size_t r = 0;
+  source_location l = 0;
+  if (linemap_get_file_highest_location (line_table, file_path, &l))
     {
-      size_t len = strlen (string + pos);
+      gcc_assert (l >= RESERVED_LOCATION_COUNT);
+      expanded_location xloc = expand_location (l);
+      r = xloc.line;
+    }
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  Return the found cached file, or NULL if no
+   cached file was found.  */
+
+static fcache*
+lookup_file_in_cache_tab (const char *file_path)
+{
+  if (file_path == NULL)
+    return NULL;
 
-      if (string[pos + len - 1] == '\n')
+  diagnostic_file_cache_init ();
+
+  /* This will contain the found cached file.  */
+  fcache *r = NULL;
+  for (unsigned i = 0; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      if (c->file_path && !strcmp (c->file_path, file_path))
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  ++c->use_count;
+	  r = c;
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
     }
-      
-  return pos ? string : NULL;
+
+  if (r)
+    ++r->use_count;
+
+  return r;
+}
+
+/* Return the file cache that has been less used, recently, or the
+   first empty one.  If HIGHEST_USE_COUNT is non-null,
+   *HIGHEST_USE_COUNT is set to the highest use count of the entries
+   in the cache table.  */
+
+static fcache*
+evicted_cache_tab_entry (unsigned *highest_use_count)
+{
+  diagnostic_file_cache_init ();
+
+  fcache *to_evict = &fcache_tab[0];
+  unsigned huc = to_evict->use_count;
+  for (unsigned i = 1; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      bool c_is_empty = (c->file_path == NULL);
+
+      if (c->use_count < to_evict->use_count
+	  || (to_evict->file_path && c_is_empty))
+	/* We evict C because it's either an entry with a lower use
+	   count or one that is empty.  */
+	to_evict = c;
+
+      if (huc < c->use_count)
+	huc = c->use_count;
+
+      if (c_is_empty)
+	/* We've reached the end of the cache; subsequent elements are
+	   all empty.  */
+	break;
+    }
+
+  if (highest_use_count)
+    *highest_use_count = huc;
+
+  return to_evict;
+}
+
+/* Create the cache used for the content of a given file to be
+   accessed by caret diagnostic.  This cache is added to an array of
+   cache and can be retrieved by lookup_file_in_cache_tab.  This
+   function returns the created cache.  Note that only the last
+   fcache_tab_size files are cached.  */
+
+static fcache*
+add_file_to_cache_tab (const char *file_path)
+{
+
+  FILE *fp = fopen (file_path, "r");
+  if (ferror (fp))
+    {
+      fclose (fp);
+      return NULL;
+    }
+
+  unsigned highest_use_count = 0;
+  fcache *r = evicted_cache_tab_entry (&highest_use_count);
+  r->file_path = file_path;
+  if (r->fp)
+    fclose (r->fp);
+  r->fp = fp;
+  r->nb_read = 0;
+  r->line_start_idx = 0;
+  r->line_num = 0;
+  r->line_record.truncate (0);
+  /* Ensure that this cache entry doesn't get evicted next time
+     add_file_to_cache_tab is called.  */
+  r->use_count = ++highest_use_count;
+  r->total_lines = total_lines_num (file_path);
+
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  If no cached file was found, create a new cache
+   for this file, add it to the array of cached file and return
+   it.  */
+
+static fcache*
+lookup_or_add_file_to_cache_tab (const char *file_path)
+{
+  fcache *r = lookup_file_in_cache_tab (file_path);
+  if (r == NULL)
+    r = add_file_to_cache_tab (file_path);
+  return r;
+}
+
+/* Default constructor for a cache of file used by caret
+   diagnostic.  */
+
+fcache::fcache ()
+: use_count (0), file_path (NULL), fp (NULL), data (0),
+  size (0), nb_read (0), line_start_idx (0), line_num (0),
+  total_lines (0)
+{
+  line_record.create (0);
+}
+
+/* Destructor for a cache of file used by caret diagnostic.  */
+
+fcache::~fcache ()
+{
+  if (fp)
+    {
+      fclose (fp);
+      fp = NULL;
+    }
+  if (data)
+    {
+      XDELETEVEC (data);
+      data = 0;
+    }
+  line_record.release ();
+}
+
+/* Returns TRUE iff the cache would need to be filled with data coming
+   from the file.  That is, either the cache is empty or full or the
+   current line is empty.  Note that if the cache is full, it would
+   need to be extended and filled again.  */
+
+static bool
+needs_read (fcache *c)
+{
+  return (c->nb_read == 0
+	  || c->nb_read == c->size
+	  || (c->line_start_idx >= c->nb_read - 1));
+}
+
+/*  Return TRUE iff the cache is full and thus needs to be
+    extended.  */
+
+static bool
+needs_grow (fcache *c)
+{
+  return c->nb_read == c->size;
+}
+
+/* Grow the cache if it needs to be extended.  */
+
+static void
+maybe_grow (fcache *c)
+{
+  if (!needs_grow (c))
+    return;
+
+  size_t size = c->size == 0 ? fcache_buffer_size : c->size * 2;
+  c->data = XRESIZEVEC (char, c->data, size + 1);
+  c->size = size;
+}
+
+/*  Read more data into the cache.  Extends the cache if need be.
+    Returns TRUE iff new data could be read.  */
+
+static bool
+read_data (fcache *c)
+{
+  if (feof (c->fp) || ferror (c->fp))
+    return false;
+
+  maybe_grow (c);
+
+  char * from = c->data + c->nb_read;
+  size_t to_read = c->size - c->nb_read;
+  size_t nb_read = fread (from, 1, to_read, c->fp);
+
+  if (ferror (c->fp))
+    return false;
+
+  c->nb_read += nb_read;
+  return !!nb_read;
+}
+
+/* Read new data iff the cache needs to be filled with more data
+   coming from the file FP.  Return TRUE iff the cache was filled with
+   mode data.  */
+
+static bool
+maybe_read_data (fcache *c)
+{
+  if (!needs_read (c))
+    return false;
+  return read_data (c);
+}
+
+/* Read a new line from file FP, using C as a cache for the data
+   coming from the file.  Upon successful completion, *LINE is set to
+   the beginning of the line found.  Space for that line has been
+   allocated in the cache thus *LINE has the same life time as C.
+   *LINE_LEN is set to the length of the line.  Note that the line
+   does not contain any terminal delimiter.  This function returns
+   true if some data was read or process from the cache, false
+   otherwise.  Note that subsequent calls to get_next_line return the
+   next lines of the file and might overwrite the content of
+   *LINE.  */
+
+static bool
+get_next_line (fcache *c, char **line, ssize_t *line_len)
+{
+  /* Fill the cache with data to process.  */
+  maybe_read_data (c);
+
+  size_t remaining_size = c->nb_read - c->line_start_idx;
+  if (remaining_size == 0)
+    /* There is no more data to process.  */
+    return false;
+
+  char *line_start = c->data + c->line_start_idx;
+
+  char *next_line_start = NULL;
+  size_t len = 0;
+  char *line_end = (char *) memchr (line_start, '\n', remaining_size);
+  if (line_end == NULL)
+    {
+      /* We haven't found the end-of-line delimiter in the cache.
+	 Fill the cache with more data from the file and look for the
+	 '\n'.  */
+      while (maybe_read_data (c))
+	{
+	  line_start = c->data + c->line_start_idx;
+	  remaining_size = c->nb_read - c->line_start_idx;
+	  line_end = (char *) memchr (line_start, '\n', remaining_size);
+	  if (line_end != NULL)
+	    {
+	      next_line_start = line_end + 1;
+	      break;
+	    }
+	}
+      if (line_end == NULL)
+	/* We've loadded all the file into the cache and still no
+	   '\n'.  Let's say the line ends up at one byte passed the
+	   end of the file.  This is to stay consistent with the case
+	   of when the line ends up with a '\n' and line_end points to
+	   that terminal '\n'.  That consistency is useful below in
+	   the len calculation.  */
+	line_end = c->data + c->nb_read ;
+    }
+  else
+    next_line_start = line_end + 1;
+
+  if (ferror (c->fp))
+    return -1;
+
+  /* At this point, we've found the end of the of line.  It either
+     points to the '\n' or to one byte after the last byte of the
+     file.  */
+  gcc_assert (line_end != NULL);
+
+  len = line_end - line_start;
+
+  if (c->line_start_idx < c->nb_read)
+    *line = line_start;
+
+  ++c->line_num;
+
+  /* Before we update our line record, make sure the hint about the
+     total number of lines of the file is correct.  If it's not, then
+     we give up recording line boundaries from now on.  */
+  bool update_line_record = true;
+  if (c->line_num > c->total_lines)
+    update_line_record = false;
+
+    /* Now update our line record so that re-reading lines from the
+     before c->line_start_idx is faster.  */
+  if (update_line_record
+      && c->line_record.length () < fcache_line_record_size)
+    {
+      /* If the file lines fits in the line record, we just record all
+	 its lines ...*/
+      if (c->total_lines <= fcache_line_record_size
+	  && c->line_num > c->line_record.length ())
+	c->line_record.safe_push (fcache::line_info (c->line_num,
+						 c->line_start_idx,
+						 line_end - c->data));
+      else if (c->total_lines > fcache_line_record_size)
+	{
+	  /* ... otherwise, we just scale total_lines down to
+	     (fcache_line_record_size lines.  */
+	  size_t n = (c->line_num * fcache_line_record_size) / c->total_lines;
+	  if (c->line_record.length () == 0
+	      || n >= c->line_record.length ())
+	    c->line_record.safe_push (fcache::line_info (c->line_num,
+						     c->line_start_idx,
+						     line_end - c->data));
+	}
+    }
+
+  /* Update c->line_start_idx so that it points to the next line to be
+     read.  */
+  if (next_line_start)
+    c->line_start_idx = next_line_start - c->data;
+  else
+    /* We didn't find any terminal '\n'.  Let's consider that the end
+       of line is the end of the data in the cache.  The next
+       invocation of get_next_line will either read more data from the
+       underlying file or return false early because we've reached the
+       end of the file.  */
+    c->line_start_idx = c->nb_read;
+
+  *line_len = len;
+
+  return true;
+}
+
+/* Reads the next line from FILE into *LINE.  If *LINE is too small
+   (or NULL) it is allocated (or extended) to have enough space to
+   containe the line.  *LINE_LENGTH must contain the size of the
+   initial*LINE buffer.  It's then updated by this function to the
+   actual length of the returned line.  Note that the returned line
+   can contain several zero bytes.  Also note that the returned string
+   is allocated in static storage that is going to be re-used by
+   subsequent invocations of read_line.  */
+
+static bool
+read_next_line (fcache *cache, char ** line, ssize_t *line_len)
+{
+  char *l = NULL;
+  ssize_t len = 0;
+
+  if (!get_next_line (cache, &l, &len))
+    return false;
+
+  if (*line == NULL)
+    *line = XNEWVEC (char, len);
+  else
+    if (*line_len < len)
+	*line = XRESIZEVEC (char, *line, len);
+
+  memcpy (*line, l, len);
+  *line_len = len;
+
+  return true;
+}
+
+/* Consume the next bytes coming from the cache (or from its
+   underlying file if there are remaining unread bytes in the file)
+   until we reach the next end-of-line (or end-of-file).  There is no
+   copying from the cache involved.  Return TRUE upon successful
+   completion.  */
+
+static bool
+goto_next_line (fcache *cache)
+{
+  char *l;
+  ssize_t len;
+
+  return get_next_line (cache, &l, &len);
+}
+
+/* Read an arbitrary line number LINE_NUM from the file cached in C.
+   The line is copied into *LINE.  *LINE_LEN must have been set to the
+   length of *LINE.  If *LINE is too small (or NULL) it's extended (or
+   allocated) and *LINE_LEN is adjusted accordingly.  *LINE ends up
+   with a terminal zero byte and can contain additional zero bytes.
+   This function returns bool if a line was read.  */
+
+static bool
+read_line_num (fcache *c, size_t line_num,
+	       char ** line, ssize_t *line_len)
+{
+  gcc_assert (line_num > 0);
+
+  if (line_num <= c->line_num)
+    {
+      /* We've been asked to read lines that are before c->line_num.
+	 So lets use our line record (if it's not empty) to try to
+	 avoid re-reading the file from the beginning again.  */
+
+      if (c->line_record.is_empty ())
+	{
+	  c->line_start_idx = 0;
+	  c->line_num = 0;
+	}
+      else
+	{
+	  fcache::line_info *i = NULL;
+	  if (c->total_lines <= fcache_line_record_size)
+	    {
+	      /* In languages where the input file is not totally
+		 preprocessed up front, the c->total_lines hint
+		 can be smaller than the number of lines of the
+		 file.  In that case, only the first
+		 c->total_lines have been recorded.
+
+		 Otherwise, the first c->total_lines we've read have
+		 their start/end recorded here.  */
+	      i = (line_num <= c->total_lines)
+		? &c->line_record[line_num - 1]
+		: &c->line_record[c->total_lines - 1];
+	      gcc_assert (i->line_num <= line_num);
+	    }
+	  else
+	    {
+	      /*  So the file had more lines than our line record
+		  size.  Thus the number of lines we've recorded has
+		  been scaled down to fcache_line_reacord_size.  Let's
+		  pick the start/end of the recorded line that is
+		  closest to line_num.  */
+	      size_t n = (line_num <= c->total_lines)
+		? line_num * fcache_line_record_size / c->total_lines
+		: c ->line_record.length () - 1;
+	      if (n < c->line_record.length ())
+		{
+		  i = &c->line_record[n];
+		  gcc_assert (i->line_num <= line_num);
+		}
+	    }
+
+	  if (i && i->line_num == line_num)
+	    {
+	      /* We have the start/end of the line.  Let's just copy
+		 it again and we are done.  */
+	      ssize_t len = i->end_pos - i->start_pos + 1;
+	      if (*line_len < len)
+		*line = XRESIZEVEC (char, *line, len);
+	      memmove (*line, c->data + i->start_pos, len);
+	      (*line)[len - 1] = '\0';
+	      *line_len = --len;
+	      return true;
+	    }
+
+	  if (i)
+	    {
+	      c->line_start_idx = i->start_pos;
+	      c->line_num = i->line_num - 1;
+	    }
+	  else
+	    {
+	      c->line_start_idx = 0;
+	      c->line_num = 0;
+	    }
+	}
+    }
+
+  /*  Let's walk from line c->line_num up to line_num - 1, without
+      copying any line.  */
+  while (c->line_num < line_num - 1)
+    if (!goto_next_line (c))
+      return false;
+
+  /* The line we want is the next one.  Let's read and copy it back to
+     the caller.  */
+  return read_next_line (c, line, line_len);
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
-  FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
-  if (!stream)
-    return NULL;
+  static char *buffer;
+  static ssize_t len;
+
+  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  bool read = read_line_num (c, xloc.line, &buffer, &len);
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  if (read && line_len)
+    *line_len = len;
 
-  fclose (stream);
-  return buffer;
+  return read ? buffer : NULL;
 }
 
 /* Expand the source location LOC into a human readable location.  If
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..c82023f 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
@@ -65,4 +66,6 @@ extern location_t input_location;
 
 void dump_line_table_statistics (void);
 
+void diagnostics_file_cache_fini (void);
+
 #endif
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index a0d6da1..3504fd6 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -756,6 +756,14 @@ struct linemap_stats
   long duplicated_macro_maps_locations_size;
 };
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+bool linemap_get_file_highest_location (struct line_maps * set,
+					const char *file_name,
+					source_location*LOC);
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 void linemap_get_statistics (struct line_maps *, struct linemap_stats *);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 2ad7ad2..98db486 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1502,6 +1502,46 @@ linemap_dump_location (struct line_maps *set,
 	   path, from, l, c, s, (void*)map, e, loc, location);
 }
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+
+bool
+linemap_get_file_highest_location (struct line_maps *set,
+				   const char *file_name,
+				   source_location *loc)
+{
+  /* If the set is empty or no ordinary map has been created then
+     there is no file to look for ...  */
+  if (set == NULL || set->info_ordinary.used == 0)
+    return false;
+
+  /* Now look for the last ordinary map created for FILE_NAME.  */
+  int i;
+  for (i = set->info_ordinary.used - 1; i >= 0; --i)
+    {
+      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
+      if (fname && !strcmp (fname, file_name))
+	break;
+    }
+
+  if (i < 0)
+    return false;
+
+  /* The highest location for a given map is either the starting
+     location of the next map minus one, or -- if the map is the
+     latest one -- the highest location of the set.  */
+  source_location result;
+  if (i == (int) set->info_ordinary.used - 1)
+    result = set->highest_location;
+  else
+    result = set->info_ordinary.maps[i + 1].start_location - 1;
+
+  *loc = result;
+  return true;
+}
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-14 15:12                 ` Dodji Seketeli
@ 2013-12-09 20:11                   ` Tom Tromey
  2014-01-21 12:28                   ` Bernd Edlinger
  1 sibling, 0 replies; 46+ messages in thread
From: Tom Tromey @ 2013-12-09 20:11 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, GCC Patches, Manuel López-Ibáñez,
	Bernd Edlinger

>>>>> "Dodji" == Dodji Seketeli <dodji@redhat.com> writes:

Dodji> 	* include/line-map.h (linemap_get_file_highest_location): Declare
Dodji> 	new function.
Dodji> 	* line-map.c (linemap_get_file_highest_location): Define it.

I wasn't sure if this is the patch you were needing review for ...

Dodji> +bool linemap_get_file_highest_location (struct line_maps * set,
Dodji> +					const char *file_name,
Dodji> +					source_location*LOC);

The spacing is slight off -- one too many before "set", one too few
before LOC.  And LOC presumably shouldn't be uppercase here.

Dodji> +      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
Dodji> +      if (fname && !strcmp (fname, file_name))

Other spots in this code use filename_cmp.

Otherwise the libcpp bits look ok to me.

Tom

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-11-14 15:12                 ` Dodji Seketeli
  2013-12-09 20:11                   ` Tom Tromey
@ 2014-01-21 12:28                   ` Bernd Edlinger
  2014-01-22  8:16                     ` Dodji Seketeli
  1 sibling, 1 reply; 46+ messages in thread
From: Bernd Edlinger @ 2014-01-21 12:28 UTC (permalink / raw)
  To: Dodji Seketeli, Jakub Jelinek
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez

[-- Attachment #1: Type: text/plain, Size: 32107 bytes --]

Hi,

since there was no progress in the last 2 months on that matter,
and we are quite late in Phase 3 now,
I dare to propose an alternative, very simple solution for this bug now.

It does not try to improve or degade the perfomance at all, instead it simply
detects binary files with embedded NULs and stops parsing at that point.

Boot-strapped and regression-tested on X86_64-linux-gnu.
Ok for trunk?


Bernd.



On Thu, 14 Nov 2013 15:01:59, Dodji Seketeli wrote:
>
> Jakub Jelinek <jakub@redhat.com> writes:
>
>> On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote:
>>> +
>>> + memmove (*line, l, len);
>>> + (*line)[len - 1] = '\0';
>>> + *line_len = --len;
>>
>> Shouldn't this be testing that len> 0 && (*line)[len - 1] == '\n'
>> first before you decide to overwrite it and decrement len?
>
> That code above is in a if (len> 0) block. So checking that condition
> again is not necessary. Also, I think we don't need to test there is a
> terminal '\n' at the end because get_next_line always return the line
> content followed either by a '\n' or by a "junk byte" that is right
> after the last byte of the file -- in case we reach end of file w/o
> seeing a '\n'.
>
>> Though in that case there would be no '\0' termination of the string
>> for files not ending in a new-line. So, either get_next_line should
>> append '\n' to the buffer, or you should have there space for that, or
>> you can't rely on zero termination of the string and need to use just
>> the length.
>
> OK, I am settling for doing away with the '\0' altogether.
>
> The patch below makes get_next_line always point to the last character
> of the line before the '\n' when it is present. So '\n' is never
> counted int the string. I guess that's less confusing to people.
>
> Tested on x86_64-unknown-linux-gnu against trunk.
>
> libcpp/ChangeLog:
>
> * include/line-map.h (linemap_get_file_highest_location): Declare
> new function.
> * line-map.c (linemap_get_file_highest_location): Define it.
>
> gcc/ChangeLog:
>
> * input.h (location_get_source_line): Take an additional line_size
> parameter.
> (void diagnostics_file_cache_fini): Declare new function.
> * input.c (struct fcache): New type.
> (fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
> New static constants.
> (diagnostic_file_cache_init, total_lines_num)
> (lookup_file_in_cache_tab, evicted_cache_tab_entry)
> (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
> (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
> (get_next_line, read_next_line, goto_next_line, read_line_num):
> New static function definitions.
> (diagnostic_file_cache_fini): New function.
> (location_get_source_line): Take an additional output line_len
> parameter. Re-write using lookup_or_add_file_to_cache_tab and
> read_line_num.
> * diagnostic.c (diagnostic_finish): Call
> diagnostic_file_cache_fini.
> (adjust_line): Take an additional input parameter for the length
> of the line, rather than calculating it with strlen.
> (diagnostic_show_locus): Adjust the use of
> location_get_source_line and adjust_line with respect to their new
> signature. While displaying a line now, do not stop at the first
> null byte. Rather, display the zero byte as a space and keep
> going until we reach the size of the line.
> * Makefile.in: Add vec.o to OBJS-libcommon
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
>
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4
> ---
> gcc/Makefile.in | 3 +-
> gcc/diagnostic.c | 19 +-
> gcc/diagnostic.h | 1 +
> gcc/input.c | 633 ++++++++++++++++++++-
> gcc/input.h | 5 +-
> .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes
> libcpp/include/line-map.h | 8 +
> libcpp/line-map.c | 40 ++
> 8 files changed, 670 insertions(+), 39 deletions(-)
> create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 49285e5..9fe9060 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1469,7 +1469,8 @@ OBJS = \
>
> # Objects in libcommon.a, potentially used by all host binaries and with
> # no target dependencies.
> -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
> +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \
> + vec.o input.o version.o
>
> # Objects in libcommon-target.a, used by drivers and by the core
> # compiler and containing target-dependent code.
> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index 36094a1..6c83f03 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context)
> progname);
> pp_newline_and_flush (context->printer);
> }
> +
> + diagnostic_file_cache_fini ();
> }
>
> /* Initialize DIAGNOSTIC, where the message MSG has already been
> @@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context,
> MAX_WIDTH by some margin, then adjust the start of the line such
> that the COLUMN is smaller than MAX_WIDTH minus the margin. The
> margin is either 10 characters or the difference between the column
> - and the length of the line, whatever is smaller. */
> + and the length of the line, whatever is smaller. The length of
> + LINE is given by LINE_WIDTH. */
> static const char *
> -adjust_line (const char *line, int max_width, int *column_p)
> +adjust_line (const char *line, int line_width,
> + int max_width, int *column_p)
> {
> int right_margin = 10;
> - int line_width = strlen (line);
> int column = *column_p;
>
> right_margin = MIN (line_width - column, right_margin);
> @@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context,
> const diagnostic_info *diagnostic)
> {
> const char *line;
> + int line_width;
> char *buffer;
> expanded_location s;
> int max_width;
> @@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context,
>
> context->last_location = diagnostic->location;
> s = expand_location_to_spelling_point (diagnostic->location);
> - line = location_get_source_line (s);
> + line = location_get_source_line (s, &line_width);
> if (line == NULL)
> return;
>
> max_width = context->caret_max_width;
> - line = adjust_line (line, max_width, &(s.column));
> + line = adjust_line (line, line_width, max_width, &(s.column));
>
> pp_newline (context->printer);
> saved_prefix = pp_get_prefix (context->printer);
> pp_set_prefix (context->printer, NULL);
> pp_space (context->printer);
> - while (max_width> 0 && *line != '\0')
> + while (max_width> 0 && line_width> 0)
> {
> char c = *line == '\t' ? ' ' : *line;
> + if (c == '\0')
> + c = ' ';
> pp_character (context->printer, c);
> max_width--;
> + line_width--;
> line++;
> }
> pp_newline (context->printer);
> diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
> index cb38d37..3f30e06 100644
> --- a/gcc/diagnostic.h
> +++ b/gcc/diagnostic.h
> @@ -291,6 +291,7 @@ void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
> void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
> void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
>
> +void diagnostic_file_cache_fini (void);
>
> /* Pure text formatting support functions. */
> extern char *file_name_as_prefix (diagnostic_context *, const char *);
> diff --git a/gcc/input.c b/gcc/input.c
> index a141a92..1a5d014 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -22,6 +22,86 @@ along with GCC; see the file COPYING3. If not see
> #include "coretypes.h"
> #include "intl.h"
> #include "input.h"
> +#include "vec.h"
> +
> +/* This is a cache used by get_next_line to store the content of a
> + file to be searched for file lines. */
> +struct fcache
> +{
> + /* These are information used to store a line boundary. */
> + struct line_info
> + {
> + /* The line number. It starts from 1. */
> + size_t line_num;
> +
> + /* The position (byte count) of the beginning of the line,
> + relative to the file data pointer. This starts at zero. */
> + size_t start_pos;
> +
> + /* The position (byte count) of the last byte of the line. This
> + normally points to the '\n' character, or to one byte after the
> + last byte of the file, if the file doesn't contain a '\n'
> + character. */
> + size_t end_pos;
> +
> + line_info (size_t l, size_t s, size_t e)
> + : line_num (l), start_pos (s), end_pos (e)
> + {}
> +
> + line_info ()
> + :line_num (0), start_pos (0), end_pos (0)
> + {}
> + };
> +
> + /* The number of time this file has been accessed. This is used
> + to designate which file cache to evict from the cache
> + array. */
> + unsigned use_count;
> +
> + const char *file_path;
> +
> + FILE *fp;
> +
> + /* This points to the content of the file that we've read so
> + far. */
> + char *data;
> +
> + /* The size of the DATA array above.*/
> + size_t size;
> +
> + /* The number of bytes read from the underlying file so far. This
> + must be less (or equal) than SIZE above. */
> + size_t nb_read;
> +
> + /* The index of the beginning of the current line. */
> + size_t line_start_idx;
> +
> + /* The number of the previous line read. This starts at 1. Zero
> + means we've read no line so far. */
> + size_t line_num;
> +
> + /* This is the total number of lines of the current file. At the
> + moment, we try to get this information from the line map
> + subsystem. Note that this is just a hint. When using the C++
> + front-end, this hint is correct because the input file is then
> + completely tokenized before parsing starts; so the line map knows
> + the number of lines before compilation really starts. For e.g,
> + the C front-end, it can happen that we start emitting diagnostics
> + before the line map has seen the end of the file. */
> + size_t total_lines;
> +
> + /* This is a record of the beginning and end of the lines we've seen
> + while reading the file. This is useful to avoid walking the data
> + from the beginning when we are asked to read a line that is
> + before LINE_START_IDX above. Note that the maximum size of this
> + record is fcache_line_record_size, so that the memory consumption
> + doesn't explode. We thus scale total_lines down to
> + fcache_line_record_size. */
> + vec<line_info, va_heap> line_record;
> +
> + fcache ();
> + ~fcache ();
> +};
>
> /* Current position in real source file. */
>
> @@ -29,6 +109,11 @@ location_t input_location;
>
> struct line_maps *line_table;
>
> +static fcache *fcache_tab;
> +static const size_t fcache_tab_size = 16;
> +static const size_t fcache_buffer_size = 4 * 1024;
> +static const size_t fcache_line_record_size = 100;
> +
> /* Expand the source location LOC into a human readable location. If
> LOC resolves to a builtin location, the file name of the readable
> location is set to the string "<built-in>". If EXPANSION_POINT_P is
> @@ -87,56 +172,542 @@ expand_location_1 (source_location loc,
> return xloc;
> }
>
> -/* Reads one line from file into a static buffer. */
> -static const char *
> -read_line (FILE *file)
> +/* Initialize the set of cache used for files accessed by caret
> + diagnostic. */
> +
> +static void
> +diagnostic_file_cache_init (void)
> {
> - static char *string;
> - static size_t string_len;
> - size_t pos = 0;
> - char *ptr;
> + if (fcache_tab == NULL)
> + fcache_tab = new fcache[fcache_tab_size];
> +}
>
> - if (!string_len)
> +/* Free the ressources used by the set of cache used for files accessed
> + by caret diagnostic. */
> +
> +void
> +diagnostic_file_cache_fini (void)
> +{
> + if (fcache_tab)
> {
> - string_len = 200;
> - string = XNEWVEC (char, string_len);
> + delete [] (fcache_tab);
> + fcache_tab = NULL;
> }
> +}
>
> - while ((ptr = fgets (string + pos, string_len - pos, file)))
> +/* Return the total lines number that have been read so far by the
> + line map (in the preprocessor) so far. For languages like C++ that
> + entirely preprocess the input file before starting to parse, this
> + equals the actual number of lines of the file. */
> +
> +static size_t
> +total_lines_num (const char *file_path)
> +{
> + size_t r = 0;
> + source_location l = 0;
> + if (linemap_get_file_highest_location (line_table, file_path, &l))
> {
> - size_t len = strlen (string + pos);
> + gcc_assert (l>= RESERVED_LOCATION_COUNT);
> + expanded_location xloc = expand_location (l);
> + r = xloc.line;
> + }
> + return r;
> +}
> +
> +/* Lookup the cache used for the content of a given file accessed by
> + caret diagnostic. Return the found cached file, or NULL if no
> + cached file was found. */
> +
> +static fcache*
> +lookup_file_in_cache_tab (const char *file_path)
> +{
> + if (file_path == NULL)
> + return NULL;
>
> - if (string[pos + len - 1] == '\n')
> + diagnostic_file_cache_init ();
> +
> + /* This will contain the found cached file. */
> + fcache *r = NULL;
> + for (unsigned i = 0; i < fcache_tab_size; ++i)
> + {
> + fcache *c = &fcache_tab[i];
> + if (c->file_path && !strcmp (c->file_path, file_path))
> {
> - string[pos + len - 1] = 0;
> - return string;
> + ++c->use_count;
> + r = c;
> }
> - pos += len;
> - string = XRESIZEVEC (char, string, string_len * 2);
> - string_len *= 2;
> }
> -
> - return pos ? string : NULL;
> +
> + if (r)
> + ++r->use_count;
> +
> + return r;
> +}
> +
> +/* Return the file cache that has been less used, recently, or the
> + first empty one. If HIGHEST_USE_COUNT is non-null,
> + *HIGHEST_USE_COUNT is set to the highest use count of the entries
> + in the cache table. */
> +
> +static fcache*
> +evicted_cache_tab_entry (unsigned *highest_use_count)
> +{
> + diagnostic_file_cache_init ();
> +
> + fcache *to_evict = &fcache_tab[0];
> + unsigned huc = to_evict->use_count;
> + for (unsigned i = 1; i < fcache_tab_size; ++i)
> + {
> + fcache *c = &fcache_tab[i];
> + bool c_is_empty = (c->file_path == NULL);
> +
> + if (c->use_count < to_evict->use_count
> + || (to_evict->file_path && c_is_empty))
> + /* We evict C because it's either an entry with a lower use
> + count or one that is empty. */
> + to_evict = c;
> +
> + if (huc < c->use_count)
> + huc = c->use_count;
> +
> + if (c_is_empty)
> + /* We've reached the end of the cache; subsequent elements are
> + all empty. */
> + break;
> + }
> +
> + if (highest_use_count)
> + *highest_use_count = huc;
> +
> + return to_evict;
> +}
> +
> +/* Create the cache used for the content of a given file to be
> + accessed by caret diagnostic. This cache is added to an array of
> + cache and can be retrieved by lookup_file_in_cache_tab. This
> + function returns the created cache. Note that only the last
> + fcache_tab_size files are cached. */
> +
> +static fcache*
> +add_file_to_cache_tab (const char *file_path)
> +{
> +
> + FILE *fp = fopen (file_path, "r");
> + if (ferror (fp))
> + {
> + fclose (fp);
> + return NULL;
> + }
> +
> + unsigned highest_use_count = 0;
> + fcache *r = evicted_cache_tab_entry (&highest_use_count);
> + r->file_path = file_path;
> + if (r->fp)
> + fclose (r->fp);
> + r->fp = fp;
> + r->nb_read = 0;
> + r->line_start_idx = 0;
> + r->line_num = 0;
> + r->line_record.truncate (0);
> + /* Ensure that this cache entry doesn't get evicted next time
> + add_file_to_cache_tab is called. */
> + r->use_count = ++highest_use_count;
> + r->total_lines = total_lines_num (file_path);
> +
> + return r;
> +}
> +
> +/* Lookup the cache used for the content of a given file accessed by
> + caret diagnostic. If no cached file was found, create a new cache
> + for this file, add it to the array of cached file and return
> + it. */
> +
> +static fcache*
> +lookup_or_add_file_to_cache_tab (const char *file_path)
> +{
> + fcache *r = lookup_file_in_cache_tab (file_path);
> + if (r == NULL)
> + r = add_file_to_cache_tab (file_path);
> + return r;
> +}
> +
> +/* Default constructor for a cache of file used by caret
> + diagnostic. */
> +
> +fcache::fcache ()
> +: use_count (0), file_path (NULL), fp (NULL), data (0),
> + size (0), nb_read (0), line_start_idx (0), line_num (0),
> + total_lines (0)
> +{
> + line_record.create (0);
> +}
> +
> +/* Destructor for a cache of file used by caret diagnostic. */
> +
> +fcache::~fcache ()
> +{
> + if (fp)
> + {
> + fclose (fp);
> + fp = NULL;
> + }
> + if (data)
> + {
> + XDELETEVEC (data);
> + data = 0;
> + }
> + line_record.release ();
> +}
> +
> +/* Returns TRUE iff the cache would need to be filled with data coming
> + from the file. That is, either the cache is empty or full or the
> + current line is empty. Note that if the cache is full, it would
> + need to be extended and filled again. */
> +
> +static bool
> +needs_read (fcache *c)
> +{
> + return (c->nb_read == 0
> + || c->nb_read == c->size
> + || (c->line_start_idx>= c->nb_read - 1));
> +}
> +
> +/* Return TRUE iff the cache is full and thus needs to be
> + extended. */
> +
> +static bool
> +needs_grow (fcache *c)
> +{
> + return c->nb_read == c->size;
> +}
> +
> +/* Grow the cache if it needs to be extended. */
> +
> +static void
> +maybe_grow (fcache *c)
> +{
> + if (!needs_grow (c))
> + return;
> +
> + size_t size = c->size == 0 ? fcache_buffer_size : c->size * 2;
> + c->data = XRESIZEVEC (char, c->data, size + 1);
> + c->size = size;
> +}
> +
> +/* Read more data into the cache. Extends the cache if need be.
> + Returns TRUE iff new data could be read. */
> +
> +static bool
> +read_data (fcache *c)
> +{
> + if (feof (c->fp) || ferror (c->fp))
> + return false;
> +
> + maybe_grow (c);
> +
> + char * from = c->data + c->nb_read;
> + size_t to_read = c->size - c->nb_read;
> + size_t nb_read = fread (from, 1, to_read, c->fp);
> +
> + if (ferror (c->fp))
> + return false;
> +
> + c->nb_read += nb_read;
> + return !!nb_read;
> +}
> +
> +/* Read new data iff the cache needs to be filled with more data
> + coming from the file FP. Return TRUE iff the cache was filled with
> + mode data. */
> +
> +static bool
> +maybe_read_data (fcache *c)
> +{
> + if (!needs_read (c))
> + return false;
> + return read_data (c);
> +}
> +
> +/* Read a new line from file FP, using C as a cache for the data
> + coming from the file. Upon successful completion, *LINE is set to
> + the beginning of the line found. Space for that line has been
> + allocated in the cache thus *LINE has the same life time as C.
> + *LINE_LEN is set to the length of the line. Note that the line
> + does not contain any terminal delimiter. This function returns
> + true if some data was read or process from the cache, false
> + otherwise. Note that subsequent calls to get_next_line return the
> + next lines of the file and might overwrite the content of
> + *LINE. */
> +
> +static bool
> +get_next_line (fcache *c, char **line, ssize_t *line_len)
> +{
> + /* Fill the cache with data to process. */
> + maybe_read_data (c);
> +
> + size_t remaining_size = c->nb_read - c->line_start_idx;
> + if (remaining_size == 0)
> + /* There is no more data to process. */
> + return false;
> +
> + char *line_start = c->data + c->line_start_idx;
> +
> + char *next_line_start = NULL;
> + size_t len = 0;
> + char *line_end = (char *) memchr (line_start, '\n', remaining_size);
> + if (line_end == NULL)
> + {
> + /* We haven't found the end-of-line delimiter in the cache.
> + Fill the cache with more data from the file and look for the
> + '\n'. */
> + while (maybe_read_data (c))
> + {
> + line_start = c->data + c->line_start_idx;
> + remaining_size = c->nb_read - c->line_start_idx;
> + line_end = (char *) memchr (line_start, '\n', remaining_size);
> + if (line_end != NULL)
> + {
> + next_line_start = line_end + 1;
> + break;
> + }
> + }
> + if (line_end == NULL)
> + /* We've loadded all the file into the cache and still no
> + '\n'. Let's say the line ends up at one byte passed the
> + end of the file. This is to stay consistent with the case
> + of when the line ends up with a '\n' and line_end points to
> + that terminal '\n'. That consistency is useful below in
> + the len calculation. */
> + line_end = c->data + c->nb_read ;
> + }
> + else
> + next_line_start = line_end + 1;
> +
> + if (ferror (c->fp))
> + return -1;
> +
> + /* At this point, we've found the end of the of line. It either
> + points to the '\n' or to one byte after the last byte of the
> + file. */
> + gcc_assert (line_end != NULL);
> +
> + len = line_end - line_start;
> +
> + if (c->line_start_idx < c->nb_read)
> + *line = line_start;
> +
> + ++c->line_num;
> +
> + /* Before we update our line record, make sure the hint about the
> + total number of lines of the file is correct. If it's not, then
> + we give up recording line boundaries from now on. */
> + bool update_line_record = true;
> + if (c->line_num> c->total_lines)
> + update_line_record = false;
> +
> + /* Now update our line record so that re-reading lines from the
> + before c->line_start_idx is faster. */
> + if (update_line_record
> + && c->line_record.length () < fcache_line_record_size)
> + {
> + /* If the file lines fits in the line record, we just record all
> + its lines ...*/
> + if (c->total_lines <= fcache_line_record_size
> + && c->line_num> c->line_record.length ())
> + c->line_record.safe_push (fcache::line_info (c->line_num,
> + c->line_start_idx,
> + line_end - c->data));
> + else if (c->total_lines> fcache_line_record_size)
> + {
> + /* ... otherwise, we just scale total_lines down to
> + (fcache_line_record_size lines. */
> + size_t n = (c->line_num * fcache_line_record_size) / c->total_lines;
> + if (c->line_record.length () == 0
> + || n>= c->line_record.length ())
> + c->line_record.safe_push (fcache::line_info (c->line_num,
> + c->line_start_idx,
> + line_end - c->data));
> + }
> + }
> +
> + /* Update c->line_start_idx so that it points to the next line to be
> + read. */
> + if (next_line_start)
> + c->line_start_idx = next_line_start - c->data;
> + else
> + /* We didn't find any terminal '\n'. Let's consider that the end
> + of line is the end of the data in the cache. The next
> + invocation of get_next_line will either read more data from the
> + underlying file or return false early because we've reached the
> + end of the file. */
> + c->line_start_idx = c->nb_read;
> +
> + *line_len = len;
> +
> + return true;
> +}
> +
> +/* Reads the next line from FILE into *LINE. If *LINE is too small
> + (or NULL) it is allocated (or extended) to have enough space to
> + containe the line. *LINE_LENGTH must contain the size of the
> + initial*LINE buffer. It's then updated by this function to the
> + actual length of the returned line. Note that the returned line
> + can contain several zero bytes. Also note that the returned string
> + is allocated in static storage that is going to be re-used by
> + subsequent invocations of read_line. */
> +
> +static bool
> +read_next_line (fcache *cache, char ** line, ssize_t *line_len)
> +{
> + char *l = NULL;
> + ssize_t len = 0;
> +
> + if (!get_next_line (cache, &l, &len))
> + return false;
> +
> + if (*line == NULL)
> + *line = XNEWVEC (char, len);
> + else
> + if (*line_len < len)
> + *line = XRESIZEVEC (char, *line, len);
> +
> + memcpy (*line, l, len);
> + *line_len = len;
> +
> + return true;
> +}
> +
> +/* Consume the next bytes coming from the cache (or from its
> + underlying file if there are remaining unread bytes in the file)
> + until we reach the next end-of-line (or end-of-file). There is no
> + copying from the cache involved. Return TRUE upon successful
> + completion. */
> +
> +static bool
> +goto_next_line (fcache *cache)
> +{
> + char *l;
> + ssize_t len;
> +
> + return get_next_line (cache, &l, &len);
> +}
> +
> +/* Read an arbitrary line number LINE_NUM from the file cached in C.
> + The line is copied into *LINE. *LINE_LEN must have been set to the
> + length of *LINE. If *LINE is too small (or NULL) it's extended (or
> + allocated) and *LINE_LEN is adjusted accordingly. *LINE ends up
> + with a terminal zero byte and can contain additional zero bytes.
> + This function returns bool if a line was read. */
> +
> +static bool
> +read_line_num (fcache *c, size_t line_num,
> + char ** line, ssize_t *line_len)
> +{
> + gcc_assert (line_num> 0);
> +
> + if (line_num <= c->line_num)
> + {
> + /* We've been asked to read lines that are before c->line_num.
> + So lets use our line record (if it's not empty) to try to
> + avoid re-reading the file from the beginning again. */
> +
> + if (c->line_record.is_empty ())
> + {
> + c->line_start_idx = 0;
> + c->line_num = 0;
> + }
> + else
> + {
> + fcache::line_info *i = NULL;
> + if (c->total_lines <= fcache_line_record_size)
> + {
> + /* In languages where the input file is not totally
> + preprocessed up front, the c->total_lines hint
> + can be smaller than the number of lines of the
> + file. In that case, only the first
> + c->total_lines have been recorded.
> +
> + Otherwise, the first c->total_lines we've read have
> + their start/end recorded here. */
> + i = (line_num <= c->total_lines)
> + ? &c->line_record[line_num - 1]
> + : &c->line_record[c->total_lines - 1];
> + gcc_assert (i->line_num <= line_num);
> + }
> + else
> + {
> + /* So the file had more lines than our line record
> + size. Thus the number of lines we've recorded has
> + been scaled down to fcache_line_reacord_size. Let's
> + pick the start/end of the recorded line that is
> + closest to line_num. */
> + size_t n = (line_num <= c->total_lines)
> + ? line_num * fcache_line_record_size / c->total_lines
> + : c ->line_record.length () - 1;
> + if (n < c->line_record.length ())
> + {
> + i = &c->line_record[n];
> + gcc_assert (i->line_num <= line_num);
> + }
> + }
> +
> + if (i && i->line_num == line_num)
> + {
> + /* We have the start/end of the line. Let's just copy
> + it again and we are done. */
> + ssize_t len = i->end_pos - i->start_pos + 1;
> + if (*line_len < len)
> + *line = XRESIZEVEC (char, *line, len);
> + memmove (*line, c->data + i->start_pos, len);
> + (*line)[len - 1] = '\0';
> + *line_len = --len;
> + return true;
> + }
> +
> + if (i)
> + {
> + c->line_start_idx = i->start_pos;
> + c->line_num = i->line_num - 1;
> + }
> + else
> + {
> + c->line_start_idx = 0;
> + c->line_num = 0;
> + }
> + }
> + }
> +
> + /* Let's walk from line c->line_num up to line_num - 1, without
> + copying any line. */
> + while (c->line_num < line_num - 1)
> + if (!goto_next_line (c))
> + return false;
> +
> + /* The line we want is the next one. Let's read and copy it back to
> + the caller. */
> + return read_next_line (c, line, line_len);
> }
>
> /* Return the physical source line that corresponds to xloc in a
> buffer that is statically allocated. The newline is replaced by
> - the null character. */
> + the null character. Note that the line can contain several null
> + characters, so LINE_LEN, if non-null, points to the actual length
> + of the line. */
>
> const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> + int *line_len)
> {
> - const char *buffer;
> - int lines = 1;
> - FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
> - if (!stream)
> - return NULL;
> + static char *buffer;
> + static ssize_t len;
> +
> + fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
> + bool read = read_line_num (c, xloc.line, &buffer, &len);
>
> - while ((buffer = read_line (stream)) && lines < xloc.line)
> - lines++;
> + if (read && line_len)
> + *line_len = len;
>
> - fclose (stream);
> - return buffer;
> + return read ? buffer : NULL;
> }
>
> /* Expand the source location LOC into a human readable location. If
> diff --git a/gcc/input.h b/gcc/input.h
> index 8fdc7b2..c82023f 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
> < RESERVED_LOCATION_COUNT) ? 1 : -1];
>
> extern expanded_location expand_location (source_location);
> -extern const char *location_get_source_line (expanded_location xloc);
> +extern const char *location_get_source_line (expanded_location xloc,
> + int *line_size);
> extern expanded_location expand_location_to_spelling_point (source_location);
> extern source_location expansion_point_location_if_in_system_header (source_location);
>
> @@ -65,4 +66,6 @@ extern location_t input_location;
>
> void dump_line_table_statistics (void);
>
> +void diagnostics_file_cache_fini (void);
> +
> #endif
> diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
> GIT binary patch
> literal 240
> zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
> UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi
>
> literal 0
> HcmV?d00001
>
> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index a0d6da1..3504fd6 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -756,6 +756,14 @@ struct linemap_stats
> long duplicated_macro_maps_locations_size;
> };
>
> +/* Return the highest location emitted for a given file for which
> + there is a line map in SET. FILE_NAME is the file name to
> + consider. If the function returns TRUE, *LOC is set to the highest
> + location emitted for that file. */
> +bool linemap_get_file_highest_location (struct line_maps * set,
> + const char *file_name,
> + source_location*LOC);
> +
> /* Compute and return statistics about the memory consumption of some
> parts of the line table SET. */
> void linemap_get_statistics (struct line_maps *, struct linemap_stats *);
> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> index 2ad7ad2..98db486 100644
> --- a/libcpp/line-map.c
> +++ b/libcpp/line-map.c
> @@ -1502,6 +1502,46 @@ linemap_dump_location (struct line_maps *set,
> path, from, l, c, s, (void*)map, e, loc, location);
> }
>
> +/* Return the highest location emitted for a given file for which
> + there is a line map in SET. FILE_NAME is the file name to
> + consider. If the function returns TRUE, *LOC is set to the highest
> + location emitted for that file. */
> +
> +bool
> +linemap_get_file_highest_location (struct line_maps *set,
> + const char *file_name,
> + source_location *loc)
> +{
> + /* If the set is empty or no ordinary map has been created then
> + there is no file to look for ... */
> + if (set == NULL || set->info_ordinary.used == 0)
> + return false;
> +
> + /* Now look for the last ordinary map created for FILE_NAME. */
> + int i;
> + for (i = set->info_ordinary.used - 1; i>= 0; --i)
> + {
> + const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
> + if (fname && !strcmp (fname, file_name))
> + break;
> + }
> +
> + if (i < 0)
> + return false;
> +
> + /* The highest location for a given map is either the starting
> + location of the next map minus one, or -- if the map is the
> + latest one -- the highest location of the set. */
> + source_location result;
> + if (i == (int) set->info_ordinary.used - 1)
> + result = set->highest_location;
> + else
> + result = set->info_ordinary.maps[i + 1].start_location - 1;
> +
> + *loc = result;
> + return true;
> +}
> +
> /* Compute and return statistics about the memory consumption of some
> parts of the line table SET. */
>
> --
> Dodji 		 	   		  

[-- Attachment #2: changelog-readline.txt --]
[-- Type: text/plain, Size: 250 bytes --]

2014-01-21  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR preprocessor/58580
	Fix possible OOM error with embedded NULs or with incomplete last
	lines.
	* gcov.c (read_line): Improve error handling.
	* input.c (read_line): Improve error handling.

[-- Attachment #3: patch-readline.diff --]
[-- Type: application/octet-stream, Size: 2129 bytes --]

Index: gcc/gcov.c
===================================================================
--- gcc/gcov.c	(revision 206281)
+++ gcc/gcov.c	(working copy)
@@ -2382,12 +2382,25 @@ read_line (FILE *file)
     {
       size_t len = strlen (string + pos);
 
+      /* If fgets returns non-zero but the string has zero length,
+	 we have a line that starts with a NUL character.
+	 Return NULL in this case.  */
+      if (len == 0)
+	return NULL;
       if (string[pos + len - 1] == '\n')
 	{
 	  string[pos + len - 1] = 0;
 	  return string;
 	}
       pos += len;
+      /* If fgets returns a short line without NL at the end, and the
+	 file pointer is at EOF, we might have an incomplete last line,
+	 or we have a NUL character on the last line.
+	 Return the string as usual in this case.
+	 However if the file is not at EOF we have a line with an
+	 embedded NUL character.  Return NULL in this case.  */
+      if (pos < string_len - 1)
+	return feof (file) ? string : NULL;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
     }
Index: gcc/input.c
===================================================================
--- gcc/input.c	(revision 206281)
+++ gcc/input.c	(working copy)
@@ -106,12 +106,25 @@ read_line (FILE *file)
     {
       size_t len = strlen (string + pos);
 
+      /* If fgets returns non-zero but the string has zero length,
+	 we have a line that starts with a NUL character.
+	 Return NULL in this case.  */
+      if (len == 0)
+	return NULL;
       if (string[pos + len - 1] == '\n')
 	{
 	  string[pos + len - 1] = 0;
 	  return string;
 	}
       pos += len;
+      /* If fgets returns a short line without NL at the end, and the
+	 file pointer is at EOF, we might have an incomplete last line,
+	 or we have a NUL character on the last line.
+	 Return the string as usual in this case.
+	 However if the file is not at EOF we have a line with an
+	 embedded NUL character.  Return NULL in this case.  */
+      if (pos < string_len - 1)
+	return feof (file) ? string : NULL;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
     }

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-21 12:28                   ` Bernd Edlinger
@ 2014-01-22  8:16                     ` Dodji Seketeli
  2014-01-23 17:12                       ` Jakub Jelinek
  2014-01-24 15:05                       ` Markus Trippelsdorf
  0 siblings, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-22  8:16 UTC (permalink / raw)
  To: Bernd Edlinger
  Cc: Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

> Hi,

Hello,

> since there was no progress in the last 2 months on that matter,

Sorry, this is my bad.  I got sidetracked by something else and forgot
that I had the patch working et al, and all its bits that need approval
got approved.  It still can go in right now.  It improves performance
and fixes the issue the way it was discussed.

Here it is, regtested on x86_64-linux-gnu against trunk.

If nobody objects in the next 24 hours, I'll commit it.

libcpp/ChangeLog:

	* include/line-map.h (linemap_get_file_highest_location): Declare
	new function.
	* line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter.
	(void diagnostics_file_cache_fini): Declare new function.
	* input.c (struct fcache): New type.
	(fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
	New static constants.
	(diagnostic_file_cache_init, total_lines_num)
	(lookup_file_in_cache_tab, evicted_cache_tab_entry)
	(add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
	(needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
	(get_next_line, read_next_line, goto_next_line, read_line_num):
	New static function definitions.
	(diagnostic_file_cache_fini): New function.
	(location_get_source_line): Take an additional output line_len
	parameter.  Re-write using lookup_or_add_file_to_cache_tab and
	read_line_num.
	* diagnostic.c (diagnostic_finish): Call
	diagnostic_file_cache_fini.
	(adjust_line): Take an additional input parameter for the length
	of the line, rather than calculating it with strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.
	* Makefile.in: Add vec.o to OBJS-libcommon

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4

Signed-off-by: Dodji Seketeli <dodji@seketeli.org>
---
 gcc/Makefile.in                                    |   3 +-
 gcc/diagnostic.c                                   |  19 +-
 gcc/diagnostic.h                                   |   1 +
 gcc/input.c                                        | 633 ++++++++++++++++++++-
 gcc/input.h                                        |   5 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 libcpp/include/line-map.h                          |   8 +
 libcpp/line-map.c                                  |  40 ++
 8 files changed, 670 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 4d683a0..06c617a 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1472,7 +1472,8 @@ OBJS = \
 
 # Objects in libcommon.a, potentially used by all host binaries and with
 # no target dependencies.
-OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o
+OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \
+	vec.o  input.o version.o
 
 # Objects in libcommon-target.a, used by drivers and by the core
 # compiler and containing target-dependent code.
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..6c83f03 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context)
 		     progname);
       pp_newline_and_flush (context->printer);
     }
+
+  diagnostic_file_cache_fini ();
 }
 
 /* Initialize DIAGNOSTIC, where the message MSG has already been
@@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, &line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 49cb8c0..2bf1361 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -291,6 +291,7 @@ void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
 void default_diagnostic_finalizer (diagnostic_context *, diagnostic_info *);
 void diagnostic_set_caret_max_width (diagnostic_context *context, int value);
 
+void diagnostic_file_cache_fini (void);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..1a5d014 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -22,6 +22,86 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "intl.h"
 #include "input.h"
+#include "vec.h"
+
+/* This is a cache used by get_next_line to store the content of a
+   file to be searched for file lines.  */
+struct fcache
+{
+  /* These are information used to store a line boundary.  */
+  struct line_info
+  {
+    /* The line number.  It starts from 1.  */
+    size_t line_num;
+
+    /* The position (byte count) of the beginning of the line,
+       relative to the file data pointer.  This starts at zero.  */
+    size_t start_pos;
+
+    /* The position (byte count) of the last byte of the line.  This
+       normally points to the '\n' character, or to one byte after the
+       last byte of the file, if the file doesn't contain a '\n'
+       character.  */
+    size_t end_pos;
+
+    line_info (size_t l, size_t s, size_t e)
+      : line_num (l), start_pos (s), end_pos (e)
+    {}
+
+    line_info ()
+      :line_num (0), start_pos (0), end_pos (0)
+    {}
+  };
+
+  /* The number of time this file has been accessed.  This is used
+     to designate which file cache to evict from the cache
+     array.  */
+  unsigned use_count;
+
+  const char *file_path;
+
+  FILE *fp;
+
+  /* This points to the content of the file that we've read so
+     far.  */
+  char *data;
+
+  /*  The size of the DATA array above.*/
+  size_t size;
+
+  /* The number of bytes read from the underlying file so far.  This
+     must be less (or equal) than SIZE above.  */
+  size_t nb_read;
+
+  /* The index of the beginning of the current line.  */
+  size_t line_start_idx;
+
+  /* The number of the previous line read.  This starts at 1.  Zero
+     means we've read no line so far.  */
+  size_t line_num;
+
+  /* This is the total number of lines of the current file.  At the
+     moment, we try to get this information from the line map
+     subsystem.  Note that this is just a hint.  When using the C++
+     front-end, this hint is correct because the input file is then
+     completely tokenized before parsing starts; so the line map knows
+     the number of lines before compilation really starts.  For e.g,
+     the C front-end, it can happen that we start emitting diagnostics
+     before the line map has seen the end of the file.  */
+  size_t total_lines;
+
+  /* This is a record of the beginning and end of the lines we've seen
+     while reading the file.  This is useful to avoid walking the data
+     from the beginning when we are asked to read a line that is
+     before LINE_START_IDX above.  Note that the maximum size of this
+     record is fcache_line_record_size, so that the memory consumption
+     doesn't explode.  We thus scale total_lines down to
+     fcache_line_record_size.  */
+  vec<line_info, va_heap> line_record;
+
+  fcache ();
+  ~fcache ();
+};
 
 /* Current position in real source file.  */
 
@@ -29,6 +109,11 @@ location_t input_location;
 
 struct line_maps *line_table;
 
+static fcache *fcache_tab;
+static const size_t fcache_tab_size = 16;
+static const size_t fcache_buffer_size = 4 * 1024;
+static const size_t fcache_line_record_size = 100;
+
 /* Expand the source location LOC into a human readable location.  If
    LOC resolves to a builtin location, the file name of the readable
    location is set to the string "<built-in>". If EXPANSION_POINT_P is
@@ -87,56 +172,542 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
-static const char *
-read_line (FILE *file)
+/* Initialize the set of cache used for files accessed by caret
+   diagnostic.  */
+
+static void
+diagnostic_file_cache_init (void)
 {
-  static char *string;
-  static size_t string_len;
-  size_t pos = 0;
-  char *ptr;
+  if (fcache_tab == NULL)
+    fcache_tab = new fcache[fcache_tab_size];
+}
 
-  if (!string_len)
+/* Free the ressources used by the set of cache used for files accessed
+   by caret diagnostic.  */
+
+void
+diagnostic_file_cache_fini (void)
+{
+  if (fcache_tab)
     {
-      string_len = 200;
-      string = XNEWVEC (char, string_len);
+      delete [] (fcache_tab);
+      fcache_tab = NULL;
     }
+}
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+/* Return the total lines number that have been read so far by the
+   line map (in the preprocessor) so far.  For languages like C++ that
+   entirely preprocess the input file before starting to parse, this
+   equals the actual number of lines of the file.  */
+
+static size_t
+total_lines_num (const char *file_path)
+{
+  size_t r = 0;
+  source_location l = 0;
+  if (linemap_get_file_highest_location (line_table, file_path, &l))
     {
-      size_t len = strlen (string + pos);
+      gcc_assert (l >= RESERVED_LOCATION_COUNT);
+      expanded_location xloc = expand_location (l);
+      r = xloc.line;
+    }
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  Return the found cached file, or NULL if no
+   cached file was found.  */
+
+static fcache*
+lookup_file_in_cache_tab (const char *file_path)
+{
+  if (file_path == NULL)
+    return NULL;
 
-      if (string[pos + len - 1] == '\n')
+  diagnostic_file_cache_init ();
+
+  /* This will contain the found cached file.  */
+  fcache *r = NULL;
+  for (unsigned i = 0; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      if (c->file_path && !strcmp (c->file_path, file_path))
 	{
-	  string[pos + len - 1] = 0;
-	  return string;
+	  ++c->use_count;
+	  r = c;
 	}
-      pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
     }
-      
-  return pos ? string : NULL;
+
+  if (r)
+    ++r->use_count;
+
+  return r;
+}
+
+/* Return the file cache that has been less used, recently, or the
+   first empty one.  If HIGHEST_USE_COUNT is non-null,
+   *HIGHEST_USE_COUNT is set to the highest use count of the entries
+   in the cache table.  */
+
+static fcache*
+evicted_cache_tab_entry (unsigned *highest_use_count)
+{
+  diagnostic_file_cache_init ();
+
+  fcache *to_evict = &fcache_tab[0];
+  unsigned huc = to_evict->use_count;
+  for (unsigned i = 1; i < fcache_tab_size; ++i)
+    {
+      fcache *c = &fcache_tab[i];
+      bool c_is_empty = (c->file_path == NULL);
+
+      if (c->use_count < to_evict->use_count
+	  || (to_evict->file_path && c_is_empty))
+	/* We evict C because it's either an entry with a lower use
+	   count or one that is empty.  */
+	to_evict = c;
+
+      if (huc < c->use_count)
+	huc = c->use_count;
+
+      if (c_is_empty)
+	/* We've reached the end of the cache; subsequent elements are
+	   all empty.  */
+	break;
+    }
+
+  if (highest_use_count)
+    *highest_use_count = huc;
+
+  return to_evict;
+}
+
+/* Create the cache used for the content of a given file to be
+   accessed by caret diagnostic.  This cache is added to an array of
+   cache and can be retrieved by lookup_file_in_cache_tab.  This
+   function returns the created cache.  Note that only the last
+   fcache_tab_size files are cached.  */
+
+static fcache*
+add_file_to_cache_tab (const char *file_path)
+{
+
+  FILE *fp = fopen (file_path, "r");
+  if (ferror (fp))
+    {
+      fclose (fp);
+      return NULL;
+    }
+
+  unsigned highest_use_count = 0;
+  fcache *r = evicted_cache_tab_entry (&highest_use_count);
+  r->file_path = file_path;
+  if (r->fp)
+    fclose (r->fp);
+  r->fp = fp;
+  r->nb_read = 0;
+  r->line_start_idx = 0;
+  r->line_num = 0;
+  r->line_record.truncate (0);
+  /* Ensure that this cache entry doesn't get evicted next time
+     add_file_to_cache_tab is called.  */
+  r->use_count = ++highest_use_count;
+  r->total_lines = total_lines_num (file_path);
+
+  return r;
+}
+
+/* Lookup the cache used for the content of a given file accessed by
+   caret diagnostic.  If no cached file was found, create a new cache
+   for this file, add it to the array of cached file and return
+   it.  */
+
+static fcache*
+lookup_or_add_file_to_cache_tab (const char *file_path)
+{
+  fcache *r = lookup_file_in_cache_tab (file_path);
+  if (r == NULL)
+    r = add_file_to_cache_tab (file_path);
+  return r;
+}
+
+/* Default constructor for a cache of file used by caret
+   diagnostic.  */
+
+fcache::fcache ()
+: use_count (0), file_path (NULL), fp (NULL), data (0),
+  size (0), nb_read (0), line_start_idx (0), line_num (0),
+  total_lines (0)
+{
+  line_record.create (0);
+}
+
+/* Destructor for a cache of file used by caret diagnostic.  */
+
+fcache::~fcache ()
+{
+  if (fp)
+    {
+      fclose (fp);
+      fp = NULL;
+    }
+  if (data)
+    {
+      XDELETEVEC (data);
+      data = 0;
+    }
+  line_record.release ();
+}
+
+/* Returns TRUE iff the cache would need to be filled with data coming
+   from the file.  That is, either the cache is empty or full or the
+   current line is empty.  Note that if the cache is full, it would
+   need to be extended and filled again.  */
+
+static bool
+needs_read (fcache *c)
+{
+  return (c->nb_read == 0
+	  || c->nb_read == c->size
+	  || (c->line_start_idx >= c->nb_read - 1));
+}
+
+/*  Return TRUE iff the cache is full and thus needs to be
+    extended.  */
+
+static bool
+needs_grow (fcache *c)
+{
+  return c->nb_read == c->size;
+}
+
+/* Grow the cache if it needs to be extended.  */
+
+static void
+maybe_grow (fcache *c)
+{
+  if (!needs_grow (c))
+    return;
+
+  size_t size = c->size == 0 ? fcache_buffer_size : c->size * 2;
+  c->data = XRESIZEVEC (char, c->data, size + 1);
+  c->size = size;
+}
+
+/*  Read more data into the cache.  Extends the cache if need be.
+    Returns TRUE iff new data could be read.  */
+
+static bool
+read_data (fcache *c)
+{
+  if (feof (c->fp) || ferror (c->fp))
+    return false;
+
+  maybe_grow (c);
+
+  char * from = c->data + c->nb_read;
+  size_t to_read = c->size - c->nb_read;
+  size_t nb_read = fread (from, 1, to_read, c->fp);
+
+  if (ferror (c->fp))
+    return false;
+
+  c->nb_read += nb_read;
+  return !!nb_read;
+}
+
+/* Read new data iff the cache needs to be filled with more data
+   coming from the file FP.  Return TRUE iff the cache was filled with
+   mode data.  */
+
+static bool
+maybe_read_data (fcache *c)
+{
+  if (!needs_read (c))
+    return false;
+  return read_data (c);
+}
+
+/* Read a new line from file FP, using C as a cache for the data
+   coming from the file.  Upon successful completion, *LINE is set to
+   the beginning of the line found.  Space for that line has been
+   allocated in the cache thus *LINE has the same life time as C.
+   *LINE_LEN is set to the length of the line.  Note that the line
+   does not contain any terminal delimiter.  This function returns
+   true if some data was read or process from the cache, false
+   otherwise.  Note that subsequent calls to get_next_line return the
+   next lines of the file and might overwrite the content of
+   *LINE.  */
+
+static bool
+get_next_line (fcache *c, char **line, ssize_t *line_len)
+{
+  /* Fill the cache with data to process.  */
+  maybe_read_data (c);
+
+  size_t remaining_size = c->nb_read - c->line_start_idx;
+  if (remaining_size == 0)
+    /* There is no more data to process.  */
+    return false;
+
+  char *line_start = c->data + c->line_start_idx;
+
+  char *next_line_start = NULL;
+  size_t len = 0;
+  char *line_end = (char *) memchr (line_start, '\n', remaining_size);
+  if (line_end == NULL)
+    {
+      /* We haven't found the end-of-line delimiter in the cache.
+	 Fill the cache with more data from the file and look for the
+	 '\n'.  */
+      while (maybe_read_data (c))
+	{
+	  line_start = c->data + c->line_start_idx;
+	  remaining_size = c->nb_read - c->line_start_idx;
+	  line_end = (char *) memchr (line_start, '\n', remaining_size);
+	  if (line_end != NULL)
+	    {
+	      next_line_start = line_end + 1;
+	      break;
+	    }
+	}
+      if (line_end == NULL)
+	/* We've loadded all the file into the cache and still no
+	   '\n'.  Let's say the line ends up at one byte passed the
+	   end of the file.  This is to stay consistent with the case
+	   of when the line ends up with a '\n' and line_end points to
+	   that terminal '\n'.  That consistency is useful below in
+	   the len calculation.  */
+	line_end = c->data + c->nb_read ;
+    }
+  else
+    next_line_start = line_end + 1;
+
+  if (ferror (c->fp))
+    return -1;
+
+  /* At this point, we've found the end of the of line.  It either
+     points to the '\n' or to one byte after the last byte of the
+     file.  */
+  gcc_assert (line_end != NULL);
+
+  len = line_end - line_start;
+
+  if (c->line_start_idx < c->nb_read)
+    *line = line_start;
+
+  ++c->line_num;
+
+  /* Before we update our line record, make sure the hint about the
+     total number of lines of the file is correct.  If it's not, then
+     we give up recording line boundaries from now on.  */
+  bool update_line_record = true;
+  if (c->line_num > c->total_lines)
+    update_line_record = false;
+
+    /* Now update our line record so that re-reading lines from the
+     before c->line_start_idx is faster.  */
+  if (update_line_record
+      && c->line_record.length () < fcache_line_record_size)
+    {
+      /* If the file lines fits in the line record, we just record all
+	 its lines ...*/
+      if (c->total_lines <= fcache_line_record_size
+	  && c->line_num > c->line_record.length ())
+	c->line_record.safe_push (fcache::line_info (c->line_num,
+						 c->line_start_idx,
+						 line_end - c->data));
+      else if (c->total_lines > fcache_line_record_size)
+	{
+	  /* ... otherwise, we just scale total_lines down to
+	     (fcache_line_record_size lines.  */
+	  size_t n = (c->line_num * fcache_line_record_size) / c->total_lines;
+	  if (c->line_record.length () == 0
+	      || n >= c->line_record.length ())
+	    c->line_record.safe_push (fcache::line_info (c->line_num,
+						     c->line_start_idx,
+						     line_end - c->data));
+	}
+    }
+
+  /* Update c->line_start_idx so that it points to the next line to be
+     read.  */
+  if (next_line_start)
+    c->line_start_idx = next_line_start - c->data;
+  else
+    /* We didn't find any terminal '\n'.  Let's consider that the end
+       of line is the end of the data in the cache.  The next
+       invocation of get_next_line will either read more data from the
+       underlying file or return false early because we've reached the
+       end of the file.  */
+    c->line_start_idx = c->nb_read;
+
+  *line_len = len;
+
+  return true;
+}
+
+/* Reads the next line from FILE into *LINE.  If *LINE is too small
+   (or NULL) it is allocated (or extended) to have enough space to
+   containe the line.  *LINE_LENGTH must contain the size of the
+   initial*LINE buffer.  It's then updated by this function to the
+   actual length of the returned line.  Note that the returned line
+   can contain several zero bytes.  Also note that the returned string
+   is allocated in static storage that is going to be re-used by
+   subsequent invocations of read_line.  */
+
+static bool
+read_next_line (fcache *cache, char ** line, ssize_t *line_len)
+{
+  char *l = NULL;
+  ssize_t len = 0;
+
+  if (!get_next_line (cache, &l, &len))
+    return false;
+
+  if (*line == NULL)
+    *line = XNEWVEC (char, len);
+  else
+    if (*line_len < len)
+	*line = XRESIZEVEC (char, *line, len);
+
+  memcpy (*line, l, len);
+  *line_len = len;
+
+  return true;
+}
+
+/* Consume the next bytes coming from the cache (or from its
+   underlying file if there are remaining unread bytes in the file)
+   until we reach the next end-of-line (or end-of-file).  There is no
+   copying from the cache involved.  Return TRUE upon successful
+   completion.  */
+
+static bool
+goto_next_line (fcache *cache)
+{
+  char *l;
+  ssize_t len;
+
+  return get_next_line (cache, &l, &len);
+}
+
+/* Read an arbitrary line number LINE_NUM from the file cached in C.
+   The line is copied into *LINE.  *LINE_LEN must have been set to the
+   length of *LINE.  If *LINE is too small (or NULL) it's extended (or
+   allocated) and *LINE_LEN is adjusted accordingly.  *LINE ends up
+   with a terminal zero byte and can contain additional zero bytes.
+   This function returns bool if a line was read.  */
+
+static bool
+read_line_num (fcache *c, size_t line_num,
+	       char ** line, ssize_t *line_len)
+{
+  gcc_assert (line_num > 0);
+
+  if (line_num <= c->line_num)
+    {
+      /* We've been asked to read lines that are before c->line_num.
+	 So lets use our line record (if it's not empty) to try to
+	 avoid re-reading the file from the beginning again.  */
+
+      if (c->line_record.is_empty ())
+	{
+	  c->line_start_idx = 0;
+	  c->line_num = 0;
+	}
+      else
+	{
+	  fcache::line_info *i = NULL;
+	  if (c->total_lines <= fcache_line_record_size)
+	    {
+	      /* In languages where the input file is not totally
+		 preprocessed up front, the c->total_lines hint
+		 can be smaller than the number of lines of the
+		 file.  In that case, only the first
+		 c->total_lines have been recorded.
+
+		 Otherwise, the first c->total_lines we've read have
+		 their start/end recorded here.  */
+	      i = (line_num <= c->total_lines)
+		? &c->line_record[line_num - 1]
+		: &c->line_record[c->total_lines - 1];
+	      gcc_assert (i->line_num <= line_num);
+	    }
+	  else
+	    {
+	      /*  So the file had more lines than our line record
+		  size.  Thus the number of lines we've recorded has
+		  been scaled down to fcache_line_reacord_size.  Let's
+		  pick the start/end of the recorded line that is
+		  closest to line_num.  */
+	      size_t n = (line_num <= c->total_lines)
+		? line_num * fcache_line_record_size / c->total_lines
+		: c ->line_record.length () - 1;
+	      if (n < c->line_record.length ())
+		{
+		  i = &c->line_record[n];
+		  gcc_assert (i->line_num <= line_num);
+		}
+	    }
+
+	  if (i && i->line_num == line_num)
+	    {
+	      /* We have the start/end of the line.  Let's just copy
+		 it again and we are done.  */
+	      ssize_t len = i->end_pos - i->start_pos + 1;
+	      if (*line_len < len)
+		*line = XRESIZEVEC (char, *line, len);
+	      memmove (*line, c->data + i->start_pos, len);
+	      (*line)[len - 1] = '\0';
+	      *line_len = --len;
+	      return true;
+	    }
+
+	  if (i)
+	    {
+	      c->line_start_idx = i->start_pos;
+	      c->line_num = i->line_num - 1;
+	    }
+	  else
+	    {
+	      c->line_start_idx = 0;
+	      c->line_num = 0;
+	    }
+	}
+    }
+
+  /*  Let's walk from line c->line_num up to line_num - 1, without
+      copying any line.  */
+  while (c->line_num < line_num - 1)
+    if (!goto_next_line (c))
+      return false;
+
+  /* The line we want is the next one.  Let's read and copy it back to
+     the caller.  */
+  return read_next_line (c, line, line_len);
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN, if non-null, points to the actual length
+   of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int *line_len)
 {
-  const char *buffer;
-  int lines = 1;
-  FILE *stream = xloc.file ? fopen (xloc.file, "r") : NULL;
-  if (!stream)
-    return NULL;
+  static char *buffer;
+  static ssize_t len;
+
+  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  bool read = read_line_num (c, xloc.line, &buffer, &len);
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
-    lines++;
+  if (read && line_len)
+    *line_len = len;
 
-  fclose (stream);
-  return buffer;
+  return read ? buffer : NULL;
 }
 
 /* Expand the source location LOC into a human readable location.  If
diff --git a/gcc/input.h b/gcc/input.h
index 55bd426..4a57bb8 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int *line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
@@ -62,4 +63,6 @@ extern location_t input_location;
 
 void dump_line_table_statistics (void);
 
+void diagnostics_file_cache_fini (void);
+
 #endif
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index a0d6da1..a4d78cc 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -756,6 +756,14 @@ struct linemap_stats
   long duplicated_macro_maps_locations_size;
 };
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+bool linemap_get_file_highest_location (struct line_maps * set,
+					const char *file_name,
+					source_location *loc);
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
 void linemap_get_statistics (struct line_maps *, struct linemap_stats *);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 2ad7ad2..bfde98e 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1502,6 +1502,46 @@ linemap_dump_location (struct line_maps *set,
 	   path, from, l, c, s, (void*)map, e, loc, location);
 }
 
+/* Return the highest location emitted for a given file for which
+   there is a line map in SET.  FILE_NAME is the file name to
+   consider.  If the function returns TRUE, *LOC is set to the highest
+   location emitted for that file.  */
+
+bool
+linemap_get_file_highest_location (struct line_maps *set,
+				   const char *file_name,
+				   source_location *loc)
+{
+  /* If the set is empty or no ordinary map has been created then
+     there is no file to look for ...  */
+  if (set == NULL || set->info_ordinary.used == 0)
+    return false;
+
+  /* Now look for the last ordinary map created for FILE_NAME.  */
+  int i;
+  for (i = set->info_ordinary.used - 1; i >= 0; --i)
+    {
+      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
+      if (fname && !filename_cmp (fname, file_name))
+	break;
+    }
+
+  if (i < 0)
+    return false;
+
+  /* The highest location for a given map is either the starting
+     location of the next map minus one, or -- if the map is the
+     latest one -- the highest location of the set.  */
+  source_location result;
+  if (i == (int) set->info_ordinary.used - 1)
+    result = set->highest_location;
+  else
+    result = set->info_ordinary.maps[i + 1].start_location - 1;
+
+  *loc = result;
+  return true;
+}
+
 /* Compute and return statistics about the memory consumption of some
    parts of the line table SET.  */
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-22  8:16                     ` Dodji Seketeli
@ 2014-01-23 17:12                       ` Jakub Jelinek
  2014-01-24  2:58                         ` Bernd Edlinger
  2014-01-24  7:53                         ` Dodji Seketeli
  2014-01-24 15:05                       ` Markus Trippelsdorf
  1 sibling, 2 replies; 46+ messages in thread
From: Jakub Jelinek @ 2014-01-23 17:12 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
> +static fcache*
> +add_file_to_cache_tab (const char *file_path)
> +{
> +
> +  FILE *fp = fopen (file_path, "r");
> +  if (ferror (fp))
> +    {
> +      fclose (fp);
> +      return NULL;
> +    }

I've seen various segfaults here when playing with preprocessed sources
from PRs (obviously don't have the original source files).
When fopen fails, it just returns NULL, so I don't see why you just don't
do
  if (fp == NULL)
    return fp;

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-23 17:12                       ` Jakub Jelinek
@ 2014-01-24  2:58                         ` Bernd Edlinger
  2014-01-24  7:53                         ` Dodji Seketeli
  1 sibling, 0 replies; 46+ messages in thread
From: Bernd Edlinger @ 2014-01-24  2:58 UTC (permalink / raw)
  To: Jakub Jelinek, Dodji Seketeli
  Cc: GCC Patches, Tom Tromey, Manuel López-Ibáñez

Hi,
On Thu, 23 Jan 2014 18:12:45, Jakub Jelinek wrote:
> 
> On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
>> +static fcache*
>> +add_file_to_cache_tab (const char *file_path)
>> +{
>> +
>> + FILE *fp = fopen (file_path, "r");
>> + if (ferror (fp))
>> + {
>> + fclose (fp);
>> + return NULL;
>> + }
> 
> I've seen various segfaults here when playing with preprocessed sources
> from PRs (obviously don't have the original source files).
> When fopen fails, it just returns NULL, so I don't see why you just don't
> do
> if (fp == NULL)
> return fp;
> 
> Jakub

This would be a good idea for test cases too.
However the test system always calls the compiler with
-fno-diagnostics-show-caret so I doubt your test case
is actually testing anything when it is called from the
test environment with that option.

Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-23 17:12                       ` Jakub Jelinek
  2014-01-24  2:58                         ` Bernd Edlinger
@ 2014-01-24  7:53                         ` Dodji Seketeli
  1 sibling, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-24  7:53 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
>> +static fcache*
>> +add_file_to_cache_tab (const char *file_path)
>> +{
>> +
>> +  FILE *fp = fopen (file_path, "r");
>> +  if (ferror (fp))
>> +    {
>> +      fclose (fp);
>> +      return NULL;
>> +    }
>
> I've seen various segfaults here when playing with preprocessed sources
> from PRs (obviously don't have the original source files).
> When fopen fails, it just returns NULL, so I don't see why you just don't
> do
>   if (fp == NULL)
>     return fp;

Right, I am testing the patch below.

	* input.c (add_file_to_cache_tab): Handle the case where fopen
	returns NULL.

diff --git a/gcc/input.c b/gcc/input.c
index 290680c..547c177 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -293,11 +293,8 @@ add_file_to_cache_tab (const char *file_path)
 {
 
   FILE *fp = fopen (file_path, "r");
-  if (ferror (fp))
-    {
-      fclose (fp);
-      return NULL;
-    }
+  if (fp == NULL)
+    return NULL;
 
   unsigned highest_use_count = 0;
   fcache *r = evicted_cache_tab_entry (&highest_use_count);
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-22  8:16                     ` Dodji Seketeli
  2014-01-23 17:12                       ` Jakub Jelinek
@ 2014-01-24 15:05                       ` Markus Trippelsdorf
  2014-01-24 15:41                         ` Dodji Seketeli
  1 sibling, 1 reply; 46+ messages in thread
From: Markus Trippelsdorf @ 2014-01-24 15:05 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Bernd Edlinger, Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On 2014.01.22 at 09:16 +0100, Dodji Seketeli wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> 
> > Hi,
> 
> Hello,
> 
> > since there was no progress in the last 2 months on that matter,
> 
> Sorry, this is my bad.  I got sidetracked by something else and forgot
> that I had the patch working et al, and all its bits that need approval
> got approved.  It still can go in right now.  It improves performance
> and fixes the issue the way it was discussed.
> 
> Here it is, regtested on x86_64-linux-gnu against trunk.
> 
> If nobody objects in the next 24 hours, I'll commit it.

The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
The follow-up patch (fp == NULL check) doesn't help.

-- 
Markus

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 15:05                       ` Markus Trippelsdorf
@ 2014-01-24 15:41                         ` Dodji Seketeli
  2014-01-24 15:44                           ` Jakub Jelinek
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-24 15:41 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Bernd Edlinger, Jakub Jelinek, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Markus Trippelsdorf <markus@trippelsdorf.de> writes:

> On 2014.01.22 at 09:16 +0100, Dodji Seketeli wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> 
>> > Hi,
>> 
>> Hello,
>> 
>> > since there was no progress in the last 2 months on that matter,
>> 
>> Sorry, this is my bad.  I got sidetracked by something else and forgot
>> that I had the patch working et al, and all its bits that need approval
>> got approved.  It still can go in right now.  It improves performance
>> and fixes the issue the way it was discussed.
>> 
>> Here it is, regtested on x86_64-linux-gnu against trunk.
>> 
>> If nobody objects in the next 24 hours, I'll commit it.
>
> The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> The follow-up patch (fp == NULL check) doesn't help.

I am looking into that, sorry for the inconvenience.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 15:41                         ` Dodji Seketeli
@ 2014-01-24 15:44                           ` Jakub Jelinek
  2014-01-24 16:09                             ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Jakub Jelinek @ 2014-01-24 15:44 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Markus Trippelsdorf, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> > The follow-up patch (fp == NULL check) doesn't help.
> 
> I am looking into that, sorry for the inconvenience.

I'd say we want something like following.  Note that while the c == NULL
bailout would be usually sufficient, if you'll do:
echo foobar > '<command-line>'
it would still crash.  Line 0 is used only for the special locations
(command line, built-in macros) and there is no file associated with it
anyway.

--- gcc/input.c.jj	2014-01-24 16:32:34.000000000 +0100
+++ gcc/input.c	2014-01-24 16:41:42.012671452 +0100
@@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
   static char *buffer;
   static ssize_t len;
 
-  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (xloc.line == 0)
+    return NULL;
+
+  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (c == NULL)
+    return NULL;
+
   bool read = read_line_num (c, xloc.line, &buffer, &len);
 
   if (read && line_len)


	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 15:44                           ` Jakub Jelinek
@ 2014-01-24 16:09                             ` Dodji Seketeli
  2014-01-24 16:13                               ` Jakub Jelinek
  2014-01-24 23:02                               ` Markus Trippelsdorf
  0 siblings, 2 replies; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-24 16:09 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Markus Trippelsdorf, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Jakub Jelinek <jakub@redhat.com> writes:

> On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
>> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
>> > The follow-up patch (fp == NULL check) doesn't help.
>> 
>> I am looking into that, sorry for the inconvenience.
>
> I'd say we want something like following.  Note that while the c == NULL
> bailout would be usually sufficient, if you'll do:
> echo foobar > '<command-line>'
> it would still crash.  Line 0 is used only for the special locations
> (command line, built-in macros) and there is no file associated with it
> anyway.
>
> --- gcc/input.c.jj	2014-01-24 16:32:34.000000000 +0100
> +++ gcc/input.c	2014-01-24 16:41:42.012671452 +0100
> @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
>    static char *buffer;
>    static ssize_t len;
>  
> -  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
> +  if (xloc.line == 0)
> +    return NULL;
> +
> +  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
> +  if (c == NULL)
> +    return NULL;
> +
>    bool read = read_line_num (c, xloc.line, &buffer, &len);
>  
>    if (read && line_len)

Indeed.

Though, I am testing the patch below that makes read_line_num gracefully
handle empty caches or zero locations.  The rest of the code should just
work with that as is.

	* input.c (read_line_num): Gracefully handle non-file locations or
	empty caches.

diff --git a/gcc/input.c b/gcc/input.c
index 547c177..b05e1da 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -600,7 +600,8 @@ static bool
 read_line_num (fcache *c, size_t line_num,
 	       char ** line, ssize_t *line_len)
 {
-  gcc_assert (line_num > 0);
+  if (!c || line_num < 1)
+    return false;
 
   if (line_num <= c->line_num)
     {
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
new file mode 100644
index 0000000..04a06b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
@@ -0,0 +1,6 @@
+/*
+   { dg-options "-D _GNU_SOURCE" }
+   { dg-do compile }
+ */
+
+#define _GNU_SOURCE 	/* { dg-warning "redefined" } */
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 16:09                             ` Dodji Seketeli
@ 2014-01-24 16:13                               ` Jakub Jelinek
  2014-01-24 23:02                               ` Markus Trippelsdorf
  1 sibling, 0 replies; 46+ messages in thread
From: Jakub Jelinek @ 2014-01-24 16:13 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Markus Trippelsdorf, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On Fri, Jan 24, 2014 at 05:09:29PM +0100, Dodji Seketeli wrote:
> 	* input.c (read_line_num): Gracefully handle non-file locations or
> 	empty caches.
> 
> diff --git a/gcc/input.c b/gcc/input.c
> index 547c177..b05e1da 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -600,7 +600,8 @@ static bool
>  read_line_num (fcache *c, size_t line_num,
>  	       char ** line, ssize_t *line_len)
>  {
> -  gcc_assert (line_num > 0);
> +  if (!c || line_num < 1)
> +    return false;
>  
>    if (line_num <= c->line_num)
>      {

Ok.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
> @@ -0,0 +1,6 @@
> +/*
> +   { dg-options "-D _GNU_SOURCE" }
> +   { dg-do compile }
> + */
> +
> +#define _GNU_SOURCE 	/* { dg-warning "redefined" } */

I doubt this would fail without the patch though, because
fno-diagnostics-show-caret is added by default to flags.
So, I'd say you need also -fdiagnostics-show-caret in dg-options to
reproduce it.

	Jakub

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 16:09                             ` Dodji Seketeli
  2014-01-24 16:13                               ` Jakub Jelinek
@ 2014-01-24 23:02                               ` Markus Trippelsdorf
  2014-01-24 23:20                                 ` Markus Trippelsdorf
  1 sibling, 1 reply; 46+ messages in thread
From: Markus Trippelsdorf @ 2014-01-24 23:02 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On 2014.01.24 at 17:09 +0100, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> 
> > On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
> >> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> >> > The follow-up patch (fp == NULL check) doesn't help.
> >> 
> >> I am looking into that, sorry for the inconvenience.
> >
> > I'd say we want something like following.  Note that while the c == NULL
> > bailout would be usually sufficient, if you'll do:
> > echo foobar > '<command-line>'
> > it would still crash.  Line 0 is used only for the special locations
> > (command line, built-in macros) and there is no file associated with it
> > anyway.
> >
> > --- gcc/input.c.jj	2014-01-24 16:32:34.000000000 +0100
> > +++ gcc/input.c	2014-01-24 16:41:42.012671452 +0100
> > @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
> >    static char *buffer;
> >    static ssize_t len;
> >  
> > -  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
> > +  if (xloc.line == 0)
> > +    return NULL;
> > +
> > +  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
> > +  if (c == NULL)
> > +    return NULL;
> > +
> >    bool read = read_line_num (c, xloc.line, &buffer, &len);
> >  
> >    if (read && line_len)
> 
> Indeed.
> 
> Though, I am testing the patch below that makes read_line_num gracefully
> handle empty caches or zero locations.  The rest of the code should just
> work with that as is.
> 
> 	* input.c (read_line_num): Gracefully handle non-file locations or
> 	empty caches.

Unfortunately this doesn't fix yet another issue:

markus@x4 /tmp % cat foo.c
#line 4636 "configure"
#include <xxxxxxxxxxxx.h>
int main() { return 0; }

markus@x4 /tmp % gcc foo.c
configure:4636:26: fatal error: xxxxxxxxxxxx.h: No such file or directory
gcc: internal compiler error: Segmentation fault (program cc1)
0x40cc8e execute
        ../../gcc/gcc/gcc.c:2841
0x40cf09 do_spec_1
        ../../gcc/gcc/gcc.c:4641
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
0x40d692 do_spec_1
        ../../gcc/gcc/gcc.c:5295
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
0x40d692 do_spec_1
        ../../gcc/gcc/gcc.c:5295
0x40d28e do_spec_1
        ../../gcc/gcc/gcc.c:5410
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
0x40d692 do_spec_1
        ../../gcc/gcc/gcc.c:5295
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
0x40d692 do_spec_1
        ../../gcc/gcc/gcc.c:5295
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
0x40d692 do_spec_1
        ../../gcc/gcc/gcc.c:5295
0x40fc91 process_brace_body
        ../../gcc/gcc/gcc.c:5924
0x40fc91 handle_braces
        ../../gcc/gcc/gcc.c:5838
Please submit a full bug report,
with preprocessed source if appropriate.

-- 
Markus

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 23:02                               ` Markus Trippelsdorf
@ 2014-01-24 23:20                                 ` Markus Trippelsdorf
  2014-01-28 13:20                                   ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Markus Trippelsdorf @ 2014-01-24 23:20 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Jakub Jelinek, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

On 2014.01.25 at 00:02 +0100, Markus Trippelsdorf wrote:
> On 2014.01.24 at 17:09 +0100, Dodji Seketeli wrote:
> > Jakub Jelinek <jakub@redhat.com> writes:
> > 
> > > On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
> > >> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> > >> > The follow-up patch (fp == NULL check) doesn't help.
> > >> 
> > >> I am looking into that, sorry for the inconvenience.
> > >
> > > I'd say we want something like following.  Note that while the c == NULL
> > > bailout would be usually sufficient, if you'll do:
> > > echo foobar > '<command-line>'
> > > it would still crash.  Line 0 is used only for the special locations
> > > (command line, built-in macros) and there is no file associated with it
> > > anyway.
> > >
> > > --- gcc/input.c.jj	2014-01-24 16:32:34.000000000 +0100
> > > +++ gcc/input.c	2014-01-24 16:41:42.012671452 +0100
> > > @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
> > >    static char *buffer;
> > >    static ssize_t len;
> > >  
> > > -  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
> > > +  if (xloc.line == 0)
> > > +    return NULL;
> > > +
> > > +  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
> > > +  if (c == NULL)
> > > +    return NULL;
> > > +
> > >    bool read = read_line_num (c, xloc.line, &buffer, &len);
> > >  
> > >    if (read && line_len)
> > 
> > Indeed.
> > 
> > Though, I am testing the patch below that makes read_line_num gracefully
> > handle empty caches or zero locations.  The rest of the code should just
> > work with that as is.
> > 
> > 	* input.c (read_line_num): Gracefully handle non-file locations or
> > 	empty caches.
> 
> Unfortunately this doesn't fix yet another issue:

Whereas Jakub's patch is fine.

-- 
Markus

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-24 23:20                                 ` Markus Trippelsdorf
@ 2014-01-28 13:20                                   ` Dodji Seketeli
  2014-01-28 13:23                                     ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-28 13:20 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Jakub Jelinek, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Here is the patch I am committing right now.

gcc/ChangeLog

	* input.c (location_get_source_line): Bail out on when line number
	is zero, and test the return value of
	lookup_or_add_file_to_cache_tab.

gcc/testsuite/ChangeLog

	* c-c++-common/cpp/warning-zero-location.c: New test.
	* c-c++-common/cpp/warning-zero-location-2.c: Likewise.

diff --git a/gcc/input.c b/gcc/input.c
index 547c177..63cd062 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -698,7 +698,13 @@ location_get_source_line (expanded_location xloc,
   static char *buffer;
   static ssize_t len;
 
-  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (xloc.line == 0)
+    return NULL;
+
+  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (c == NULL)
+    return NULL;
+
   bool read = read_line_num (c, xloc.line, &buffer, &len);
 
   if (read && line_len)
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
new file mode 100644
index 0000000..c0e0bf7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
@@ -0,0 +1,10 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#line 4636 "configure"
+#include <xxxxxxxxxxxx.h>
+int main() { return 0; }
+
+/* { dg-error "No such file or directory" { target *-*-* } 4636 } */
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
new file mode 100644
index 0000000..ca2e102
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
@@ -0,0 +1,8 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#define _GNU_SOURCE 	/* { dg-warning "redefined" } */
+
+/* { dg-message "" "#define _GNU_SOURCE" {target *-*-* } 0 }
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-28 13:20                                   ` Dodji Seketeli
@ 2014-01-28 13:23                                     ` Dodji Seketeli
  2014-01-28 18:40                                       ` H.J. Lu
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-28 13:23 UTC (permalink / raw)
  To: Markus Trippelsdorf
  Cc: Jakub Jelinek, Bernd Edlinger, GCC Patches, Tom Tromey,
	Manuel López-Ibáñez

Dodji Seketeli <dodji@redhat.com> writes:

> Here is the patch I am committing right now.
>
> gcc/ChangeLog
>
> 	* input.c (location_get_source_line): Bail out on when line number
> 	is zero, and test the return value of
> 	lookup_or_add_file_to_cache_tab.
>
> gcc/testsuite/ChangeLog
>
> 	* c-c++-common/cpp/warning-zero-location.c: New test.
> 	* c-c++-common/cpp/warning-zero-location-2.c: Likewise.

I forgot to say that it passed bootstrap & test on
x86_64-unknown-linux-gnu against trunk.

Thanks.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-28 13:23                                     ` Dodji Seketeli
@ 2014-01-28 18:40                                       ` H.J. Lu
  2014-01-29 11:28                                         ` Dodji Seketeli
  0 siblings, 1 reply; 46+ messages in thread
From: H.J. Lu @ 2014-01-28 18:40 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Markus Trippelsdorf, Jakub Jelinek, Bernd Edlinger, GCC Patches,
	Tom Tromey, Manuel López-Ibáñez

On Tue, Jan 28, 2014 at 5:23 AM, Dodji Seketeli <dodji@redhat.com> wrote:
> Dodji Seketeli <dodji@redhat.com> writes:
>
>> Here is the patch I am committing right now.
>>
>> gcc/ChangeLog
>>
>>       * input.c (location_get_source_line): Bail out on when line number
>>       is zero, and test the return value of
>>       lookup_or_add_file_to_cache_tab.
>>
>> gcc/testsuite/ChangeLog
>>
>>       * c-c++-common/cpp/warning-zero-location.c: New test.
>>       * c-c++-common/cpp/warning-zero-location-2.c: Likewise.
>
> I forgot to say that it passed bootstrap & test on
> x86_64-unknown-linux-gnu against trunk.
>

The new tests failed on Linux/x86:

ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++11: syntax
error in target selector "4636" for " dg-error 10 "No such file or
directory" { target *-*-* } 4636 "
ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++98: syntax
error in target selector "4636" for " dg-error 10 "No such file or
directory" { target *-*-* } 4636 "
ERROR: c-c++-common/cpp/warning-zero-location-2.c  -Wc++-compat :
syntax error in target selector "4636" for " dg-error 10 "No such file
or directory" { target *-*-* } 4636 "




-- 
H.J.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2014-01-28 18:40                                       ` H.J. Lu
@ 2014-01-29 11:28                                         ` Dodji Seketeli
  0 siblings, 0 replies; 46+ messages in thread
From: Dodji Seketeli @ 2014-01-29 11:28 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Markus Trippelsdorf, Jakub Jelinek, Bernd Edlinger, GCC Patches,
	Tom Tromey, Manuel López-Ibáñez

"H.J. Lu" <hjl.tools@gmail.com> writes:

> The new tests failed on Linux/x86:

Woops.

I have committed the patch below under the obvious rule for this.  Sorry
for the inconvenience.


gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-location-2.c: Fix error message
	specifier.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a468447..27777da 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-01-29  Dodji Seketeli  <dodji@redhat.com>
+
+	* c-c++-common/cpp/warning-zero-location-2.c: Fix error message
+	selector.
+
 2014-01-29  Jakub Jelinek  <jakub@redhat.com>
 
 	PR middle-end/59917
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
index c0e0bf7..e919bca 100644
--- a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
@@ -7,4 +7,4 @@
 #include <xxxxxxxxxxxx.h>
 int main() { return 0; }
 
-/* { dg-error "No such file or directory" { target *-*-* } 4636 } */
+/* { dg-message "" "#include" {target *-*-* } 0 }
-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
  2013-10-31 13:45 Dodji Seketeli
@ 2013-10-31 17:30 ` Manuel López-Ibáñez
  0 siblings, 0 replies; 46+ messages in thread
From: Manuel López-Ibáñez @ 2013-10-31 17:30 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: Tom Tromey, Jakub Jelinek, GCC Patches

On 31 October 2013 05:46, Dodji Seketeli <dodji@redhat.com> wrote:
> +*/
> +static size_t
> +string_length (const char* buf, size_t buf_size)
> +{
> +  for (int i = buf_size - 1; i > 0; --i)
> +    {
> +      if (buf[i] != 0)
> +       return i + 1;
> +
> +      if (buf[i - 1] != 0)
> +       return i;
> +    }
> +  return 0;
> +}

Why do you check both buf[i] and buf[i - 1] within the loop?

Cheers,

Manuel.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
@ 2013-10-31 13:45 Dodji Seketeli
  2013-10-31 17:30 ` Manuel López-Ibáñez
  0 siblings, 1 reply; 46+ messages in thread
From: Dodji Seketeli @ 2013-10-31 13:45 UTC (permalink / raw)
  To: Tom Tromey, Manuel López-Ibáñez, Jakub Jelinek; +Cc: GCC Patches

Hello,

In this problem report, the compiler is fed a (bogus) translation unit
in which some literals contains bytes whose value is zero.  The
preprocessor detects that and proceeds to emit diagnostics for that
king of bogus literals.  But then when the diagnostics machinery
re-reads the input file again to display the bogus literals with a
caret, it attempts to calculate the length of each of the lines it got
using fgets.  The line length calculation is done using strlen.  But
that doesn't work well when the content of the line can have several
zero bytes.  The result is that the read_line never sees the end of
the line because strlen repeatedly reports that the line ends before
the end-of-line character; so read_line thinks its buffer for reading
the line is too small; it thus increases the buffer, leading to a huge
memory consumption, pain and disaster.

The patch below introduces a new string_length() function that can
return the length of a string contained in a buffer even if the string
contains zero bytes; it does so by starting from the end of the buffer
and stops when it encounters the first non-null byte; for that to
work, the buffer must have been totally zeroed before getting data.
read_line is then modified to return the length of the line along
with the line itself, as the line can now contain zero bytes.  Callers
of read_line are adjusted consequently.

diagnostic_show_locus() is modified to consider that a line can have
characters of value zero, and so just show a white space when
instructed to display one of these characters.

Tested on x86_64-unknown-linux-gnu against trunk.

I realize this is diagnostics code and I am supposed to be a maintainer
for it, but I'd appreciate a review for it nonetheless.

Thanks.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter by reference.
	* input.c (string_length): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new string_length function
	to compute the size of the line returned by fgets, rather than
	using strlen.  Ensure that the buffer is initially zeroed; ensure
	that when growing the buffer too.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 +++---
 gcc/input.c                                        |  60 +++++++++++++++++----
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 62 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..0ca7081 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..b72183c 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,9 +87,37 @@ expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
+/* Returns the length of a string contained in a buffer that was
+   initially totally zeroed.  The length of the string is the number
+   of bytes up to the last non-null byte of the buffer.  Note that
+   with this definition, the string can contain bytes which value is
+   zero.
+
+   BUF is a pointer to the beginning of the buffer containing the
+   string to consider.
+
+   BUF_SIZE is the size of the buffer containing the string to
+   consider.  The string length is thus at most BUF_SIZE.
+*/
+static size_t
+string_length (const char* buf, size_t buf_size)
+{
+  for (int i = buf_size - 1; i > 0; --i)
+    {
+      if (buf[i] != 0)
+	return i + 1;
+
+      if (buf[i - 1] != 0)
+	return i;
+    }
+  return 0;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  */
 static const char *
-read_line (FILE *file)
+read_line (FILE *file, int& line_length)
 {
   static char *string;
   static size_t string_len;
@@ -99,32 +127,42 @@ read_line (FILE *file)
   if (!string_len)
     {
       string_len = 200;
-      string = XNEWVEC (char, string_len);
+      string = XCNEWVEC (char, string_len);
     }
+  else
+    memset (string, 0, string_len);
 
   while ((ptr = fgets (string + pos, string_len - pos, file)))
     {
-      size_t len = strlen (string + pos);
+      size_t len = string_length (string + pos, string_len - pos);
 
       if (string[pos + len - 1] == '\n')
 	{
 	  string[pos + len - 1] = 0;
+	  line_length = len;
 	  return string;
 	}
       pos += len;
-      string = XRESIZEVEC (char, string, string_len * 2);
-      string_len *= 2;
-    }
-      
+      size_t string_len2 = string_len * 2;
+      char *string2 = XCNEWVEC (char, string_len2);
+      memmove (string2, string, string_len);
+      XDELETE (string);;
+      string = string2;
+      string_len = string_len2;
+     }
+
+  line_length = pos ? string_len : 0;
   return pos ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN contains the actual length of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int& line_len)
 {
   const char *buffer;
   int lines = 1;
@@ -132,7 +170,7 @@ location_get_source_line (expanded_location xloc)
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
+  while ((buffer = read_line (stream, line_len)) && lines < xloc.line)
     lines++;
 
   fclose (stream);
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..79b3a10 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int& line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);
 
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

-- 
		Dodji

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2014-01-29 11:28 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-31 14:48 [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals Bernd Edlinger
2013-10-31 15:06 ` Jakub Jelinek
2013-10-31 15:19   ` Dodji Seketeli
2013-10-31 18:26     ` Jakub Jelinek
2013-11-04 11:52       ` Dodji Seketeli
2013-11-04 11:59         ` Jakub Jelinek
2013-11-04 15:42           ` Dodji Seketeli
2013-11-05  0:10             ` Bernd Edlinger
2013-11-05  9:50               ` Dodji Seketeli
2013-11-05 11:19                 ` Bernd Edlinger
2013-11-05 11:43                   ` Dodji Seketeli
2013-11-06 22:27                 ` Bernd Edlinger
2013-11-04 12:06         ` Bernd Edlinger
2013-11-04 12:15           ` Jakub Jelinek
2013-11-04 12:32             ` Bernd Edlinger
2013-11-04 15:21           ` Dodji Seketeli
2013-11-11 10:49       ` Dodji Seketeli
2013-11-11 14:35         ` Jakub Jelinek
2013-11-11 17:13           ` Dodji Seketeli
2013-11-12 16:42             ` Dodji Seketeli
2013-11-13  5:10               ` Bernd Edlinger
2013-11-13  9:40                 ` Dodji Seketeli
2013-11-13  9:43                   ` Bernd Edlinger
2013-11-13  9:49                     ` Dodji Seketeli
2013-11-13  9:49                     ` Dodji Seketeli
2013-11-13  9:51               ` Jakub Jelinek
2013-11-14 15:12                 ` Dodji Seketeli
2013-12-09 20:11                   ` Tom Tromey
2014-01-21 12:28                   ` Bernd Edlinger
2014-01-22  8:16                     ` Dodji Seketeli
2014-01-23 17:12                       ` Jakub Jelinek
2014-01-24  2:58                         ` Bernd Edlinger
2014-01-24  7:53                         ` Dodji Seketeli
2014-01-24 15:05                       ` Markus Trippelsdorf
2014-01-24 15:41                         ` Dodji Seketeli
2014-01-24 15:44                           ` Jakub Jelinek
2014-01-24 16:09                             ` Dodji Seketeli
2014-01-24 16:13                               ` Jakub Jelinek
2014-01-24 23:02                               ` Markus Trippelsdorf
2014-01-24 23:20                                 ` Markus Trippelsdorf
2014-01-28 13:20                                   ` Dodji Seketeli
2014-01-28 13:23                                     ` Dodji Seketeli
2014-01-28 18:40                                       ` H.J. Lu
2014-01-29 11:28                                         ` Dodji Seketeli
  -- strict thread matches above, loose matches on Subject: below --
2013-10-31 13:45 Dodji Seketeli
2013-10-31 17:30 ` Manuel López-Ibáñez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).