From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24091 invoked by alias); 4 Nov 2013 12:00:44 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 24068 invoked by uid 89); 4 Nov 2013 12:00:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=2.1 required=5.0 tests=AWL,BAYES_99,FREEMAIL_FROM,RDNS_NONE,SPF_PASS autolearn=no version=3.3.2 X-HELO: dub0-omc1-s10.dub0.hotmail.com Received: from Unknown (HELO dub0-omc1-s10.dub0.hotmail.com) (157.55.0.209) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 04 Nov 2013 11:59:58 +0000 Received: from DUB122-W28 ([157.55.0.239]) by dub0-omc1-s10.dub0.hotmail.com with Microsoft SMTPSVC(6.0.3790.4675); Mon, 4 Nov 2013 03:59:49 -0800 X-TMN: [uiY34NGGCFNAq8RkHWHjWImItbx7K+xX] Message-ID: From: Bernd Edlinger To: Dodji Seketeli , Jakub Jelinek CC: "gcc-patches@gcc.gnu.org" Subject: RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals Date: Mon, 04 Nov 2013 12:06:00 -0000 In-Reply-To: <87r4awtmnx.fsf@redhat.com> References: ,<20131031144309.GR27813@tucnak.zalov.cz> <87y559xz7y.fsf@redhat.com>,<20131031173649.GW27813@tucnak.zalov.cz>,<87r4awtmnx.fsf@redhat.com> Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-SW-Source: 2013-11/txt/msg00201.txt.bz2 Hi, I see another "read_line" at gcov.c, which seems to be a copy. Maybe this should be changed too? What do you think? Bernd. On Mon, 4 Nov 2013 12:46:10, Dodji Seketeli wrote: > > Jakub Jelinek writes: > >> I think even as a fallback is the patch too expensive. >> I'd say best would be to write some getline API compatible function >> and just use it, using fread on say fixed size buffer. > > OK, thanks for the insight. I have just used the getline_fallback > function you proposed, slightly amended to use the memory allocation > routines commonly used in gcc and renamed into get_line, with a > hopefully complete comment explaining where this function comes from > etc. > > [...] > >> A slight complication is what to do on mingw/cygwin and other DOS or >> Mac style line ending environments, no idea what fgets exactly does >> there. > > Actually, I think that even fgets just deals with '\n'. The reason, > from what I gathered being that on windows, we fopen the input file in > text mode; and in that mode, the \r\n is transformed into just \n. > > Apparently OSX is compatible with '\n' too. Someone corrects me if I am > saying non-sense here. > > So the patch below is what I am bootstrapping at the moment. > > OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk? > >> BTW, we probably want to do something with the speed of the caret >> diagnostics too, right now it opens the file again for each single line >> to be printed in caret diagnostics and reads all lines until the right o= ne, >> so imagine how fast is printing of many warnings on almost adjacent lines >> near the end of many megabytes long file. >> Perhaps we could remember the last file we've opened for caret diagnosti= cs, >> don't fclose the file right away but only if a new request is for a >> different file, perhaps keep some vector of line start offsets (say star= ting >> byte of every 100th line or similar) and also remember the last read line >> offset, so if a new request is for the same file, but higher line than l= ast, >> we can just keep getlineing, and if it is smaller line than last, we loo= k up >> the offset of the line / 100, fseek to it and just getline only modulo 1= 00 >> lines. Maybe we should keep not just one, but 2 or 4 opened files as cac= he >> (again, with the starting line offsets vectors). > > I like this idea. I'll try and work on it. > > And now the patch. > > Cheers. > > gcc/ChangeLog: > > * input.h (location_get_source_line): Take an additional line_size > parameter by reference. > * input.c (get_line): New static function definition. > (read_line): Take an additional line_length output parameter to be > set to the size of the line. Use the new get_line function to > compute the size of the line returned by fgets, rather than using > strlen. Ensure that the buffer is initially zeroed; ensure that > when growing the buffer too. > (location_get_source_line): Take an additional output line_len > parameter. Update the use of read_line to pass it the line_len > parameter. > * diagnostic.c (adjust_line): Take an additional input parameter > for the length of the line, rather than calculating it with > strlen. > (diagnostic_show_locus): Adjust the use of > location_get_source_line and adjust_line with respect to their new > signature. While displaying a line now, do not stop at the first > null byte. Rather, display the zero byte as a space and keep > going until we reach the size of the line. > > gcc/testsuite/ChangeLog: > > * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. > --- > gcc/diagnostic.c | 17 ++-- > gcc/input.c | 104 ++++++++++++++++++--- > gcc/input.h | 3 +- > .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes > 4 files changed, 103 insertions(+), 21 deletions(-) > create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literal= s-1.c > > diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c > index 36094a1..0ca7081 100644 > --- a/gcc/diagnostic.c > +++ b/gcc/diagnostic.c > @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *contex= t, > MAX_WIDTH by some margin, then adjust the start of the line such > that the COLUMN is smaller than MAX_WIDTH minus the margin. The > margin is either 10 characters or the difference between the column > - and the length of the line, whatever is smaller. */ > + and the length of the line, whatever is smaller. The length of > + LINE is given by LINE_WIDTH. */ > static const char * > -adjust_line (const char *line, int max_width, int *column_p) > +adjust_line (const char *line, int line_width, > + int max_width, int *column_p) > { > int right_margin =3D 10; > - int line_width =3D strlen (line); > int column =3D *column_p; > > right_margin =3D MIN (line_width - column, right_margin); > @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, > const diagnostic_info *diagnostic) > { > const char *line; > + int line_width; > char *buffer; > expanded_location s; > int max_width; > @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context, > > context->last_location =3D diagnostic->location; > s =3D expand_location_to_spelling_point (diagnostic->location); > - line =3D location_get_source_line (s); > + line =3D location_get_source_line (s, line_width); > if (line =3D=3D NULL) > return; > > max_width =3D context->caret_max_width; > - line =3D adjust_line (line, max_width, &(s.column)); > + line =3D adjust_line (line, line_width, max_width, &(s.column)); > > pp_newline (context->printer); > saved_prefix =3D pp_get_prefix (context->printer); > pp_set_prefix (context->printer, NULL); > pp_space (context->printer); > - while (max_width> 0 && *line !=3D '\0') > + while (max_width> 0 && line_width> 0) > { > char c =3D *line =3D=3D '\t' ? ' ' : *line; > + if (c =3D=3D '\0') > + c =3D ' '; > pp_character (context->printer, c); > max_width--; > + line_width--; > line++; > } > pp_newline (context->printer); > diff --git a/gcc/input.c b/gcc/input.c > index a141a92..be60039 100644 > --- a/gcc/input.c > +++ b/gcc/input.c > @@ -87,44 +87,120 @@ expand_location_1 (source_location loc, > return xloc; > } > > -/* Reads one line from file into a static buffer. */ > +/* This function reads a line that might contain zero byte value. The > + function returns the number of bytes read. Note that this function > + has been adapted from the a combination of geline() and > + _IO_getdelim() from the GNU C library. It's been duplicated here > + because the getline() function is not present on all platforms. > + > + LINEPTR points to a buffer that is to contain the line read. > + > + N points to the size of the the LINEPTR buffer. > + > + FP points to the file to consider. */ > + > +static ssize_t > +get_line (char **lineptr, size_t *n, FILE *fp) > +{ > + ssize_t cur_len =3D 0, len; > + char buf[16384]; > + > + if (lineptr =3D=3D NULL || n =3D=3D NULL) > + return -1; > + > + if (*lineptr =3D=3D NULL || *n =3D=3D 0) > + { > + *n =3D 120; > + *lineptr =3D XNEWVEC (char, *n); > + if (*lineptr =3D=3D NULL) > + return -1; > + } > + > + len =3D fread (buf, 1, sizeof buf, fp); > + if (ferror (fp)) > + return -1; > + > + for (;;) > + { > + size_t needed; > + char *t =3D (char*) memchr (buf, '\n', len); > + if (t !=3D NULL) len =3D (t - buf) + 1; > + if (__builtin_expect (len>=3D SSIZE_MAX - cur_len, 0)) > + return -1; > + needed =3D cur_len + len + 1; > + if (needed> *n) > + { > + char *new_lineptr; > + if (needed < 2 * *n) > + needed =3D 2 * *n; > + new_lineptr =3D XRESIZEVEC (char, *lineptr, needed); > + if (new_lineptr =3D=3D NULL) > + return -1; > + *lineptr =3D new_lineptr; > + *n =3D needed; > + } > + memcpy (*lineptr + cur_len, buf, len); > + cur_len +=3D len; > + if (t !=3D NULL) > + break; > + len =3D fread (buf, 1, sizeof buf, fp); > + if (ferror (fp)) > + return -1; > + if (len =3D=3D 0) > + break; > + } > + (*lineptr)[cur_len] =3D '\0'; > + return cur_len; > +} > + > +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set > + by this function to the length of the returned line. Note that the > + returned line can contain several zero bytes. */ > static const char * > -read_line (FILE *file) > +read_line (FILE *file, int& line_length) > { > static char *string; > - static size_t string_len; > + static size_t string_len, cur_len; > size_t pos =3D 0; > char *ptr; > > if (!string_len) > { > string_len =3D 200; > - string =3D XNEWVEC (char, string_len); > + string =3D XCNEWVEC (char, string_len); > } > + else > + memset (string, 0, string_len); > > - while ((ptr =3D fgets (string + pos, string_len - pos, file))) > + ptr =3D string; > + cur_len =3D string_len; > + while (size_t len =3D get_line (&ptr, &cur_len, file)) > { > - size_t len =3D strlen (string + pos); > - > - if (string[pos + len - 1] =3D=3D '\n') > + if (ptr[len - 1] =3D=3D '\n') > { > - string[pos + len - 1] =3D 0; > + ptr[len - 1] =3D 0; > + line_length =3D len; > return string; > } > pos +=3D len; > string =3D XRESIZEVEC (char, string, string_len * 2); > string_len *=3D 2; > - } > - > + ptr =3D string + pos; > + cur_len =3D string_len - pos; > + } > + > + line_length =3D pos ? string_len : 0; > return pos ? string : NULL; > } > > /* Return the physical source line that corresponds to xloc in a > buffer that is statically allocated. The newline is replaced by > - the null character. */ > + the null character. Note that the line can contain several null > + characters, so LINE_LEN contains the actual length of the line. */ > > const char * > -location_get_source_line (expanded_location xloc) > +location_get_source_line (expanded_location xloc, > + int& line_len) > { > const char *buffer; > int lines =3D 1; > @@ -132,7 +208,7 @@ location_get_source_line (expanded_location xloc) > if (!stream) > return NULL; > > - while ((buffer =3D read_line (stream)) && lines < xloc.line) > + while ((buffer =3D read_line (stream, line_len)) && lines < xloc.line) > lines++; > > fclose (stream); > diff --git a/gcc/input.h b/gcc/input.h > index 8fdc7b2..79b3a10 100644 > --- a/gcc/input.h > +++ b/gcc/input.h > @@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION > < RESERVED_LOCATION_COUNT) ? 1 : -1]; > > extern expanded_location expand_location (source_location); > -extern const char *location_get_source_line (expanded_location xloc); > +extern const char *location_get_source_line (expanded_location xloc, > + int& line_size); > extern expanded_location expand_location_to_spelling_point (source_locati= on); > extern source_location expansion_point_location_if_in_system_header (sour= ce_location); > > diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c = b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c > new file mode 100644 > index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040= f4e10b8e09637e2 > GIT binary patch > literal 240 > zcmdPbSEyD UBo-wmm!uXcDby-x=3D?^KT09Xk|)&Kwi > > literal 0 > HcmV?d00001 > > -- > Dodji=20=09=09=20=09=20=20=20=09=09=20=20