From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10297 invoked by alias); 4 Nov 2013 11:46:24 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 10277 invoked by uid 89); 4 Nov 2013 11:46:22 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_50,RDNS_NONE,SPF_HELO_PASS,SPF_PASS autolearn=no version=3.3.2 X-HELO: mx1.redhat.com Received: from Unknown (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 04 Nov 2013 11:46:20 +0000 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rA4BkDaF016133 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Mon, 4 Nov 2013 06:46:13 -0500 Received: from localhost (ovpn-116-76.ams2.redhat.com [10.36.116.76]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id rA4BkBdV032558 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 4 Nov 2013 06:46:12 -0500 Received: by localhost (Postfix, from userid 1000) id 58A3116503A; Mon, 4 Nov 2013 12:46:10 +0100 (CET) From: Dodji Seketeli To: Jakub Jelinek Cc: Bernd Edlinger , "gcc-patches\@gcc.gnu.org" Subject: Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals References: <20131031144309.GR27813@tucnak.zalov.cz> <87y559xz7y.fsf@redhat.com> <20131031173649.GW27813@tucnak.zalov.cz> X-URL: http://www.redhat.com Date: Mon, 04 Nov 2013 11:52:00 -0000 In-Reply-To: <20131031173649.GW27813@tucnak.zalov.cz> (Jakub Jelinek's message of "Thu, 31 Oct 2013 18:36:49 +0100") Message-ID: <87r4awtmnx.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-SW-Source: 2013-11/txt/msg00197.txt.bz2 Jakub Jelinek writes: > I think even as a fallback is the patch too expensive. > I'd say best would be to write some getline API compatible function > and just use it, using fread on say fixed size buffer. OK, thanks for the insight. I have just used the getline_fallback function you proposed, slightly amended to use the memory allocation routines commonly used in gcc and renamed into get_line, with a hopefully complete comment explaining where this function comes from etc. [...] > A slight complication is what to do on mingw/cygwin and other DOS or > Mac style line ending environments, no idea what fgets exactly does > there. Actually, I think that even fgets just deals with '\n'. The reason, from what I gathered being that on windows, we fopen the input file in text mode; and in that mode, the \r\n is transformed into just \n. Apparently OSX is compatible with '\n' too. Someone corrects me if I am saying non-sense here. So the patch below is what I am bootstrapping at the moment. OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk? > BTW, we probably want to do something with the speed of the caret > diagnostics too, right now it opens the file again for each single line > to be printed in caret diagnostics and reads all lines until the right one, > so imagine how fast is printing of many warnings on almost adjacent lines > near the end of many megabytes long file. > Perhaps we could remember the last file we've opened for caret diagnostics, > don't fclose the file right away but only if a new request is for a > different file, perhaps keep some vector of line start offsets (say starting > byte of every 100th line or similar) and also remember the last read line > offset, so if a new request is for the same file, but higher line than last, > we can just keep getlineing, and if it is smaller line than last, we look up > the offset of the line / 100, fseek to it and just getline only modulo 100 > lines. Maybe we should keep not just one, but 2 or 4 opened files as cache > (again, with the starting line offsets vectors). I like this idea. I'll try and work on it. And now the patch. Cheers. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter by reference. * input.c (get_line): New static function definition. (read_line): Take an additional line_length output parameter to be set to the size of the line. Use the new get_line function to compute the size of the line returned by fgets, rather than using strlen. Ensure that the buffer is initially zeroed; ensure that when growing the buffer too. (location_get_source_line): Take an additional output line_len parameter. Update the use of read_line to pass it the line_len parameter. * diagnostic.c (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. --- gcc/diagnostic.c | 17 ++-- gcc/input.c | 104 ++++++++++++++++++--- gcc/input.h | 3 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes 4 files changed, 103 insertions(+), 21 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..0ca7081 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. */ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. */ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, + int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int max_width; @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context, context->last_location = diagnostic->location; s = expand_location_to_spelling_point (diagnostic->location); - line = location_get_source_line (s); + line = location_get_source_line (s, line_width); if (line == NULL) return; max_width = context->caret_max_width; - line = adjust_line (line, max_width, &(s.column)); + line = adjust_line (line, line_width, max_width, &(s.column)); pp_newline (context->printer); saved_prefix = pp_get_prefix (context->printer); pp_set_prefix (context->printer, NULL); pp_space (context->printer); - while (max_width > 0 && *line != '\0') + while (max_width > 0 && line_width > 0) { char c = *line == '\t' ? ' ' : *line; + if (c == '\0') + c = ' '; pp_character (context->printer, c); max_width--; + line_width--; line++; } pp_newline (context->printer); diff --git a/gcc/input.c b/gcc/input.c index a141a92..be60039 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -87,44 +87,120 @@ expand_location_1 (source_location loc, return xloc; } -/* Reads one line from file into a static buffer. */ +/* This function reads a line that might contain zero byte value. The + function returns the number of bytes read. Note that this function + has been adapted from the a combination of geline() and + _IO_getdelim() from the GNU C library. It's been duplicated here + because the getline() function is not present on all platforms. + + LINEPTR points to a buffer that is to contain the line read. + + N points to the size of the the LINEPTR buffer. + + FP points to the file to consider. */ + +static ssize_t +get_line (char **lineptr, size_t *n, FILE *fp) +{ + ssize_t cur_len = 0, len; + char buf[16384]; + + if (lineptr == NULL || n == NULL) + return -1; + + if (*lineptr == NULL || *n == 0) + { + *n = 120; + *lineptr = XNEWVEC (char, *n); + if (*lineptr == NULL) + return -1; + } + + len = fread (buf, 1, sizeof buf, fp); + if (ferror (fp)) + return -1; + + for (;;) + { + size_t needed; + char *t = (char*) memchr (buf, '\n', len); + if (t != NULL) len = (t - buf) + 1; + if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0)) + return -1; + needed = cur_len + len + 1; + if (needed > *n) + { + char *new_lineptr; + if (needed < 2 * *n) + needed = 2 * *n; + new_lineptr = XRESIZEVEC (char, *lineptr, needed); + if (new_lineptr == NULL) + return -1; + *lineptr = new_lineptr; + *n = needed; + } + memcpy (*lineptr + cur_len, buf, len); + cur_len += len; + if (t != NULL) + break; + len = fread (buf, 1, sizeof buf, fp); + if (ferror (fp)) + return -1; + if (len == 0) + break; + } + (*lineptr)[cur_len] = '\0'; + return cur_len; +} + +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set + by this function to the length of the returned line. Note that the + returned line can contain several zero bytes. */ static const char * -read_line (FILE *file) +read_line (FILE *file, int& line_length) { static char *string; - static size_t string_len; + static size_t string_len, cur_len; size_t pos = 0; char *ptr; if (!string_len) { string_len = 200; - string = XNEWVEC (char, string_len); + string = XCNEWVEC (char, string_len); } + else + memset (string, 0, string_len); - while ((ptr = fgets (string + pos, string_len - pos, file))) + ptr = string; + cur_len = string_len; + while (size_t len = get_line (&ptr, &cur_len, file)) { - size_t len = strlen (string + pos); - - if (string[pos + len - 1] == '\n') + if (ptr[len - 1] == '\n') { - string[pos + len - 1] = 0; + ptr[len - 1] = 0; + line_length = len; return string; } pos += len; string = XRESIZEVEC (char, string, string_len * 2); string_len *= 2; - } - + ptr = string + pos; + cur_len = string_len - pos; + } + + line_length = pos ? string_len : 0; return pos ? string : NULL; } /* Return the physical source line that corresponds to xloc in a buffer that is statically allocated. The newline is replaced by - the null character. */ + the null character. Note that the line can contain several null + characters, so LINE_LEN contains the actual length of the line. */ const char * -location_get_source_line (expanded_location xloc) +location_get_source_line (expanded_location xloc, + int& line_len) { const char *buffer; int lines = 1; @@ -132,7 +208,7 @@ location_get_source_line (expanded_location xloc) if (!stream) return NULL; - while ((buffer = read_line (stream)) && lines < xloc.line) + while ((buffer = read_line (stream, line_len)) && lines < xloc.line) lines++; fclose (stream); diff --git a/gcc/input.h b/gcc/input.h index 8fdc7b2..79b3a10 100644 --- a/gcc/input.h +++ b/gcc/input.h @@ -37,7 +37,8 @@ extern char builtins_location_check[(BUILTINS_LOCATION < RESERVED_LOCATION_COUNT) ? 1 : -1]; extern expanded_location expand_location (source_location); -extern const char *location_get_source_line (expanded_location xloc); +extern const char *location_get_source_line (expanded_location xloc, + int& line_size); extern expanded_location expand_location_to_spelling_point (source_location); extern source_location expansion_point_location_if_in_system_header (source_location); diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c new file mode 100644 index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2 GIT binary patch literal 240 zcmdPbSEyD