From: Jason Merrill <jason@redhat.com>
To: David Malcolm <dmalcolm@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH RFA] input: add get_source_text_between
Date: Fri, 4 Nov 2022 10:27:56 -0400
Message-ID: <22136833-3ba8-686b-4eae-a709f2c1780d@redhat.com>
In-Reply-To: <906b1326bd95c094331f7a5ff46723986215e3cf.camel@redhat.com>
On 11/3/22 19:06, David Malcolm wrote:
> On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote:
>> Tested x86_64-pc-linux-gnu, OK for trunk?
>>
>> -- >8 --
>>
>> The c++-contracts branch uses this to retrieve the source form of the
>> contract predicate, to be returned by contract_violation::comment().
>>
>> gcc/ChangeLog:
>>
>> * input.cc (get_source_text_between): New fn.
>> * input.h (get_source_text_between): Declare.
>> ---
>> gcc/input.h | 1 +
>> gcc/input.cc | 76
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 77 insertions(+)
>>
>> diff --git a/gcc/input.h b/gcc/input.h
>> index 11c571d076f..f18769950b5 100644
>> --- a/gcc/input.h
>> +++ b/gcc/input.h
>> @@ -111,6 +111,7 @@ class char_span
>> };
>>
>> extern char_span location_get_source_line (const char *file_path,
>> int line);
>> +extern char *get_source_text_between (location_t, location_t);
>>
>> extern bool location_missing_trailing_newline (const char
>> *file_path);
>>
>> diff --git a/gcc/input.cc b/gcc/input.cc
>> index a28abfac5ac..9b36356338a 100644
>> --- a/gcc/input.cc
>> +++ b/gcc/input.cc
>> @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path,
>> int line)
>> return char_span (buffer, len);
>> }
>>
>> +/* Return a copy of the source text between two locations. The
>> caller is
>> + responsible for freeing the return value. */
>> +
>> +char *
>> +get_source_text_between (location_t start, location_t end)
>> +{
>> + expanded_location expstart =
>> + expand_location_to_spelling_point (start,
>> LOCATION_ASPECT_START);
>> + expanded_location expend =
>> + expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
>> +
>> + /* If the locations are in different files or the end comes before
>> the
>> + start, abort and return nothing. */
>
> I don't like the use of the term "abort" here, as it suggests to me the
> use of abort(3). Maybe "bail out" instead?

I went with "give up".

>> + if (!expstart.file || !expend.file)
>> + return NULL;
>> + if (strcmp (expstart.file, expend.file) != 0)
>> + return NULL;
>> + if (expstart.line > expend.line)
>> + return NULL;
>> + if (expstart.line == expend.line
>> + && expstart.column > expend.column)
>> + return NULL;
>
> We occasionally use the convention that
> (column == 0)
> means "the whole line". Probably should detect that case and bail out
> early for it.

Done.
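
The early checks discussed above can be sketched outside of GCC roughly as follows. This is only an illustration: `simple_loc` and `range_is_usable` are hypothetical names standing in for `expanded_location` and the checks at the top of `get_source_text_between`, and the column-0 test is assumed to sit alongside the other sanity checks.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for GCC's expanded_location; it carries only the
// fields that the sanity checks look at.
struct simple_loc
{
  const char *file;  // nullptr when the location has no file
  int line;          // 1-based line number
  int column;        // 1-based column; 0 is the "whole line" convention
};

// Return true if START..FINISH names a range of text that can be extracted.
inline bool
range_is_usable (const simple_loc &start, const simple_loc &finish)
{
  if (!start.file || !finish.file)
    return false;
  // Both ends must be in the same file.
  if (std::strcmp (start.file, finish.file) != 0)
    return false;
  // The end must not come before the start.
  if (start.line > finish.line)
    return false;
  if (start.line == finish.line && start.column > finish.column)
    return false;
  // Column 0 is not a real column number, so give up.
  if (start.column == 0 || finish.column == 0)
    return false;
  return true;
}
```

Checking column 0 before any column arithmetic means later code can subtract 1 from a column without going negative.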
>> +
>> + /* For a single line we need to trim both edges. */
>> + if (expstart.line == expend.line)
>> + {
>> + char_span line = location_get_source_line (expstart.file,
>> expstart.line);
>> + if (line.length () < 1)
>> + return NULL;
>> + int s = expstart.column - 1;
>> + int l = expend.column - s;
>
> Can we please avoid lower-case L "ell" for variable names, as it looks
> so similar to the numeral for one. "len" would be more descriptive
> here.

Done.
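
The single-line case quoted above turns 1-based, inclusive columns into a 0-based offset and a length. A minimal standalone sketch of that arithmetic, using `std::string` instead of `char_span` (the name `slice_columns` and its out-parameter interface are assumptions made for illustration, not part of the patch):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy columns START_COL..FINISH_COL (1-based, inclusive) of LINE into OUT.
// Return false, leaving OUT untouched, if the columns do not fit the line.
inline bool
slice_columns (const std::string &line, int start_col, int finish_col,
               std::string &out)
{
  if (start_col < 1 || finish_col < start_col)
    return false;
  if (line.length () < static_cast<std::size_t> (finish_col))
    return false;
  // Column 1 is offset 0.
  std::size_t offset = static_cast<std::size_t> (start_col) - 1;
  // Inclusive range: finish_col - (start_col - 1) characters.
  std::size_t len = static_cast<std::size_t> (finish_col) - offset;
  out = line.substr (offset, len);
  return true;
}
```

For example, columns 9 through 13 of `assert (x > 0);` select the five characters `x > 0`.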
>> + if (line.length () < (size_t)expend.column)
>> + return NULL;
>> + return line.subspan (s, l).xstrdup ();
>> + }
>> +
>> + struct obstack buf_obstack;
>> + obstack_init (&buf_obstack);
>> +
>> + /* Loop through all lines in the range and append each to buf; may
>> trim
>> + parts of the start and end lines off depending on column
>> values. */
>> + for (int l = expstart.line; l <= expend.line; ++l)
>
> Again, please let's not have a var named "l". Maybe "iter_line" as
> that's what is being done?
>
>> + {
>> + char_span line = location_get_source_line (expstart.file, l);
>> + if (line.length () < 1 && (l != expstart.line && l !=
>> expend.line))
>
> ...especially as I *think* the first comparison is against numeral one,
> whereas comparisons two and three are against lower-case ell, but I'm
> having to squint at the font in my email client to be sure :-/

Done. But also allow me to recommend
https://nodnod.net/posts/inconsolata-dz/

>> + continue;
>> +
>> + /* For the first line in the range, only start at
>> expstart.column */
>> + if (l == expstart.line)
>> + {
>> + if (expstart.column == 0)
>> + return NULL;
>> + if (line.length () < (size_t)expstart.column - 1)
>> + return NULL;
>> + line = line.subspan (expstart.column - 1,
>> + line.length() - expstart.column + 1);
>> + }
>> + /* For the last line, don't go past expend.column */
>> + else if (l == expend.line)
>> + {
>> + if (line.length () < (size_t)expend.column)
>> + return NULL;
>> + line = line.subspan (0, expend.column);
>> + }
>> +
>> + obstack_grow (&buf_obstack, line.get_buffer (), line.length
>> ());
>
> Is this accumulating the trailing newline characters into the
> buf_obstack? I *think* it is, but it seems worth a comment for each of
> the three cases (first line, intermediate line, last line).

It is not; I've added a comment to that effect, and also implemented the
TODO of collapsing a series of whitespace.
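
As a rough standalone model of the joining behavior described here: the lines carry no trailing newline, and a run of leading spaces or tabs on any line after the first is replaced by a single space. The function name `join_collapsing_indent` and the use of `std::vector<std::string>` are assumptions for illustration; the patch itself works on `char_span` and an obstack.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Concatenate LINES, none of which contains a trailing newline.  On every
// line after the first, leading spaces and tabs collapse to one space.
inline std::string
join_collapsing_indent (const std::vector<std::string> &lines)
{
  std::string out;
  for (std::size_t i = 0; i < lines.size (); ++i)
    {
      std::string line = lines[i];
      if (i > 0)
        {
          // Count the leading blanks on this continuation line.
          std::size_t off = 0;
          while (off < line.length ()
                 && (line[off] == ' ' || line[off] == '\t'))
            ++off;
          if (off > 0)
            {
              out += ' ';
              line.erase (0, off);
            }
        }
      out += line;
    }
  return out;
}
```

One consequence of this scheme is that a continuation line with no leading whitespace is appended directly to the previous line, with no separator in between.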
>> + }
>> +
>> + /* NUL-terminate and finish the buf obstack. */
>> + obstack_1grow (&buf_obstack, 0);
>> + const char *buf = (const char *) obstack_finish (&buf_obstack);
>> +
>> + /* TODO should we collapse/trim newlines and runs of spaces? */
>> + return xstrdup (buf);
>> +}
>> +
>
> Do you have test coverage for this from the DejaGnu side? If not, you
> could add selftest coverage for this; see input.cc's
> test_reading_source_line for something similar.

There is test coverage for the output of the contract violation
handler, which involves printing the result of this function.
[-- Attachment #2: 0001-input-add-get_source_text_between.patch --]
[-- Type: text/x-patch, Size: 4458 bytes --]
From 4d8a24574c808f881438d65e8f333f7e152fb217 Mon Sep 17 00:00:00 2001
From: Jeff Chapman II <jchapman@lock3software.com>
Date: Thu, 3 Nov 2022 15:47:47 -0400
Subject: [PATCH] input: add get_source_text_between
To: gcc-patches@gcc.gnu.org

The c++-contracts branch uses this to retrieve the source form of the
contract predicate, to be returned by contract_violation::comment().

Co-authored-by: Jason Merrill <jason@redhat.com>

gcc/ChangeLog:

	* input.cc (get_source_text_between): New fn.
	* input.h (get_source_text_between): Declare.
---
 gcc/input.h  |  1 +
 gcc/input.cc | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+)

diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..f18769950b5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -111,6 +111,7 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..04d0809bfdf 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -949,6 +949,97 @@ location_get_source_line (const char *file_path, int line)
   return char_span (buffer, len);
 }
 
+/* Return a copy of the source text between two locations. The caller is
+   responsible for freeing the return value. */
+
+char *
+get_source_text_between (location_t start, location_t end)
+{
+  expanded_location expstart =
+    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
+  expanded_location expend =
+    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
+
+  /* If the locations are in different files or the end comes before the
+     start, give up and return nothing. */
+  if (!expstart.file || !expend.file)
+    return NULL;
+  if (strcmp (expstart.file, expend.file) != 0)
+    return NULL;
+  if (expstart.line > expend.line)
+    return NULL;
+  if (expstart.line == expend.line
+      && expstart.column > expend.column)
+    return NULL;
+  /* These aren't real column numbers, give up. */
+  if (expstart.column == 0 || expend.column == 0)
+    return NULL;
+
+  /* For a single line we need to trim both edges. */
+  if (expstart.line == expend.line)
+    {
+      char_span line = location_get_source_line (expstart.file, expstart.line);
+      if (line.length () < 1)
+        return NULL;
+      int s = expstart.column - 1;
+      int len = expend.column - s;
+      if (line.length () < (size_t)expend.column)
+        return NULL;
+      return line.subspan (s, len).xstrdup ();
+    }
+
+  struct obstack buf_obstack;
+  obstack_init (&buf_obstack);
+
+  /* Loop through all lines in the range and append each to buf; may trim
+     parts of the start and end lines off depending on column values. */
+  for (int lnum = expstart.line; lnum <= expend.line; ++lnum)
+    {
+      char_span line = location_get_source_line (expstart.file, lnum);
+      if (line.length () < 1 && (lnum != expstart.line && lnum != expend.line))
+        continue;
+
+      /* For the first line in the range, only start at expstart.column */
+      if (lnum == expstart.line)
+        {
+          unsigned off = expstart.column - 1;
+          if (line.length () < off)
+            return NULL;
+          line = line.subspan (off, line.length() - off);
+        }
+      /* For the last line, don't go past expend.column */
+      else if (lnum == expend.line)
+        {
+          if (line.length () < (size_t)expend.column)
+            return NULL;
+          line = line.subspan (0, expend.column);
+        }
+
+      /* Combine spaces at the beginning of later lines. */
+      if (lnum > expstart.line)
+        {
+          unsigned off;
+          for (off = 0; off < line.length(); ++off)
+            if (line[off] != ' ' && line[off] != '\t')
+              break;
+          if (off > 0)
+            {
+              obstack_1grow (&buf_obstack, ' ');
+              line = line.subspan (off, line.length() - off);
+            }
+        }
+
+      /* This does not include any trailing newlines. */
+      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
+    }
+
+  /* NUL-terminate and finish the buf obstack. */
+  obstack_1grow (&buf_obstack, 0);
+  const char *buf = (const char *) obstack_finish (&buf_obstack);
+
+  return xstrdup (buf);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file. */
-- 
2.31.1