public inbox for gcc-patches@gcc.gnu.org
From: Jason Merrill <jason@redhat.com>
To: David Malcolm <dmalcolm@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH RFA] input: add get_source_text_between
Date: Fri, 4 Nov 2022 10:27:56 -0400
Message-ID: <22136833-3ba8-686b-4eae-a709f2c1780d@redhat.com>
In-Reply-To: <906b1326bd95c094331f7a5ff46723986215e3cf.camel@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 6016 bytes --]

On 11/3/22 19:06, David Malcolm wrote:
> On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote:
>> Tested x86_64-pc-linux-gnu, OK for trunk?
>>
>> -- >8 --
>>
>> The c++-contracts branch uses this to retrieve the source form of the
>> contract predicate, to be returned by contract_violation::comment().
>>
>> gcc/ChangeLog:
>>
>>          * input.cc (get_source_text_between): New fn.
>>          * input.h (get_source_text_between): Declare.
>> ---
>>   gcc/input.h  |  1 +
>>   gcc/input.cc | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 77 insertions(+)
>>
>> diff --git a/gcc/input.h b/gcc/input.h
>> index 11c571d076f..f18769950b5 100644
>> --- a/gcc/input.h
>> +++ b/gcc/input.h
>> @@ -111,6 +111,7 @@ class char_span
>>   };
>>   
>>   extern char_span location_get_source_line (const char *file_path, int line);
>> +extern char *get_source_text_between (location_t, location_t);
>>   
>>   extern bool location_missing_trailing_newline (const char *file_path);
>>   
>> diff --git a/gcc/input.cc b/gcc/input.cc
>> index a28abfac5ac..9b36356338a 100644
>> --- a/gcc/input.cc
>> +++ b/gcc/input.cc
>> @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path, int line)
>>     return char_span (buffer, len);
>>   }
>>   
>> +/* Return a copy of the source text between two locations.  The caller is
>> +   responsible for freeing the return value.  */
>> +
>> +char *
>> +get_source_text_between (location_t start, location_t end)
>> +{
>> +  expanded_location expstart =
>> +    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
>> +  expanded_location expend =
>> +    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
>> +
>> +  /* If the locations are in different files or the end comes before the
>> +     start, abort and return nothing.  */
> 
> I don't like the use of the term "abort" here, as it suggests to me the
> use of abort(3).  Maybe "bail out" instead?

I went with "give up".

>> +  if (!expstart.file || !expend.file)
>> +    return NULL;
>> +  if (strcmp (expstart.file, expend.file) != 0)
>> +    return NULL;
>> +  if (expstart.line > expend.line)
>> +    return NULL;
>> +  if (expstart.line == expend.line
>> +      && expstart.column > expend.column)
>> +    return NULL;
> 
> We occasionally use the convention that
>    (column == 0)
> means "the whole line".  Probably should detect that case and bail out
> early for it.

Done.
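
For readers following along without the tree, the early-exit checks, including
the column == 0 case, can be modelled outside of GCC roughly as below.  This is
a standalone sketch under assumptions: `loc` and `range_is_usable` are made-up
names standing in for expanded_location and for the checks at the top of
get_source_text_between; they are not GCC API.

```cpp
#include <cstring>

/* Hypothetical stand-in for GCC's expanded_location; only the fields
   the checks look at are modelled.  */
struct loc
{
  const char *file;
  int line;
  int column;
};

/* Return true if START..END names a range that text could be extracted
   from.  The order of the tests follows the early returns in the patch.  */
bool
range_is_usable (const loc &start, const loc &end)
{
  if (!start.file || !end.file)
    return false;
  if (strcmp (start.file, end.file) != 0)
    return false;
  if (start.line > end.line)
    return false;
  if (start.line == end.line && start.column > end.column)
    return false;
  /* Column 0 means "the whole line", not a real column; give up.  */
  if (start.column == 0 || end.column == 0)
    return false;
  return true;
}
```

The column test comes last here only to match the patch; since every failing
test returns the same result, the order does not change the outcome.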

>> +
>> +  /* For a single line we need to trim both edges.  */
>> +  if (expstart.line == expend.line)
>> +    {
>> +      char_span line = location_get_source_line (expstart.file, expstart.line);
>> +      if (line.length () < 1)
>> +       return NULL;
>> +      int s = expstart.column - 1;
>> +      int l = expend.column - s;
> 
> Can we please avoid lower-case L "ell" for variable names, as it looks
> so similar to the numeral for one.  "len" would be more descriptive
> here.

Done.

>> +      if (line.length () < (size_t)expend.column)
>> +       return NULL;
>> +      return line.subspan (s, l).xstrdup ();
>> +    }
>> +
>> +  struct obstack buf_obstack;
>> +  obstack_init (&buf_obstack);
>> +
>> +  /* Loop through all lines in the range and append each to buf; may trim
>> +     parts of the start and end lines off depending on column values.  */
>> +  for (int l = expstart.line; l <= expend.line; ++l)
> 
> Again, please let's not have a var named "l".  Maybe "iter_line" as
> that's what is being done?
> 
>> +    {
>> +      char_span line = location_get_source_line (expstart.file, l);
>> +      if (line.length () < 1 && (l != expstart.line && l != expend.line))
> 
> ...especially as I *think* the first comparison is against numeral one,
> whereas comparisons two and three are against lower-case ell, but I'm
> having to squint at the font in my email client to be sure :-/

Done.  But also allow me to recommend

https://nodnod.net/posts/inconsolata-dz/

>> +       continue;
>> +
>> +      /* For the first line in the range, only start at expstart.column */
>> +      if (l == expstart.line)
>> +       {
>> +         if (expstart.column == 0)
>> +           return NULL;
>> +         if (line.length () < (size_t)expstart.column - 1)
>> +           return NULL;
>> +         line = line.subspan (expstart.column - 1,
>> +                              line.length() - expstart.column + 1);
>> +       }
>> +      /* For the last line, don't go past expend.column */
>> +      else if (l == expend.line)
>> +       {
>> +         if (line.length () < (size_t)expend.column)
>> +           return NULL;
>> +         line = line.subspan (0, expend.column);
>> +       }
>> +
>> +      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
> 
> Is this accumulating the trailing newline characters into the
> buf_obstack?  I *think* it is, but it seems worth a comment for each of
> the three cases (first line, intermediate line, last line).

It is not; I've added a comment to that effect, and also implemented the
TODO by collapsing leading whitespace on later lines into a single space.
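
To illustrate the joining behaviour on its own, here is a small standalone
sketch.  It uses std::string rather than char_span and an obstack, and
`join_collapsing_indent` is a hypothetical name, so treat it as a model of the
idea rather than the code in the patch.

```cpp
#include <string>
#include <vector>

/* Join LINES, which carry no trailing newlines, into one string.  On
   every line after the first, a run of leading spaces and tabs is
   replaced by a single space; a line without leading whitespace is
   appended directly.  */
std::string
join_collapsing_indent (const std::vector<std::string> &lines)
{
  std::string out;
  for (size_t i = 0; i < lines.size (); ++i)
    {
      const std::string &line = lines[i];
      size_t off = 0;
      if (i > 0)
        {
          /* Skip the indentation of a later line.  */
          while (off < line.size ()
                 && (line[off] == ' ' || line[off] == '\t'))
            ++off;
          if (off > 0)
            out += ' ';
        }
      out.append (line, off, std::string::npos);
    }
  return out;
}
```

Note that a later line with no indentation is glued straight onto the previous
one, because no newline is kept and no separator is inserted in that case.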

>> +    }
>> +
>> +  /* NUL-terminate and finish the buf obstack.  */
>> +  obstack_1grow (&buf_obstack, 0);
>> +  const char *buf = (const char *) obstack_finish (&buf_obstack);
>> +
>> +  /* TODO should we collapse/trim newlines and runs of spaces?  */
>> +  return xstrdup (buf);
>> +}
>> +
> 
> Do you have test coverage for this from the DejaGnu side?  If not, you
> could add selftest coverage for this; see input.cc's
> test_reading_source_line for something similar.

There is test coverage for the output of the contract violation
handler, which involves printing the result of this function.
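
If direct coverage were wanted, the behaviour to pin down could be modelled
standalone along these lines.  Sketch only: `text_between` is a hypothetical
helper that works on an in-memory vector of lines with 1-based line and column
numbers, not on location_t and the line cache, so it is not a drop-in selftest
for input.cc.

```cpp
#include <string>
#include <vector>

/* Model of the patch's logic over an in-memory file.  LINES holds the
   file's lines without trailing newlines; lines and columns are 1-based
   and the end column is inclusive.  Returns false where the patch
   returns NULL; OUT is only meaningful when the result is true.  */
bool
text_between (const std::vector<std::string> &lines,
              int sline, int scol, int eline, int ecol, std::string &out)
{
  out.clear ();
  if (sline < 1 || eline > (int) lines.size ())
    return false;
  if (sline > eline || (sline == eline && scol > ecol))
    return false;
  /* Column 0 is not a real column.  */
  if (scol < 1 || ecol < 1)
    return false;

  for (int lnum = sline; lnum <= eline; ++lnum)
    {
      std::string line = lines[lnum - 1];
      /* Trim the end first so that a single-line range loses both edges.  */
      if (lnum == eline)
        {
          if (line.size () < (size_t) ecol)
            return false;
          line.erase (ecol);
        }
      if (lnum == sline)
        {
          if (line.size () < (size_t) (scol - 1))
            return false;
          line.erase (0, scol - 1);
        }
      else
        {
          /* Later lines: collapse leading blanks to one space.  */
          size_t off = line.find_first_not_of (" \t");
          if (off == std::string::npos)
            off = line.size ();
          if (off > 0)
            out += ' ';
          line.erase (0, off);
        }
      out += line;
    }
  return true;
}
```

A test built on this would check a single-line range, a range spanning an
indented continuation line, and the give-up cases (column 0, reversed
locations, end column past the end of the line).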

[-- Attachment #2: 0001-input-add-get_source_text_between.patch --]
[-- Type: text/x-patch, Size: 4458 bytes --]

From 4d8a24574c808f881438d65e8f333f7e152fb217 Mon Sep 17 00:00:00 2001
From: Jeff Chapman II <jchapman@lock3software.com>
Date: Thu, 3 Nov 2022 15:47:47 -0400
Subject: [PATCH] input: add get_source_text_between
To: gcc-patches@gcc.gnu.org

The c++-contracts branch uses this to retrieve the source form of the
contract predicate, to be returned by contract_violation::comment().

Co-authored-by: Jason Merrill  <jason@redhat.com>

gcc/ChangeLog:

	* input.cc (get_source_text_between): New fn.
	* input.h (get_source_text_between): Declare.
---
 gcc/input.h  |  1 +
 gcc/input.cc | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+)

diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..f18769950b5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -111,6 +111,7 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..04d0809bfdf 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -949,6 +949,97 @@ location_get_source_line (const char *file_path, int line)
   return char_span (buffer, len);
 }
 
+/* Return a copy of the source text between two locations.  The caller is
+   responsible for freeing the return value.  */
+
+char *
+get_source_text_between (location_t start, location_t end)
+{
+  expanded_location expstart =
+    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
+  expanded_location expend =
+    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
+
+  /* If the locations are in different files or the end comes before the
+     start, give up and return nothing.  */
+  if (!expstart.file || !expend.file)
+    return NULL;
+  if (strcmp (expstart.file, expend.file) != 0)
+    return NULL;
+  if (expstart.line > expend.line)
+    return NULL;
+  if (expstart.line == expend.line
+      && expstart.column > expend.column)
+    return NULL;
+  /* These aren't real column numbers, give up.  */
+  if (expstart.column == 0 || expend.column == 0)
+    return NULL;
+
+  /* For a single line we need to trim both edges.  */
+  if (expstart.line == expend.line)
+    {
+      char_span line = location_get_source_line (expstart.file, expstart.line);
+      if (line.length () < 1)
+	return NULL;
+      int s = expstart.column - 1;
+      int len = expend.column - s;
+      if (line.length () < (size_t)expend.column)
+	return NULL;
+      return line.subspan (s, len).xstrdup ();
+    }
+
+  struct obstack buf_obstack;
+  obstack_init (&buf_obstack);
+
+  /* Loop through all lines in the range and append each to buf; may trim
+     parts of the start and end lines off depending on column values.  */
+  for (int lnum = expstart.line; lnum <= expend.line; ++lnum)
+    {
+      char_span line = location_get_source_line (expstart.file, lnum);
+      if (line.length () < 1 && (lnum != expstart.line && lnum != expend.line))
+	continue;
+
+      /* For the first line in the range, only start at expstart.column */
+      if (lnum == expstart.line)
+	{
+	  unsigned off = expstart.column - 1;
+	  if (line.length () < off)
+	    return NULL;
+	  line = line.subspan (off, line.length() - off);
+	}
+      /* For the last line, don't go past expend.column */
+      else if (lnum == expend.line)
+	{
+	  if (line.length () < (size_t)expend.column)
+	    return NULL;
+	  line = line.subspan (0, expend.column);
+	}
+
+      /* Combine spaces at the beginning of later lines.  */
+      if (lnum > expstart.line)
+	{
+	  unsigned off;
+	  for (off = 0; off < line.length(); ++off)
+	    if (line[off] != ' ' && line[off] != '\t')
+	      break;
+	  if (off > 0)
+	    {
+	      obstack_1grow (&buf_obstack, ' ');
+	      line = line.subspan (off, line.length() - off);
+	    }
+	}
+
+      /* This does not include any trailing newlines.  */
+      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
+    }
+
+  /* NUL-terminate and finish the buf obstack.  */
+  obstack_1grow (&buf_obstack, 0);
+  const char *buf = (const char *) obstack_finish (&buf_obstack);
+
+  return xstrdup (buf);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file.  */
-- 
2.31.1


Thread overview: 6+ messages
2022-11-03 19:59 Jason Merrill
2022-11-03 23:06 ` David Malcolm
2022-11-04 14:27   ` Jason Merrill [this message]
2022-11-04 15:16     ` David Malcolm
2022-11-04 17:06       ` Jason Merrill
2022-11-05  2:00         ` David Malcolm
