public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH RFA] input: add get_source_text_between
@ 2022-11-03 19:59 Jason Merrill
  2022-11-03 23:06 ` David Malcolm
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Merrill @ 2022-11-03 19:59 UTC (permalink / raw)
  To: gcc-patches

Tested x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The c++-contracts branch uses this to retrieve the source form of the
contract predicate, to be returned by contract_violation::comment().

gcc/ChangeLog:

	* input.cc (get_source_text_between): New fn.
	* input.h (get_source_text_between): Declare.
---
 gcc/input.h  |  1 +
 gcc/input.cc | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..f18769950b5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -111,6 +111,7 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..9b36356338a 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -949,6 +949,82 @@ location_get_source_line (const char *file_path, int line)
   return char_span (buffer, len);
 }
 
+/* Return a copy of the source text between two locations.  The caller is
+   responsible for freeing the return value.  */
+
+char *
+get_source_text_between (location_t start, location_t end)
+{
+  expanded_location expstart =
+    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
+  expanded_location expend =
+    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
+
+  /* If the locations are in different files or the end comes before the
+     start, abort and return nothing.  */
+  if (!expstart.file || !expend.file)
+    return NULL;
+  if (strcmp (expstart.file, expend.file) != 0)
+    return NULL;
+  if (expstart.line > expend.line)
+    return NULL;
+  if (expstart.line == expend.line
+      && expstart.column > expend.column)
+    return NULL;
+
+  /* For a single line we need to trim both edges.  */
+  if (expstart.line == expend.line)
+    {
+      char_span line = location_get_source_line (expstart.file, expstart.line);
+      if (line.length () < 1)
+	return NULL;
+      int s = expstart.column - 1;
+      int l = expend.column - s;
+      if (line.length () < (size_t)expend.column)
+	return NULL;
+      return line.subspan (s, l).xstrdup ();
+    }
+
+  struct obstack buf_obstack;
+  obstack_init (&buf_obstack);
+
+  /* Loop through all lines in the range and append each to buf; may trim
+     parts of the start and end lines off depending on column values.  */
+  for (int l = expstart.line; l <= expend.line; ++l)
+    {
+      char_span line = location_get_source_line (expstart.file, l);
+      if (line.length () < 1 && (l != expstart.line && l != expend.line))
+	continue;
+
+      /* For the first line in the range, only start at expstart.column */
+      if (l == expstart.line)
+	{
+	  if (expstart.column == 0)
+	    return NULL;
+	  if (line.length () < (size_t)expstart.column - 1)
+	    return NULL;
+	  line = line.subspan (expstart.column - 1,
+			       line.length() - expstart.column + 1);
+	}
+      /* For the last line, don't go past expend.column */
+      else if (l == expend.line)
+	{
+	  if (line.length () < (size_t)expend.column)
+	    return NULL;
+	  line = line.subspan (0, expend.column);
+	}
+
+      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
+    }
+
+  /* NUL-terminate and finish the buf obstack.  */
+  obstack_1grow (&buf_obstack, 0);
+  const char *buf = (const char *) obstack_finish (&buf_obstack);
+
+  /* TODO should we collapse/trim newlines and runs of spaces?  */
+  return xstrdup (buf);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file.  */

base-commit: a4cd2389276a30c39034a83d640ce68fa407bac1
prerequisite-patch-id: 329bc16a88dc9a3b13cd3fcecb3678826cc592dc
prerequisite-patch-id: 49e922c10f6da687d9da3f6a0fd20f324bd352d6
-- 
2.31.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFA] input: add get_source_text_between
  2022-11-03 19:59 [PATCH RFA] input: add get_source_text_between Jason Merrill
@ 2022-11-03 23:06 ` David Malcolm
  2022-11-04 14:27   ` Jason Merrill
  0 siblings, 1 reply; 6+ messages in thread
From: David Malcolm @ 2022-11-03 23:06 UTC (permalink / raw)
  To: Jason Merrill, gcc-patches

On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote:
> Tested x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> The c++-contracts branch uses this to retrieve the source form of the
> contract predicate, to be returned by contract_violation::comment().
> 
> gcc/ChangeLog:
> 
>         * input.cc (get_source_text_between): New fn.
>         * input.h (get_source_text_between): Declare.
> ---
>  gcc/input.h  |  1 +
>  gcc/input.cc | 76
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 77 insertions(+)
> 
> diff --git a/gcc/input.h b/gcc/input.h
> index 11c571d076f..f18769950b5 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -111,6 +111,7 @@ class char_span
>  };
>  
>  extern char_span location_get_source_line (const char *file_path,
> int line);
> +extern char *get_source_text_between (location_t, location_t);
>  
>  extern bool location_missing_trailing_newline (const char
> *file_path);
>  
> diff --git a/gcc/input.cc b/gcc/input.cc
> index a28abfac5ac..9b36356338a 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path,
> int line)
>    return char_span (buffer, len);
>  }
>  
> +/* Return a copy of the source text between two locations.  The
> caller is
> +   responsible for freeing the return value.  */
> +
> +char *
> +get_source_text_between (location_t start, location_t end)
> +{
> +  expanded_location expstart =
> +    expand_location_to_spelling_point (start,
> LOCATION_ASPECT_START);
> +  expanded_location expend =
> +    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
> +
> +  /* If the locations are in different files or the end comes before
> the
> +     start, abort and return nothing.  */

I don't like the use of the term "abort" here, as it suggests to me the
use of abort(3).  Maybe "bail out" instead?


> +  if (!expstart.file || !expend.file)
> +    return NULL;
> +  if (strcmp (expstart.file, expend.file) != 0)
> +    return NULL;
> +  if (expstart.line > expend.line)
> +    return NULL;
> +  if (expstart.line == expend.line
> +      && expstart.column > expend.column)
> +    return NULL;

We occasionally use the convention that
  (column == 0)
means "the whole line".  Probably should detect that case and bail out
early for it.


> +
> +  /* For a single line we need to trim both edges.  */
> +  if (expstart.line == expend.line)
> +    {
> +      char_span line = location_get_source_line (expstart.file,
> expstart.line);
> +      if (line.length () < 1)
> +       return NULL;
> +      int s = expstart.column - 1;
> +      int l = expend.column - s;

Can we please avoid lower-case L "ell" for variable names, as it looks
so similar to the numeral for one.  "len" would be more descriptive
here.


> +      if (line.length () < (size_t)expend.column)
> +       return NULL;
> +      return line.subspan (s, l).xstrdup ();
> +    }
> +
> +  struct obstack buf_obstack;
> +  obstack_init (&buf_obstack);
> +
> +  /* Loop through all lines in the range and append each to buf; may
> trim
> +     parts of the start and end lines off depending on column
> values.  */
> +  for (int l = expstart.line; l <= expend.line; ++l)

Again, please let's not have a var named "l".  Maybe "iter_line" as
that's what is being done?

> +    {
> +      char_span line = location_get_source_line (expstart.file, l);
> +      if (line.length () < 1 && (l != expstart.line && l !=
> expend.line))

...especially as I *think* the first comparison is against numeral one,
whereas comparisons two and three are against lower-case ell, but I'm
having to squint at the font in my email client to be sure :-/

> +       continue;
> +
> +      /* For the first line in the range, only start at
> expstart.column */
> +      if (l == expstart.line)
> +       {
> +         if (expstart.column == 0)
> +           return NULL;
> +         if (line.length () < (size_t)expstart.column - 1)
> +           return NULL;
> +         line = line.subspan (expstart.column - 1,
> +                              line.length() - expstart.column + 1);
> +       }
> +      /* For the last line, don't go past expend.column */
> +      else if (l == expend.line)
> +       {
> +         if (line.length () < (size_t)expend.column)
> +           return NULL;
> +         line = line.subspan (0, expend.column);
> +       }
> +
> +      obstack_grow (&buf_obstack, line.get_buffer (), line.length
> ());

Is this accumulating the trailing newline characters into the
buf_obstack?  I *think* it is, but it seems worth a comment for each of
the three cases (first line, intermediate line, last line).

> +    }
> +
> +  /* NUL-terminate and finish the buf obstack.  */
> +  obstack_1grow (&buf_obstack, 0);
> +  const char *buf = (const char *) obstack_finish (&buf_obstack);
> +
> +  /* TODO should we collapse/trim newlines and runs of spaces?  */
> +  return xstrdup (buf);
> +}
> +

Do you have test coverage for this from the DejaGnu side?  If not, you
could add selftest coverage for this; see input.cc's
test_reading_source_line for something similar.

OK for trunk otherwise.

Hope this is helpful
Dave


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFA] input: add get_source_text_between
  2022-11-03 23:06 ` David Malcolm
@ 2022-11-04 14:27   ` Jason Merrill
  2022-11-04 15:16     ` David Malcolm
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Merrill @ 2022-11-04 14:27 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 6016 bytes --]

On 11/3/22 19:06, David Malcolm wrote:
> On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote:
>> Tested x86_64-pc-linux-gnu, OK for trunk?
>>
>> -- >8 --
>>
>> The c++-contracts branch uses this to retrieve the source form of the
>> contract predicate, to be returned by contract_violation::comment().
>>
>> gcc/ChangeLog:
>>
>>          * input.cc (get_source_text_between): New fn.
>>          * input.h (get_source_text_between): Declare.
>> ---
>>   gcc/input.h  |  1 +
>>   gcc/input.cc | 76
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 77 insertions(+)
>>
>> diff --git a/gcc/input.h b/gcc/input.h
>> index 11c571d076f..f18769950b5 100644
>> --- a/gcc/input.h
>> +++ b/gcc/input.h
>> @@ -111,6 +111,7 @@ class char_span
>>   };
>>   
>>   extern char_span location_get_source_line (const char *file_path,
>> int line);
>> +extern char *get_source_text_between (location_t, location_t);
>>   
>>   extern bool location_missing_trailing_newline (const char
>> *file_path);
>>   
>> diff --git a/gcc/input.cc b/gcc/input.cc
>> index a28abfac5ac..9b36356338a 100644
>> --- a/gcc/input.cc
>> +++ b/gcc/input.cc
>> @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path,
>> int line)
>>     return char_span (buffer, len);
>>   }
>>   
>> +/* Return a copy of the source text between two locations.  The
>> caller is
>> +   responsible for freeing the return value.  */
>> +
>> +char *
>> +get_source_text_between (location_t start, location_t end)
>> +{
>> +  expanded_location expstart =
>> +    expand_location_to_spelling_point (start,
>> LOCATION_ASPECT_START);
>> +  expanded_location expend =
>> +    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
>> +
>> +  /* If the locations are in different files or the end comes before
>> the
>> +     start, abort and return nothing.  */
> 
> I don't like the use of the term "abort" here, as it suggests to me the
> use of abort(3).  Maybe "bail out" instead?

I went with "give up".

>> +  if (!expstart.file || !expend.file)
>> +    return NULL;
>> +  if (strcmp (expstart.file, expend.file) != 0)
>> +    return NULL;
>> +  if (expstart.line > expend.line)
>> +    return NULL;
>> +  if (expstart.line == expend.line
>> +      && expstart.column > expend.column)
>> +    return NULL;
> 
> We occasionally use the convention that
>    (column == 0)
> means "the whole line".  Probably should detect that case and bail out
> early for it.

Done.

>> +
>> +  /* For a single line we need to trim both edges.  */
>> +  if (expstart.line == expend.line)
>> +    {
>> +      char_span line = location_get_source_line (expstart.file,
>> expstart.line);
>> +      if (line.length () < 1)
>> +       return NULL;
>> +      int s = expstart.column - 1;
>> +      int l = expend.column - s;
> 
> Can we please avoid lower-case L "ell" for variable names, as it looks
> so similar to the numeral for one.  "len" would be more descriptive
> here.

Done.

>> +      if (line.length () < (size_t)expend.column)
>> +       return NULL;
>> +      return line.subspan (s, l).xstrdup ();
>> +    }
>> +
>> +  struct obstack buf_obstack;
>> +  obstack_init (&buf_obstack);
>> +
>> +  /* Loop through all lines in the range and append each to buf; may
>> trim
>> +     parts of the start and end lines off depending on column
>> values.  */
>> +  for (int l = expstart.line; l <= expend.line; ++l)
> 
> Again, please let's not have a var named "l".  Maybe "iter_line" as
> that's what is being done?
> 
>> +    {
>> +      char_span line = location_get_source_line (expstart.file, l);
>> +      if (line.length () < 1 && (l != expstart.line && l !=
>> expend.line))
> 
> ...especially as I *think* the first comparison is against numeral one,
> whereas comparisons two and three are against lower-case ell, but I'm
> having to squint at the font in my email client to be sure :-/

Done.  But also allow me to recommend

https://nodnod.net/posts/inconsolata-dz/

>> +       continue;
>> +
>> +      /* For the first line in the range, only start at
>> expstart.column */
>> +      if (l == expstart.line)
>> +       {
>> +         if (expstart.column == 0)
>> +           return NULL;
>> +         if (line.length () < (size_t)expstart.column - 1)
>> +           return NULL;
>> +         line = line.subspan (expstart.column - 1,
>> +                              line.length() - expstart.column + 1);
>> +       }
>> +      /* For the last line, don't go past expend.column */
>> +      else if (l == expend.line)
>> +       {
>> +         if (line.length () < (size_t)expend.column)
>> +           return NULL;
>> +         line = line.subspan (0, expend.column);
>> +       }
>> +
>> +      obstack_grow (&buf_obstack, line.get_buffer (), line.length
>> ());
> 
> Is this accumulating the trailing newline characters into the
> buf_obstack?  I *think* it is, but it seems worth a comment for each of
> the three cases (first line, intermediate line, last line).

It is not; I've added a comment to that effect, and also implemented the 
TODO of collapsing a series of whitespace.

>> +    }
>> +
>> +  /* NUL-terminate and finish the buf obstack.  */
>> +  obstack_1grow (&buf_obstack, 0);
>> +  const char *buf = (const char *) obstack_finish (&buf_obstack);
>> +
>> +  /* TODO should we collapse/trim newlines and runs of spaces?  */
>> +  return xstrdup (buf);
>> +}
>> +
> 
> Do you have test coverage for this from the DejaGnu side?  If not, you
> could add selftest coverage for this; see input.cc's
> test_reading_source_line for something similar.

There is test coverage for the output of the the contract violation 
handler, which involves printing the result of this function.

[-- Attachment #2: 0001-input-add-get_source_text_between.patch --]
[-- Type: text/x-patch, Size: 4458 bytes --]

From 4d8a24574c808f881438d65e8f333f7e152fb217 Mon Sep 17 00:00:00 2001
From: Jeff Chapman II <jchapman@lock3software.com>
Date: Thu, 3 Nov 2022 15:47:47 -0400
Subject: [PATCH] input: add get_source_text_between
To: gcc-patches@gcc.gnu.org

The c++-contracts branch uses this to retrieve the source form of the
contract predicate, to be returned by contract_violation::comment().

Co-authored-by: Jason Merrill  <jason@redhat.com>

gcc/ChangeLog:

	* input.cc (get_source_text_between): New fn.
	* input.h (get_source_text_between): Declare.
---
 gcc/input.h  |  1 +
 gcc/input.cc | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+)

diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..f18769950b5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -111,6 +111,7 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..04d0809bfdf 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -949,6 +949,97 @@ location_get_source_line (const char *file_path, int line)
   return char_span (buffer, len);
 }
 
+/* Return a copy of the source text between two locations.  The caller is
+   responsible for freeing the return value.  */
+
+char *
+get_source_text_between (location_t start, location_t end)
+{
+  expanded_location expstart =
+    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
+  expanded_location expend =
+    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
+
+  /* If the locations are in different files or the end comes before the
+     start, give up and return nothing.  */
+  if (!expstart.file || !expend.file)
+    return NULL;
+  if (strcmp (expstart.file, expend.file) != 0)
+    return NULL;
+  if (expstart.line > expend.line)
+    return NULL;
+  if (expstart.line == expend.line
+      && expstart.column > expend.column)
+    return NULL;
+  /* These aren't real column numbers, give up.  */
+  if (expstart.column == 0 || expend.column == 0)
+    return NULL;
+
+  /* For a single line we need to trim both edges.  */
+  if (expstart.line == expend.line)
+    {
+      char_span line = location_get_source_line (expstart.file, expstart.line);
+      if (line.length () < 1)
+	return NULL;
+      int s = expstart.column - 1;
+      int len = expend.column - s;
+      if (line.length () < (size_t)expend.column)
+	return NULL;
+      return line.subspan (s, len).xstrdup ();
+    }
+
+  struct obstack buf_obstack;
+  obstack_init (&buf_obstack);
+
+  /* Loop through all lines in the range and append each to buf; may trim
+     parts of the start and end lines off depending on column values.  */
+  for (int lnum = expstart.line; lnum <= expend.line; ++lnum)
+    {
+      char_span line = location_get_source_line (expstart.file, lnum);
+      if (line.length () < 1 && (lnum != expstart.line && lnum != expend.line))
+	continue;
+
+      /* For the first line in the range, only start at expstart.column */
+      if (lnum == expstart.line)
+	{
+	  unsigned off = expstart.column - 1;
+	  if (line.length () < off)
+	    return NULL;
+	  line = line.subspan (off, line.length() - off);
+	}
+      /* For the last line, don't go past expend.column */
+      else if (lnum == expend.line)
+	{
+	  if (line.length () < (size_t)expend.column)
+	    return NULL;
+	  line = line.subspan (0, expend.column);
+	}
+
+      /* Combine spaces at the beginning of later lines.  */
+      if (lnum > expstart.line)
+	{
+	  unsigned off;
+	  for (off = 0; off < line.length(); ++off)
+	    if (line[off] != ' ' && line[off] != '\t')
+	      break;
+	  if (off > 0)
+	    {
+	      obstack_1grow (&buf_obstack, ' ');
+	      line = line.subspan (off, line.length() - off);
+	    }
+	}
+
+      /* This does not include any trailing newlines.  */
+      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
+    }
+
+  /* NUL-terminate and finish the buf obstack.  */
+  obstack_1grow (&buf_obstack, 0);
+  const char *buf = (const char *) obstack_finish (&buf_obstack);
+
+  return xstrdup (buf);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file.  */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFA] input: add get_source_text_between
  2022-11-04 14:27   ` Jason Merrill
@ 2022-11-04 15:16     ` David Malcolm
  2022-11-04 17:06       ` Jason Merrill
  0 siblings, 1 reply; 6+ messages in thread
From: David Malcolm @ 2022-11-04 15:16 UTC (permalink / raw)
  To: Jason Merrill, gcc-patches

On Fri, 2022-11-04 at 10:27 -0400, Jason Merrill wrote:
> On 11/3/22 19:06, David Malcolm wrote:
> > On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches
> > wrote:

[...snip...]

> > 
> > 
> > Do you have test coverage for this from the DejaGnu side?  If not,
> > you
> > could add selftest coverage for this; see input.cc's
> > test_reading_source_line for something similar.
> 
> There is test coverage for the output of the the contract violation 
> handler, which involves printing the result of this function.

Thanks.   Is this test posted somwehere?  I was looking in:
 https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604974.html
but I'm not seeing it.  Sorry if I'm missing something here.

Ideally we should have coverage for the three cases of:
(a) bails out and returns NULL
(b) single-line case
(c) multi-line case


> index a28abfac5ac..04d0809bfdf 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -949,6 +949,97 @@ location_get_source_line (const char *file_path, int line)
>    return char_span (buffer, len);
>  }

Strings in input.cc are not always NUL-terminated, so...

>  
> +/* Return a copy of the source text between two locations.  The caller is
> +   responsible for freeing the return value.  */

...please note in the comment that if non-NULL, the copy is NUL-
terminated (I checked both exit paths that can return non-NULL, and
they do NUL-terminate their buffers).

OK with that nit fixed.

Thanks
Dave


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFA] input: add get_source_text_between
  2022-11-04 15:16     ` David Malcolm
@ 2022-11-04 17:06       ` Jason Merrill
  2022-11-05  2:00         ` David Malcolm
  0 siblings, 1 reply; 6+ messages in thread
From: Jason Merrill @ 2022-11-04 17:06 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1777 bytes --]

On 11/4/22 11:16, David Malcolm wrote:
> On Fri, 2022-11-04 at 10:27 -0400, Jason Merrill wrote:
>> On 11/3/22 19:06, David Malcolm wrote:
>>> On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches
>>> wrote:
> 
> [...snip...]
> 
>>>
>>>
>>> Do you have test coverage for this from the DejaGnu side?  If not,
>>> you
>>> could add selftest coverage for this; see input.cc's
>>> test_reading_source_line for something similar.
>>
>> There is test coverage for the output of the the contract violation
>> handler, which involves printing the result of this function.
> 
> Thanks.   Is this test posted somwehere?  I was looking in:
>   https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604974.html
> but I'm not seeing it.  Sorry if I'm missing something here.

The tests are in the newer message

   https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605072.html

> Ideally we should have coverage for the three cases of:
> (a) bails out and returns NULL

This isn't tested because it should never happen.

> (b) single-line case

contracts-tmpl-spec3.C etc.

> (c) multi-line case

contracts-multiline1.C

>> index a28abfac5ac..04d0809bfdf 100644
>> --- a/gcc/input.cc
>> +++ b/gcc/input.cc
>> @@ -949,6 +949,97 @@ location_get_source_line (const char *file_path, int line)
>>     return char_span (buffer, len);
>>   }
> 
> Strings in input.cc are not always NUL-terminated, so...
> 
>>   
>> +/* Return a copy of the source text between two locations.  The caller is
>> +   responsible for freeing the return value.  */
> 
> ...please note in the comment that if non-NULL, the copy is NUL-
> terminated (I checked both exit paths that can return non-NULL, and
> they do NUL-terminate their buffers).
> 
> OK with that nit fixed.

Thanks, pushing this:

[-- Attachment #2: 0001-input-add-get_source_text_between.patch --]
[-- Type: text/x-patch, Size: 4515 bytes --]

From 35d436c8e9455e99015d0e414a4ede9e59e774c4 Mon Sep 17 00:00:00 2001
From: Jeff Chapman II <jchapman@lock3software.com>
Date: Thu, 3 Nov 2022 15:47:47 -0400
Subject: [PATCH] input: add get_source_text_between
To: gcc-patches@gcc.gnu.org

The c++-contracts branch uses this to retrieve the source form of the
contract predicate, to be returned by contract_violation::comment().

Co-authored-by: Jason Merrill  <jason@redhat.com>

gcc/ChangeLog:

	* input.cc (get_source_text_between): New fn.
	* input.h (get_source_text_between): Declare.
---
 gcc/input.h  |  1 +
 gcc/input.cc | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..f18769950b5 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -111,6 +111,7 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..c185bd74c1f 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -949,6 +949,98 @@ location_get_source_line (const char *file_path, int line)
   return char_span (buffer, len);
 }
 
+/* Return a NUL-terminated copy of the source text between two locations, or
+   NULL if the arguments are invalid.  The caller is responsible for freeing
+   the return value.  */
+
+char *
+get_source_text_between (location_t start, location_t end)
+{
+  expanded_location expstart =
+    expand_location_to_spelling_point (start, LOCATION_ASPECT_START);
+  expanded_location expend =
+    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
+
+  /* If the locations are in different files or the end comes before the
+     start, give up and return nothing.  */
+  if (!expstart.file || !expend.file)
+    return NULL;
+  if (strcmp (expstart.file, expend.file) != 0)
+    return NULL;
+  if (expstart.line > expend.line)
+    return NULL;
+  if (expstart.line == expend.line
+      && expstart.column > expend.column)
+    return NULL;
+  /* These aren't real column numbers, give up.  */
+  if (expstart.column == 0 || expend.column == 0)
+    return NULL;
+
+  /* For a single line we need to trim both edges.  */
+  if (expstart.line == expend.line)
+    {
+      char_span line = location_get_source_line (expstart.file, expstart.line);
+      if (line.length () < 1)
+	return NULL;
+      int s = expstart.column - 1;
+      int len = expend.column - s;
+      if (line.length () < (size_t)expend.column)
+	return NULL;
+      return line.subspan (s, len).xstrdup ();
+    }
+
+  struct obstack buf_obstack;
+  obstack_init (&buf_obstack);
+
+  /* Loop through all lines in the range and append each to buf; may trim
+     parts of the start and end lines off depending on column values.  */
+  for (int lnum = expstart.line; lnum <= expend.line; ++lnum)
+    {
+      char_span line = location_get_source_line (expstart.file, lnum);
+      if (line.length () < 1 && (lnum != expstart.line && lnum != expend.line))
+	continue;
+
+      /* For the first line in the range, only start at expstart.column */
+      if (lnum == expstart.line)
+	{
+	  unsigned off = expstart.column - 1;
+	  if (line.length () < off)
+	    return NULL;
+	  line = line.subspan (off, line.length() - off);
+	}
+      /* For the last line, don't go past expend.column */
+      else if (lnum == expend.line)
+	{
+	  if (line.length () < (size_t)expend.column)
+	    return NULL;
+	  line = line.subspan (0, expend.column);
+	}
+
+      /* Combine spaces at the beginning of later lines.  */
+      if (lnum > expstart.line)
+	{
+	  unsigned off;
+	  for (off = 0; off < line.length(); ++off)
+	    if (line[off] != ' ' && line[off] != '\t')
+	      break;
+	  if (off > 0)
+	    {
+	      obstack_1grow (&buf_obstack, ' ');
+	      line = line.subspan (off, line.length() - off);
+	    }
+	}
+
+      /* This does not include any trailing newlines.  */
+      obstack_grow (&buf_obstack, line.get_buffer (), line.length ());
+    }
+
+  /* NUL-terminate and finish the buf obstack.  */
+  obstack_1grow (&buf_obstack, 0);
+  const char *buf = (const char *) obstack_finish (&buf_obstack);
+
+  return xstrdup (buf);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file.  */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH RFA] input: add get_source_text_between
  2022-11-04 17:06       ` Jason Merrill
@ 2022-11-05  2:00         ` David Malcolm
  0 siblings, 0 replies; 6+ messages in thread
From: David Malcolm @ 2022-11-05  2:00 UTC (permalink / raw)
  To: Jason Merrill, gcc-patches

On Fri, 2022-11-04 at 13:06 -0400, Jason Merrill wrote:
> On 11/4/22 11:16, David Malcolm wrote:
> > On Fri, 2022-11-04 at 10:27 -0400, Jason Merrill wrote:
> > > On 11/3/22 19:06, David Malcolm wrote:
> > > > On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-
> > > > patches
> > > > wrote:
> > 
> > [...snip...]
> > 
> > > > 
> > > > 
> > > > Do you have test coverage for this from the DejaGnu side?  If
> > > > not,
> > > > you
> > > > could add selftest coverage for this; see input.cc's
> > > > test_reading_source_line for something similar.
> > > 
> > > There is test coverage for the output of the the contract
> > > violation
> > > handler, which involves printing the result of this function.
> > 
> > Thanks.   Is this test posted somwehere?  I was looking in:
> >  
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604974.html
> > but I'm not seeing it.  Sorry if I'm missing something here.
> 
> The tests are in the newer message
> 
>   
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605072.html
> 
> > Ideally we should have coverage for the three cases of:
> > (a) bails out and returns NULL
> 
> This isn't tested because it should never happen.

I'm guessing someone will run into it in the wild at some point.  With
very large files we eventually give up on tracking column numbers, and
so we get location_t values with column number == 0, and FWIW
  gcc/testsuite/gcc.dg/plugin/location_overflow_plugin.c
makes it feasible to test such cases.

But obviously that's low priority.

Dave

> 
> > (b) single-line case
> 
> contracts-tmpl-spec3.C etc.
> 
> > (c) multi-line case
> 
> contracts-multiline1.C
> 
> > > index a28abfac5ac..04d0809bfdf 100644
> > > --- a/gcc/input.cc
> > > +++ b/gcc/input.cc
> > > @@ -949,6 +949,97 @@ location_get_source_line (const char
> > > *file_path, int line)
> > >     return char_span (buffer, len);
> > >   }
> > 
> > Strings in input.cc are not always NUL-terminated, so...
> > 
> > >   
> > > +/* Return a copy of the source text between two locations.  The
> > > caller is
> > > +   responsible for freeing the return value.  */
> > 
> > ...please note in the comment that if non-NULL, the copy is NUL-
> > terminated (I checked both exit paths that can return non-NULL, and
> > they do NUL-terminate their buffers).
> > 
> > OK with that nit fixed.
> 
> Thanks, pushing this:


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-11-05  2:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-03 19:59 [PATCH RFA] input: add get_source_text_between Jason Merrill
2022-11-03 23:06 ` David Malcolm
2022-11-04 14:27   ` Jason Merrill
2022-11-04 15:16     ` David Malcolm
2022-11-04 17:06       ` Jason Merrill
2022-11-05  2:00         ` David Malcolm

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).