public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Jason Merrill <jason@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH RFA] input: add get_source_text_between
Date: Thu, 03 Nov 2022 19:06:36 -0400	[thread overview]
Message-ID: <906b1326bd95c094331f7a5ff46723986215e3cf.camel@redhat.com> (raw)
In-Reply-To: <20221103195902.2114479-1-jason@redhat.com>

On Thu, 2022-11-03 at 15:59 -0400, Jason Merrill via Gcc-patches wrote:
> Tested x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> The c++-contracts branch uses this to retrieve the source form of the
> contract predicate, to be returned by contract_violation::comment().
> 
> gcc/ChangeLog:
> 
>         * input.cc (get_source_text_between): New fn.
>         * input.h (get_source_text_between): Declare.
> ---
>  gcc/input.h  |  1 +
>  gcc/input.cc | 76
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 77 insertions(+)
> 
> diff --git a/gcc/input.h b/gcc/input.h
> index 11c571d076f..f18769950b5 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -111,6 +111,7 @@ class char_span
>  };
>  
>  extern char_span location_get_source_line (const char *file_path,
> int line);
> +extern char *get_source_text_between (location_t, location_t);
>  
>  extern bool location_missing_trailing_newline (const char
> *file_path);
>  
> diff --git a/gcc/input.cc b/gcc/input.cc
> index a28abfac5ac..9b36356338a 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -949,6 +949,82 @@ location_get_source_line (const char *file_path,
> int line)
>    return char_span (buffer, len);
>  }
>  
> +/* Return a copy of the source text between two locations.  The
> caller is
> +   responsible for freeing the return value.  */
> +
> +char *
> +get_source_text_between (location_t start, location_t end)
> +{
> +  expanded_location expstart =
> +    expand_location_to_spelling_point (start,
> LOCATION_ASPECT_START);
> +  expanded_location expend =
> +    expand_location_to_spelling_point (end, LOCATION_ASPECT_FINISH);
> +
> +  /* If the locations are in different files or the end comes before
> the
> +     start, abort and return nothing.  */

I don't like the use of the term "abort" here, as it suggests to me the
use of abort(3).  Maybe "bail out" instead?


> +  if (!expstart.file || !expend.file)
> +    return NULL;
> +  if (strcmp (expstart.file, expend.file) != 0)
> +    return NULL;
> +  if (expstart.line > expend.line)
> +    return NULL;
> +  if (expstart.line == expend.line
> +      && expstart.column > expend.column)
> +    return NULL;

We occasionally use the convention that
  (column == 0)
means "the whole line".  Probably should detect that case and bail out
early for it.


> +
> +  /* For a single line we need to trim both edges.  */
> +  if (expstart.line == expend.line)
> +    {
> +      char_span line = location_get_source_line (expstart.file,
> expstart.line);
> +      if (line.length () < 1)
> +       return NULL;
> +      int s = expstart.column - 1;
> +      int l = expend.column - s;

Can we please avoid lower-case L "ell" for variable names, as it looks
so similar to the numeral for one.  "len" would be more descriptive
here.


> +      if (line.length () < (size_t)expend.column)
> +       return NULL;
> +      return line.subspan (s, l).xstrdup ();
> +    }
> +
> +  struct obstack buf_obstack;
> +  obstack_init (&buf_obstack);
> +
> +  /* Loop through all lines in the range and append each to buf; may
> trim
> +     parts of the start and end lines off depending on column
> values.  */
> +  for (int l = expstart.line; l <= expend.line; ++l)

Again, please let's not have a var named "l".  Maybe "iter_line" as
that's what is being done?

> +    {
> +      char_span line = location_get_source_line (expstart.file, l);
> +      if (line.length () < 1 && (l != expstart.line && l !=
> expend.line))

...especially as I *think* the first comparison is against numeral one,
whereas comparisons two and three are against lower-case ell, but I'm
having to squint at the font in my email client to be sure :-/

> +       continue;
> +
> +      /* For the first line in the range, only start at
> expstart.column */
> +      if (l == expstart.line)
> +       {
> +         if (expstart.column == 0)
> +           return NULL;
> +         if (line.length () < (size_t)expstart.column - 1)
> +           return NULL;
> +         line = line.subspan (expstart.column - 1,
> +                              line.length() - expstart.column + 1);
> +       }
> +      /* For the last line, don't go past expend.column */
> +      else if (l == expend.line)
> +       {
> +         if (line.length () < (size_t)expend.column)
> +           return NULL;
> +         line = line.subspan (0, expend.column);
> +       }
> +
> +      obstack_grow (&buf_obstack, line.get_buffer (), line.length
> ());

Is this accumulating the trailing newline characters into the
buf_obstack?  I *think* it is, but it seems worth a comment for each of
the three cases (first line, intermediate line, last line).

> +    }
> +
> +  /* NUL-terminate and finish the buf obstack.  */
> +  obstack_1grow (&buf_obstack, 0);
> +  const char *buf = (const char *) obstack_finish (&buf_obstack);
> +
> +  /* TODO should we collapse/trim newlines and runs of spaces?  */
> +  return xstrdup (buf);
> +}
> +

Do you have test coverage for this from the DejaGnu side?  If not, you
could add selftest coverage for this; see input.cc's
test_reading_source_line for something similar.

OK for trunk otherwise.

Hope this is helpful
Dave


  reply	other threads:[~2022-11-03 23:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-03 19:59 Jason Merrill
2022-11-03 23:06 ` David Malcolm [this message]
2022-11-04 14:27   ` Jason Merrill
2022-11-04 15:16     ` David Malcolm
2022-11-04 17:06       ` Jason Merrill
2022-11-05  2:00         ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=906b1326bd95c094331f7a5ff46723986215e3cf.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jason@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).