public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Lewis Hyatt <lhyatt@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
Date: Fri, 28 Jul 2023 18:58:54 -0400	[thread overview]
Message-ID: <7b672a949cd1686d5c0a998cf88e6bbfed4f0cde.camel@redhat.com> (raw)
In-Reply-To: <20230721230851.1981434-2-lhyatt@gmail.com>

On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> Add a new linemap reason LC_GEN which enables encoding the location
> of data
> that was generated during compilation and does not appear in any
> source file.
> There could be many use cases, such as, for instance, referring to
> the content
> of builtin macros (not yet implemented, but an easy lift after this
> one.) The
> first intended application is to create a place to store the input to
> a
> _Pragma directive, so that proper locations can be assigned to those
> tokens. This will be done in a subsequent commit.
> 
> The actual change needed to the line-maps API in libcpp is not too
> large and
> requires no space overhead in the line map data structures (on 64-bit
> systems
> that is; one newly added data member to class line_map_ordinary sits
> inside
> former padding bytes.) An LC_GEN map is just an ordinary map like any
> other,
> but the TO_FILE member that normally points to the file name points
> instead to
> the actual data.  This works automatically with PCH as well, for the
> same
> reason that the file name makes its way into a PCH.  In order to
> avoid
> confusion, the member has been renamed from TO_FILE to DATA, and
> associated
> accessors adjusted.
> 
> Outside libcpp, there are many small changes but most of them are to
> selftests, which are necessarily more sensitive to implementation
> details. From the perspective of the user (the "user", here, being a
> frontend
> using line maps or else the diagnostics infrastructure), the chief
> visible
> change is that the function location_get_source_line() should be
> passed an
> expanded_location object instead of a separate filename and line
> number.  This
> is not a big change because in most cases, this information came
> anyway from a
> call to expand_location and the needed expanded_location object is
> readily
> available. The new overload of location_get_source_line() uses the
> extra
> information in the expanded_location object to obtain the data from
> the
> in-memory buffer when it originated from an LC_GEN map.
> 
> Until the subsequent patch that starts using LC_GEN maps, none are
> yet
> generated within GCC, hence nothing is added to the testsuite here;
> but all
> relevant selftests have been extended to cover generated data maps in
> addition
> to normal files.

[..snip...]

Thanks for the updated patch.

Reading this patch, it felt a bit unnatural to me to have an
  (exploded location, source line) 
pair where the exploded location seems to be representing "which source
file or generated buffer", but the line/column info in that
exploded_location is to be ignored in favor of the 2nd source line.

I think we're missing a class: something that identifies either a
specific source file, or a specific generated buffer.

How about something like either:

class source_id
{
public:
  source_id (const char *filename)
  : m_filename_or_buffer (filename),
    m_len (0)
  {
  }

  explicit source_id (const char *buffer, unsigned buffer_len)
  : m_filename_or_buffer (buffer),
    m_len (buffer_len)
  {
    linemap_assert (buffer_len > 0);
  }

private:
  const char *m_filename_or_buffer;
  unsigned m_len;  // where 0 means "it's a filename"
};

or:

class source_id
{
public:
  source_id (const char *filename)
  : m_ptr (filename),
    m_is_buffer (false)
  {
  }

  explicit source_id (const linemap_ordinary *buffer_linemap)
  : m_ptr (buffer_linemap),
    m_is_buffer (true)
  {
  }

private:
  const void *m_ptr;
  bool m_is_buffer;
};

and use one of these "source_id file" in place of "const char *file",
rather than replacing such things with expanded_location?

> diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
> index e8d3dece770..4164fa0b1ba 100644
> --- a/gcc/c-family/c-indentation.cc
> +++ b/gcc/c-family/c-indentation.cc
> @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
>  		   unsigned int *first_nws,
>  		   unsigned int tab_width)
>  {
> -  char_span line = location_get_source_line (exploc.file, exploc.line);
> +  char_span line = location_get_source_line (exploc);

...so this might contine to be:

  char_span line = location_get_source_line (exploc.file, exploc.line);

...but expanded_location's "file" field would become a source_id,
rather than a const char *.  It looks like doing do might make a lot of
"is this the same file or buffer?"  turn into comparisons of source_id
instances.

So I think expanded_location would become:

typedef struct
{
  /* Either the name of the source file involved, or the
     specific generated buffer.  */
  source_id file;

  /* The line-location in the source file.  */
  int line;

  int column;

  void *data;

  /* In a system header?. */
  bool sysp;
} expanded_location;

and we wouldn't need to add these extra fields:

> +
> +  /* If generated data, the data and its length.  The data may contain embedded
> +   nulls and need not be null-terminated.  */
> +  unsigned int generated_data_len;
> +  const char *generated_data;
>  } expanded_location;

and we could pass around source_id instances when identifying specific
filenames/generated buffers.
 
Does this idea simplify/clarify the patch, or make it more complicated?

[...snip...]

Thoughts?
Dave


  reply	other threads:[~2023-07-28 22:58 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-21 23:08 [PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2023-07-28 22:58   ` David Malcolm [this message]
2023-07-31 22:39     ` Lewis Hyatt
2023-08-09 22:14       ` [PATCH v4 0/8] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 1/8] libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2023-08-11 22:45           ` David Malcolm
2023-08-13 20:18             ` Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 2/8] libcpp: diagnostics: Support generated data in expanded locations Lewis Hyatt
2023-08-11 23:02           ` David Malcolm
2023-08-14 21:41             ` Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot Lewis Hyatt
2023-08-15 15:43           ` David Malcolm
2023-08-15 17:58             ` Lewis Hyatt
2023-08-15 19:39               ` David Malcolm
2023-08-23 21:22                 ` Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers Lewis Hyatt
2023-08-15 16:15           ` David Malcolm
2023-08-15 18:15             ` Lewis Hyatt
2023-08-15 19:46               ` David Malcolm
2023-08-15 20:08                 ` Lewis Hyatt
2023-08-23 19:41                   ` Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 5/8] diagnostics: Support testing generated data in input.cc selftests Lewis Hyatt
2023-08-15 16:27           ` David Malcolm
2023-08-09 22:14         ` [PATCH v4 6/8] diagnostics: Full support for generated data locations Lewis Hyatt
2023-08-15 16:39           ` David Malcolm
2023-08-09 22:14         ` [PATCH v4 7/8] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
2023-08-09 22:14         ` [PATCH v4 8/8] diagnostics: Support generated data locations in SARIF output Lewis Hyatt
2023-08-15 17:04           ` David Malcolm
2023-08-15 17:51             ` Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 2/4] diagnostics: Handle generated data locations in edit_context Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 3/4] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 4/4] diagnostics: Support generated data locations in SARIF output Lewis Hyatt
2023-07-28 22:22 ` [PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens David Malcolm
2023-07-29 14:27   ` Lewis Hyatt
2023-07-29 16:03     ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b672a949cd1686d5c0a998cf88e6bbfed4f0cde.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=lhyatt@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).