From: David Malcolm <dmalcolm@redhat.com>
To: Lewis Hyatt <lhyatt@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
Date: Fri, 28 Jul 2023 18:58:54 -0400 [thread overview]
Message-ID: <7b672a949cd1686d5c0a998cf88e6bbfed4f0cde.camel@redhat.com> (raw)
In-Reply-To: <20230721230851.1981434-2-lhyatt@gmail.com>
On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> Add a new linemap reason LC_GEN which enables encoding the location
> of data
> that was generated during compilation and does not appear in any
> source file.
> There could be many use cases, such as, for instance, referring to
> the content
> of builtin macros (not yet implemented, but an easy lift after this
> one.) The
> first intended application is to create a place to store the input to
> a
> _Pragma directive, so that proper locations can be assigned to those
> tokens. This will be done in a subsequent commit.
>
> The actual change needed to the line-maps API in libcpp is not too
> large and
> requires no space overhead in the line map data structures (on 64-bit
> systems
> that is; one newly added data member to class line_map_ordinary sits
> inside
> former padding bytes.) An LC_GEN map is just an ordinary map like any
> other,
> but the TO_FILE member that normally points to the file name points
> instead to
> the actual data. This works automatically with PCH as well, for the
> same
> reason that the file name makes its way into a PCH. In order to
> avoid
> confusion, the member has been renamed from TO_FILE to DATA, and
> associated
> accessors adjusted.
>
> Outside libcpp, there are many small changes but most of them are to
> selftests, which are necessarily more sensitive to implementation
> details. From the perspective of the user (the "user", here, being a
> frontend
> using line maps or else the diagnostics infrastructure), the chief
> visible
> change is that the function location_get_source_line() should be
> passed an
> expanded_location object instead of a separate filename and line
> number. This
> is not a big change because in most cases, this information came
> anyway from a
> call to expand_location and the needed expanded_location object is
> readily
> available. The new overload of location_get_source_line() uses the
> extra
> information in the expanded_location object to obtain the data from
> the
> in-memory buffer when it originated from an LC_GEN map.
>
> Until the subsequent patch that starts using LC_GEN maps, none are
> yet
> generated within GCC, hence nothing is added to the testsuite here;
> but all
> relevant selftests have been extended to cover generated data maps in
> addition
> to normal files.
[..snip...]
Thanks for the updated patch.
Reading this patch, it felt a bit unnatural to me to have an
(exploded location, source line)
pair where the exploded location seems to be representing "which source
file or generated buffer", but the line/column info in that
exploded_location is to be ignored in favor of the 2nd source line.
I think we're missing a class: something that identifies either a
specific source file, or a specific generated buffer.
How about something like either:
class source_id
{
public:
source_id (const char *filename)
: m_filename_or_buffer (filename),
m_len (0)
{
}
explicit source_id (const char *buffer, unsigned buffer_len)
: m_filename_or_buffer (buffer),
m_len (buffer_len)
{
linemap_assert (buffer_len > 0);
}
private:
const char *m_filename_or_buffer;
unsigned m_len; // where 0 means "it's a filename"
};
or:
class source_id
{
public:
source_id (const char *filename)
: m_ptr (filename),
m_is_buffer (false)
{
}
explicit source_id (const linemap_ordinary *buffer_linemap)
: m_ptr (buffer_linemap),
m_is_buffer (true)
{
}
private:
const void *m_ptr;
bool m_is_buffer;
};
and use one of these "source_id file" in place of "const char *file",
rather than replacing such things with expanded_location?
> diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
> index e8d3dece770..4164fa0b1ba 100644
> --- a/gcc/c-family/c-indentation.cc
> +++ b/gcc/c-family/c-indentation.cc
> @@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
> unsigned int *first_nws,
> unsigned int tab_width)
> {
> - char_span line = location_get_source_line (exploc.file, exploc.line);
> + char_span line = location_get_source_line (exploc);
...so this might contine to be:
char_span line = location_get_source_line (exploc.file, exploc.line);
...but expanded_location's "file" field would become a source_id,
rather than a const char *. It looks like doing do might make a lot of
"is this the same file or buffer?" turn into comparisons of source_id
instances.
So I think expanded_location would become:
typedef struct
{
/* Either the name of the source file involved, or the
specific generated buffer. */
source_id file;
/* The line-location in the source file. */
int line;
int column;
void *data;
/* In a system header?. */
bool sysp;
} expanded_location;
and we wouldn't need to add these extra fields:
> +
> + /* If generated data, the data and its length. The data may contain embedded
> + nulls and need not be null-terminated. */
> + unsigned int generated_data_len;
> + const char *generated_data;
> } expanded_location;
and we could pass around source_id instances when identifying specific
filenames/generated buffers.
Does this idea simplify/clarify the patch, or make it more complicated?
[...snip...]
Thoughts?
Dave
next prev parent reply other threads:[~2023-07-28 22:58 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-21 23:08 [PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 1/4] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2023-07-28 22:58 ` David Malcolm [this message]
2023-07-31 22:39 ` Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 0/8] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 1/8] libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2023-08-11 22:45 ` David Malcolm
2023-08-13 20:18 ` Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 2/8] libcpp: diagnostics: Support generated data in expanded locations Lewis Hyatt
2023-08-11 23:02 ` David Malcolm
2023-08-14 21:41 ` Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 3/8] diagnostics: Refactor class file_cache_slot Lewis Hyatt
2023-08-15 15:43 ` David Malcolm
2023-08-15 17:58 ` Lewis Hyatt
2023-08-15 19:39 ` David Malcolm
2023-08-23 21:22 ` Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 4/8] diagnostics: Support obtaining source code lines from generated data buffers Lewis Hyatt
2023-08-15 16:15 ` David Malcolm
2023-08-15 18:15 ` Lewis Hyatt
2023-08-15 19:46 ` David Malcolm
2023-08-15 20:08 ` Lewis Hyatt
2023-08-23 19:41 ` Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 5/8] diagnostics: Support testing generated data in input.cc selftests Lewis Hyatt
2023-08-15 16:27 ` David Malcolm
2023-08-09 22:14 ` [PATCH v4 6/8] diagnostics: Full support for generated data locations Lewis Hyatt
2023-08-15 16:39 ` David Malcolm
2023-08-09 22:14 ` [PATCH v4 7/8] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
2023-08-09 22:14 ` [PATCH v4 8/8] diagnostics: Support generated data locations in SARIF output Lewis Hyatt
2023-08-15 17:04 ` David Malcolm
2023-08-15 17:51 ` Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 2/4] diagnostics: Handle generated data locations in edit_context Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 3/4] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
2023-07-21 23:08 ` [PATCH v3 4/4] diagnostics: Support generated data locations in SARIF output Lewis Hyatt
2023-07-28 22:22 ` [PATCH v3 0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens David Malcolm
2023-07-29 14:27 ` Lewis Hyatt
2023-07-29 16:03 ` David Malcolm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7b672a949cd1686d5c0a998cf88e6bbfed4f0cde.camel@redhat.com \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=lhyatt@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).