public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens
@ 2022-11-04 13:44 Lewis Hyatt
  2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt, David Malcolm

Hello-

Over the past couple of years there has been a lot of progress in fixing bugs
related to _Pragma, especially its use in the kind of macros that many
projects implement to manipulate GCC diagnostic pragmas more easily. For
GCC 13 I have been going through the remaining open PRs, fixing a
couple and adding testcases for several that were already fixed. I felt that
made it a good time to overhaul one of the last remaining issues with _Pragma
processing, which is that we do not currently assign good locations to the
tokens involved. The locations are very important, however, because that is
how GCC diagnostic pragmas will ultimately determine whether a given warning
should or should not apply at a given point. Currently, the tokens inside a
_Pragma string are all assigned the same location as the _Pragma token itself,
which is sufficient to make diagnostic pragmas work correctly. It does produce
somewhat inferior diagnostics, though, since we do not point the user to which
part of the _Pragma string caused the problem; and if the _Pragma string was
expanded from a macro, we do not even point them to the string at all.
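
For readers unfamiliar with the pattern, here is a minimal sketch of the kind
of helper macro meant above. The macro names (DO_PRAGMA, IGNORE_WARNING_BEGIN,
IGNORE_WARNING_END) and the function are made up for illustration; only the
_Pragma operator and the GCC diagnostic pragmas themselves are real.

```cpp
// Hypothetical helper macros; the names are illustrative only.
// DO_PRAGMA stringizes its argument so it can be handed to _Pragma.
#define DO_PRAGMA(x) _Pragma (#x)
#define IGNORE_WARNING_BEGIN(w) \
  _Pragma ("GCC diagnostic push") \
  DO_PRAGMA (GCC diagnostic ignored w)
#define IGNORE_WARNING_END _Pragma ("GCC diagnostic pop")

// The unused variable would normally be reported by -Wunused-variable;
// the pragmas produced by the macros are intended to suppress that
// warning for just this region.
int
compute_with_suppressed_warning ()
{
  IGNORE_WARNING_BEGIN ("-Wunused-variable")
  int scratch = 42;
  IGNORE_WARNING_END
  return 7;
}
```

Whether the warning is suppressed depends on the locations assigned to the
tokens that come out of each _Pragma string, which is why those locations
matter.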

Further, the assignment of the fake location to the tokens inside the _Pragma
string takes place after all the tokens have been lexed -- consequently, if a
diagnostic is issued by libcpp during that process, it doesn't benefit from the
patched-up location and instead uses a bogus location. As a quick example,
compiling:

    =====
    _Pragma("GCC diagnostic ignored \"oops")
    =====

produces:

    =====
    file:1:24: warning: missing terminating " character
        1 | _Pragma("GCC diagnostic ignored \"oops")
          |                        ^
    =====

It is surprisingly involved to make that caret point to something
reasonable. The reason it points to the middle of nowhere is that the current
implementation of _Pragma in directives.cc:destringize_and_run() does not
touch the line_maps instance at all, and so does not inform it where the
tokens are coming from. But the line_maps API in fact does not provide any way
to handle this case, so this needs to be added first. With all the changes in
this patch set, we would output instead:

    ======
    In buffer generated from file:1:
    <generated>:1:24: warning: missing terminating " character
        1 | GCC diagnostic ignored "oops
          |                        ^
    file:1:1: note: in <_Pragma directive>
        1 | _Pragma("GCC diagnostic ignored \"oops")
          | ^~~~~~~
    ======

Treating the _Pragma like a macro expansion makes everything consistent and
solves a ton of problems; all the locations involved will just make sense from
the user's point of view.

Patches 1-3 are tiny bug fixes that I came across while working on the new
testcases. I was a bit surprised that #1 and #3 especially did not have PRs
open, but I guess these small glitches have gone unnoticed so far.

Patch 4 is the largest one. It adds a new reason=LC_GEN for ordinary line
maps. These maps are just like normal ones, except the file name pointer
points not to a file name, but to the actual data in memory instead. This is
how we can issue diagnostics for code that did not appear in the user's input,
such as the de-stringized _Pragma string. The changes needed in libcpp to
support this concept are pretty small and straightforward. Most of the changes
outside of libcpp are in input.cc and diagnostic-show-locus.cc, which need to
learn how to obtain code from LC_GEN maps, and also a lot of the changes are
in selftests that are pretty sensitive to the internal implementation.

Patch 5 is a continuation of 4 that supports LC_GEN maps in less commonly used
places, such as the new SARIF output format, that also need to know how to
read source back from in-memory buffers in addition to files.

Patch 6 updates the implementation of _Pragma handling to use LC_GEN maps and
to create virtual locations for the tokens as in the example above. I have
also added support for the argument of the _Pragma to be a raw string, as
requested by PR83473, since this was easy to do while I was there.

1/6: diagnostics: Fix macro tracking for ad-hoc locations
2/6: diagnostics: Use an inline function rather than hardcoding <built-in>
     string
3/6: libcpp: Fix paste error with unknown pragma after macro expansion
4/6: diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
5/6: diagnostics: Support generated data in additional contexts
6/6: diagnostics: libcpp: Assign real locations to the tokens inside
     _Pragma strings

Bootstrap and regtest all languages on x86-64 Linux looks good.

I realize it's near the end of stage 1 now, but I would still very much
appreciate it if this patch set could be reviewed. For GCC 13,
there have been several _Pragma-related bugs fixed (especially PR53431), and
addressing this location issue would tie it together nicely. Thanks very much!

-Lewis


* [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  2022-11-04 15:53   ` David Malcolm
  2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

The result of linemap_resolve_location() can be an ad-hoc location, if that is
what was stored in a relevant macro map.  maybe_unwind_expanded_macro_loc()
did not previously handle this case, causing it to print the wrong tracking
information for an example such as the new testcase macro-trace-1.c.  Fix that
by checking for ad-hoc locations where needed.
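
As a rough illustration of why the unwrapping matters, consider the toy model
below. It is only a sketch: the toy_* names, the high-bit encoding, and the
table layout are assumptions made for this example and are not the real
libcpp data structures.

```cpp
#include <cstdint>
#include <vector>

// Toy model only; every name here is hypothetical.
using toy_location = std::uint32_t;
constexpr toy_location TOY_MAX_LOCATION = 0x7FFFFFFF;
constexpr toy_location TOY_RESERVED_COUNT = 2;

// An ad-hoc location is an index into a side table that stores the
// underlying location (real implementations store more, such as ranges).
struct toy_adhoc_table
{
  std::vector<toy_location> underlying;
};

// In this toy encoding, the high bit marks an ad-hoc location.
bool
toy_is_adhoc (toy_location loc)
{
  return loc > TOY_MAX_LOCATION;
}

toy_location
toy_strip_adhoc (const toy_adhoc_table &table, toy_location loc)
{
  return toy_is_adhoc (loc) ? table.underlying.at (loc & TOY_MAX_LOCATION)
			    : loc;
}

// Naive check: compares the raw value, so an ad-hoc location is never
// recognized as reserved because its high bit makes it a large number.
bool
toy_is_reserved_naive (toy_location loc)
{
  return loc < TOY_RESERVED_COUNT;
}

// Check that first unwraps the ad-hoc location.
bool
toy_is_reserved (const toy_adhoc_table &table, toy_location loc)
{
  return toy_strip_adhoc (table, loc) < TOY_RESERVED_COUNT;
}
```

In this model, an ad-hoc location that wraps a reserved location is missed by
the naive check and caught once the location is unwrapped first.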

gcc/ChangeLog:

	* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Handle ad-hoc
	location in return value of linemap_resolve_location().

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/macro-trace-1.c: New test.
---
 gcc/testsuite/c-c++-common/cpp/macro-trace-1.c | 4 ++++
 gcc/tree-diagnostic.cc                         | 7 +++++--
 2 files changed, 9 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/macro-trace-1.c

diff --git a/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c b/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
new file mode 100644
index 00000000000..34cfbb3dad3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
@@ -0,0 +1,4 @@
+/* This token is long enough to require an ad-hoc location. Make sure that
+   the macro trace still prints properly.  */
+#define X "0123456789012345678901234567689" /* { dg-error {expected .* before string constant} } */
+X /* { dg-note {in expansion of macro 'X'} } */
diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
index 0d79fe3c3c1..5cf3a1c17d2 100644
--- a/gcc/tree-diagnostic.cc
+++ b/gcc/tree-diagnostic.cc
@@ -190,14 +190,17 @@ maybe_unwind_expanded_macro_loc (diagnostic_context *context,
         location_t l = 
           linemap_resolve_location (line_table, resolved_def_loc,
                                     LRK_SPELLING_LOCATION,  &m);
-        if (l < RESERVED_LOCATION_COUNT || LINEMAP_SYSP (m))
+	location_t l0 = l;
+	if (IS_ADHOC_LOC (l0))
+	  l0 = get_location_from_adhoc_loc (line_table, l0);
+	if (l0 < RESERVED_LOCATION_COUNT || LINEMAP_SYSP (m))
           continue;
         
 	/* We need to print the context of the macro definition only
 	   when the locus of the first displayed diagnostic (displayed
 	   before this trace) was inside the definition of the
 	   macro.  */
-        int resolved_def_loc_line = SOURCE_LINE (m, l);
+	const int resolved_def_loc_line = SOURCE_LINE (m, l0);
         if (ix == 0 && saved_location_line != resolved_def_loc_line)
           {
             diagnostic_append_note (context, resolved_def_loc, 


* [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
  2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  2022-11-04 15:55   ` David Malcolm
  2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

The string "<built-in>" is hard-coded in several places throughout the
diagnostics code, and in some of those places, it is used incorrectly with
respect to internationalization. (Comparing a translated string to an
untranslated string.) The error is not currently observable in any output GCC
actually produces, hence no testcase added here, but it's worth fixing, and
also, I am shortly going to add a new such string and want to avoid hardcoding
that one in similar places.
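
To make the i18n problem concrete, here is a small sketch. The toy_translate
function is a made-up stand-in for gettext, and "<eingebaut>" is an invented
translation used purely for illustration; none of the toy_* names exist in
GCC.

```cpp
#include <cstring>

// Hypothetical stand-in for gettext: pretend the active locale
// translates the marker string.  The translation is invented.
const char *
toy_translate (const char *msgid)
{
  return std::strcmp (msgid, "<built-in>") == 0 ? "<eingebaut>" : msgid;
}

// Single source of truth for the (translated) special file name.
const char *
toy_special_fname_builtin ()
{
  return toy_translate ("<built-in>");
}

// Buggy pattern: FILE holds the translated name, but it is compared
// against the untranslated literal.
bool
toy_is_builtin_buggy (const char *file)
{
  return std::strcmp (file, "<built-in>") == 0;
}

// Fixed pattern: both sides go through the same function.
bool
toy_is_builtin_fixed (const char *file)
{
  return std::strcmp (file, toy_special_fname_builtin ()) == 0;
}
```

With a translating locale, the buggy comparison stops matching while the fixed
one keeps working; with no translation in effect the two behave identically,
which is consistent with the error not being observable today.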

gcc/c-family/ChangeLog:

	* c-opts.cc (c_finish_options): Use special_fname_builtin () rather
	than a hard-coded string.

gcc/ChangeLog:

	* diagnostic.cc (diagnostic_get_location_text): Use
	special_fname_builtin () rather than a hardcoded string (which was
	also incorrectly left untranslated previously.)
	* input.cc (special_fname_builtin): New function.
	(expand_location_1): Use special_fname_builtin () rather than a
	hard-coded string.
	(test_builtins): Likewise.
	* input.h (special_fname_builtin): Declare.

gcc/fortran/ChangeLog:

	* cpp.cc (gfc_cpp_init): Use special_fname_builtin () rather than a
	hardcoded string (which was also incorrectly left untranslated
	previously.)
	* error.cc (gfc_diagnostic_build_locus_prefix): Likewise.
	* f95-lang.cc (gfc_init): Likewise.
---
 gcc/c-family/c-opts.cc  |  2 +-
 gcc/diagnostic.cc       |  2 +-
 gcc/fortran/cpp.cc      |  2 +-
 gcc/fortran/error.cc    |  4 ++--
 gcc/fortran/f95-lang.cc |  2 +-
 gcc/input.cc            | 10 ++++++++--
 gcc/input.h             |  3 +++
 7 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 32b929e3ece..521797fb7eb 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1476,7 +1476,7 @@ c_finish_options (void)
     {
       const line_map_ordinary *bltin_map
 	= linemap_check_ordinary (linemap_add (line_table, LC_RENAME, 0,
-					       _("<built-in>"), 0));
+					       special_fname_builtin (), 0));
       cb_file_change (parse_in, bltin_map);
       linemap_line_start (line_table, 0, 1);
 
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 22f7b0b6d6e..7c7ee6da746 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -470,7 +470,7 @@ diagnostic_get_location_text (diagnostic_context *context,
   const char *file = s.file ? s.file : progname;
   int line = 0;
   int col = -1;
-  if (strcmp (file, N_("<built-in>")))
+  if (strcmp (file, special_fname_builtin ()))
     {
       line = s.line;
       if (context->show_column)
diff --git a/gcc/fortran/cpp.cc b/gcc/fortran/cpp.cc
index 364bd0d2a85..0b5755edbb4 100644
--- a/gcc/fortran/cpp.cc
+++ b/gcc/fortran/cpp.cc
@@ -605,7 +605,7 @@ gfc_cpp_init (void)
   if (gfc_option.flag_preprocessed)
     return;
 
-  cpp_change_file (cpp_in, LC_RENAME, _("<built-in>"));
+  cpp_change_file (cpp_in, LC_RENAME, special_fname_builtin ());
   if (!gfc_cpp_option.no_predefined)
     {
       /* Make sure all of the builtins about to be declared have
diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
index c9d6edbb923..214fb78ba7b 100644
--- a/gcc/fortran/error.cc
+++ b/gcc/fortran/error.cc
@@ -1147,7 +1147,7 @@ gfc_diagnostic_build_locus_prefix (diagnostic_context *context,
   const char *locus_ce = colorize_stop (pp_show_color (pp));
   return (s.file == NULL
 	  ? build_message_string ("%s%s:%s", locus_cs, progname, locus_ce )
-	  : !strcmp (s.file, N_("<built-in>"))
+	  : !strcmp (s.file, special_fname_builtin ())
 	  ? build_message_string ("%s%s:%s", locus_cs, s.file, locus_ce)
 	  : context->show_column
 	  ? build_message_string ("%s%s:%d:%d:%s", locus_cs, s.file, s.line,
@@ -1167,7 +1167,7 @@ gfc_diagnostic_build_locus_prefix (diagnostic_context *context,
 
   return (s.file == NULL
 	  ? build_message_string ("%s%s:%s", locus_cs, progname, locus_ce )
-	  : !strcmp (s.file, N_("<built-in>"))
+	  : !strcmp (s.file, special_fname_builtin ())
 	  ? build_message_string ("%s%s:%s", locus_cs, s.file, locus_ce)
 	  : context->show_column
 	  ? build_message_string ("%s%s:%d:%d-%d:%s", locus_cs, s.file, s.line,
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index a6750bea787..0d83f3f8b69 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -259,7 +259,7 @@ gfc_init (void)
   if (!gfc_cpp_enabled ())
     {
       linemap_add (line_table, LC_ENTER, false, gfc_source_file, 1);
-      linemap_add (line_table, LC_RENAME, false, "<built-in>", 0);
+      linemap_add (line_table, LC_RENAME, false, special_fname_builtin (), 0);
     }
   else
     gfc_cpp_init_0 ();
diff --git a/gcc/input.cc b/gcc/input.cc
index a28abfac5ac..483cb6e940d 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -29,6 +29,12 @@ along with GCC; see the file COPYING3.  If not see
 #define HAVE_ICONV 0
 #endif
 
+const char *
+special_fname_builtin ()
+{
+  return _("<built-in>");
+}
+
 /* Input charset configuration.  */
 static const char *default_charset_callback (const char *)
 {
@@ -275,7 +281,7 @@ expand_location_1 (location_t loc,
 
   xloc.data = block;
   if (loc <= BUILTINS_LOCATION)
-    xloc.file = loc == UNKNOWN_LOCATION ? NULL : _("<built-in>");
+    xloc.file = loc == UNKNOWN_LOCATION ? NULL : special_fname_builtin ();
 
   return xloc;
 }
@@ -2102,7 +2108,7 @@ test_unknown_location ()
 static void
 test_builtins ()
 {
-  assert_loceq (_("<built-in>"), 0, 0, BUILTINS_LOCATION);
+  assert_loceq (special_fname_builtin (), 0, 0, BUILTINS_LOCATION);
   ASSERT_PRED1 (is_location_from_builtin_token, BUILTINS_LOCATION);
 }
 
diff --git a/gcc/input.h b/gcc/input.h
index 11c571d076f..0b23e66e53b 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -32,6 +32,9 @@ extern GTY(()) class line_maps *saved_line_table;
 /* The location for declarations in "<built-in>" */
 #define BUILTINS_LOCATION ((location_t) 1)
 
+/* Returns the translated string referring to the special location.  */
+const char *special_fname_builtin ();
+
 /* line-map.cc reserves RESERVED_LOCATION_COUNT to the user.  Ensure
    both UNKNOWN_LOCATION and BUILTINS_LOCATION fit into that.  */
 STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);


* [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
  2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
  2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  2022-11-21 17:50   ` Jeff Law
  2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

In directives.cc, do_pragma() contains logic to handle a case such as the new
testcase pragma-omp-unknown.c, where an unknown pragma was the result of macro
expansion (for pragma namespaces that permit expansion). This no longer works
correctly, as shown by the testcase; fix it by adding PREV_WHITE to the flags
on the second token to prevent an unwanted paste.  Also fix the memory leak:
since the temporary tokens are pushed on their own context, nothing prevents
freeing the buffer that holds them when the context is eventually popped.
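
The effect of the missing flag can be sketched with a toy token printer. This
is a deliberate simplification with hypothetical names (toy_token, toy_spell);
the real preprocessor output logic is more involved than a single flag check.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: a token is its spelling plus a flag saying whether
// whitespace preceded it (in the spirit of PREV_WHITE).
struct toy_token
{
  std::string spelling;
  bool prev_white;
};

// Spell out the tokens, emitting a space only where the flag asks for
// one.  Without the flag, adjacent tokens run together.
std::string
toy_spell (const std::vector<toy_token> &toks)
{
  std::string out;
  for (std::size_t i = 0; i < toks.size (); ++i)
    {
      if (i > 0 && toks[i].prev_white)
	out += ' ';
      out += toks[i].spelling;
    }
  return out;
}
```

In this model, the tokens "omp" and "UNKNOWN1" come out as "ompUNKNOWN1" when
the second token lacks the flag, and as "omp UNKNOWN1" when it is set.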

libcpp/ChangeLog:

	* directives.cc (do_pragma): Fix memory leak in token buffer.  Fix
	unwanted paste between two tokens.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/pragma-omp-unknown.c: New test.
---
 gcc/testsuite/c-c++-common/gomp/pragma-omp-unknown.c | 10 ++++++++++
 libcpp/directives.cc                                 | 10 +++++-----
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/pragma-omp-unknown.c

diff --git a/gcc/testsuite/c-c++-common/gomp/pragma-omp-unknown.c b/gcc/testsuite/c-c++-common/gomp/pragma-omp-unknown.c
new file mode 100644
index 00000000000..04881f786ab
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/pragma-omp-unknown.c
@@ -0,0 +1,10 @@
+/* { dg-do preprocess } */
+/* { dg-options "-fopenmp" } */
+
+#define X UNKNOWN1
+#pragma omp X
+/* { dg-final { scan-file pragma-omp-unknown.i "#pragma omp UNKNOWN1" } } */
+
+#define Y UNKNOWN2
+_Pragma("omp Y")
+/* { dg-final { scan-file pragma-omp-unknown.i "#pragma omp UNKNOWN2" } } */
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 918752f6b1f..9dc4363c65a 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -1565,15 +1565,15 @@ do_pragma (cpp_reader *pfile)
 	{
 	  /* Invalid name comes from macro expansion, _cpp_backup_tokens
 	     won't allow backing 2 tokens.  */
-	  /* ??? The token buffer is leaked.  Perhaps if def_pragma hook
-	     reads both tokens, we could perhaps free it, but if it doesn't,
-	     we don't know the exact lifespan.  */
-	  cpp_token *toks = XNEWVEC (cpp_token, 2);
+	  const auto tok_buff = _cpp_get_buff (pfile, 2 * sizeof (cpp_token));
+	  const auto toks = (cpp_token *)tok_buff->base;
 	  toks[0] = ns_token;
 	  toks[0].flags |= NO_EXPAND;
 	  toks[1] = *token;
-	  toks[1].flags |= NO_EXPAND;
+	  toks[1].flags |= NO_EXPAND | PREV_WHITE;
 	  _cpp_push_token_context (pfile, NULL, toks, 2);
+	  /* Arrange to free this buffer when no longer needed.  */
+	  pfile->context->buff = tok_buff;
 	}
       pfile->cb.def_pragma (pfile, pfile->directive_line);
     }


* [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
                   ` (2 preceding siblings ...)
  2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  2022-11-05 16:23   ` David Malcolm
  2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
  2022-11-04 13:44 ` [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
  5 siblings, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, such as, for instance, referring to the content
of builtin macros (not yet implemented, but an easy lift after this one.) The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The actual change needed to the line-maps API in libcpp is very minimal and
requires no space overhead in the line map data structures (on 64-bit systems
that is; one newly added data member to class line_map_ordinary sits inside
former padding bytes.) An LC_GEN map is just an ordinary map like any other,
but the TO_FILE member that normally points to the file name points instead to
the actual data.  This works automatically with PCH as well, for the same
reason that the file name makes its way into a PCH.

Outside libcpp, there are many small changes but most of them are to
selftests, which are necessarily more sensitive to implementation
details. From the perspective of the user (the "user", here, being a frontend
using line maps or else the diagnostics infrastructure), the chief visible
change is that the function location_get_source_line() should be passed an
expanded_location object instead of a separate filename and line number.  This
is not a big change because in most cases, this information came anyway from a
call to expand_location and the needed expanded_location object is readily
available. The new overload of location_get_source_line() uses the extra
information in the expanded_location object to obtain the data from the
in-memory buffer when it originated from an LC_GEN map.
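
The idea can be sketched as follows. This is a toy model under stated
assumptions, not the code in this patch: toy_exploc, its member names, and the
toy_* functions are hypothetical stand-ins, and the file-backed path is left
out entirely.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical, simplified stand-in for an expanded location.
struct toy_exploc
{
  const char *file;		// File name, or a label for generated data.
  int line;			// 1-based line number.
  const char *generated_data;	// Non-null only for in-memory buffers.
  std::size_t generated_len;	// Length of the buffer in bytes.
};

// Return line LINE (1-based) of BUF without its trailing newline.  An
// empty view is returned if the line does not exist; this toy model
// does not distinguish that from a line that is genuinely empty.
std::string_view
toy_get_line (std::string_view buf, int line)
{
  if (line < 1)
    return {};
  std::size_t pos = 0;
  for (int cur = 1; cur < line; ++cur)
    {
      pos = buf.find ('\n', pos);
      if (pos == std::string_view::npos)
	return {};
      ++pos;
    }
  const std::size_t end = buf.find ('\n', pos);
  return buf.substr (pos, end == std::string_view::npos
			    ? std::string_view::npos : end - pos);
}

// With the whole location object available, the lookup can tell
// generated data apart from a file and read straight from memory.
std::string_view
toy_get_source_line (const toy_exploc &xloc)
{
  if (xloc.generated_data)
    return toy_get_line ({xloc.generated_data, xloc.generated_len},
			 xloc.line);
  return {};			// File-backed lookup omitted in this sketch.
}
```

A separate (filename, line) pair could not carry the buffer pointer and
length, which is the motivation for passing the location object instead.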

Until the subsequent patch that starts using LC_GEN maps, none are yet
generated within GCC, hence nothing is added to the testsuite here; but all
relevant selftests have been extended to cover generated data maps in
addition to normal files.

libcpp/ChangeLog:

	* include/line-map.h (enum lc_reason): Add LC_GEN.
	(struct line_map_ordinary): Add new member to_file_len and update the
	GTY markup on to_file to support embedded null bytes.
	(class expanded_location): Add new members to store generated content.
	* line-map.cc (linemap_add): Add new argument to_file_len to support
	generated content. Implement LC_GEN maps.
	(linemap_line_start): Pass new to_file_len argument to linemap_add.
	(linemap_expand_location): Support LC_GEN locations.
	(linemap_dump): Likewise.

gcc/ChangeLog:

	* diagnostic-show-locus.cc (make_range): Initialize new fields in
	expanded_location.
	(layout::calculate_x_offset_display): Use the new expanded_location
	overload of location_get_source_line(), so as to support LC_GEN maps.
	(layout::print_line): Likewise.
	(source_line::source_line): Likewise.
	(line_corrections::add_hint): Likewise.
	(class line_corrections): Store the location as an exploc rather than
	individual filename, so as to support LC_GEN maps.
	(layout::print_trailing_fixits): Use the new exploc constructor for
	class line_corrections.
	(test_layout_x_offset_display_utf8): Test LC_GEN maps as well as normal.
	(test_layout_x_offset_display_tab): Likewise.
	(test_diagnostic_show_locus_one_liner): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Likewise.
	(test_add_location_if_nearby): Likewise.
	(test_diagnostic_show_locus_fixit_lines): Likewise.
	(test_fixit_consolidation): Likewise.
	(test_overlapped_fixit_printing): Likewise.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing_2): Likewise.
	(test_fixit_insert_containing_newline): Likewise.
	(test_fixit_insert_containing_newline_2): Likewise.
	(test_fixit_replace_containing_newline): Likewise.
	(test_fixit_deletion_affecting_newline): Likewise.
	(test_tab_expansion): Likewise.
	(test_escaping_bytes_1): Likewise.
	(test_escaping_bytes_2): Likewise.
	(test_line_numbers_multiline_range): Likewise.
	(diagnostic_show_locus_cc_tests): Likewise.
	* diagnostic.cc (diagnostic_report_current_module): Support LC_GEN
	maps when outputting include trace.
	(assert_location_text): Zero-initialize the expanded_location so as to
	cover all fields, including the newly added ones.
	* gcc-rich-location.cc (blank_line_before_p): Use the new
	expanded_location overload of location_get_source_line().
	* input.cc (class file_cache_slot): Add new member m_data_active.
	(file_cache_slot::file_cache_slot): Initialize new member.
	(special_fname_generated): New function.
	(expand_location_1): Recognize LC_GEN locations and output the special
	filename for them.
	(file_cache::add_file): Support generated data that is already in
	memory and does not need to be read from a file.
	(file_cache_slot::create): Likewise.
	(file_cache::lookup_or_add_file): Likewise.
	(file_cache_slot::maybe_grow): Add assert that we are not in generated
	data mode.
	(file_cache_slot::read_data): Likewise.
	(find_end_of_line): Add missing const.
	(file_cache_slot::goto_next_line): Likewise.
	(file_cache_slot::read_line_num): Likewise.
	(file_cache_slot::get_next_line): Add missing const. Adapt to use of
	new m_data_active member.
	(location_get_source_line): Change to take an expanded_location
	argument instead of a filename.  Support generated data. Add another
	overload taking a filename that delegates to this one.
	(location_compute_display_column): Use the new overload of
	location_get_source_line.
	(dump_location_info): Likewise.
	(get_substring_ranges_for_loc): Likewise.
	(location_missing_trailing_newline): Pass new argument to
	lookup_or_add_file ().
	(temp_source_file::do_linemap_add): New function.
	(class line_table_case): Add new m_generated_data member.
	(line_table_test::line_table_test): Initialize the new member.
	(test_accessing_ordinary_linemaps): Test generated data too.
	(test_make_location_nonpure_range_endpoints): Likewise.
	(test_line_offset_overflow): Likewise.
	(for_each_line_table_case): Add new argument requesting to test
	generated data as well as regular.
	(input_cc_tests): Enable testing generated data in the selftests.
	* input.h (special_fname_generated): Declare.
	(location_get_source_line): Declare new overloads.
	(class file_cache): Update prototypes for new arguments.
	* selftest.cc (named_temp_file::named_temp_file): Support nullptr
	argument to disable creating any file.
	(named_temp_file::~named_temp_file): Likewise.
	(temp_source_file::temp_source_file): Add a new constructor argument
	to enable creating generated data instead of a file.
	(temp_source_file::~temp_source_file): Handle freeing generated data buffer.
	* selftest.h (struct line_map_ordinary): Forward declare.
	(class named_temp_file): Add missing explicit on constructor.
	(class temp_source_file): Add new members to store generated content.
	(class line_table_test): Add new m_generated_data member.
	(for_each_line_table_case): Update prototype for new argument.

gcc/c-family/ChangeLog:

	* c-common.cc (try_to_locate_new_include_insertion_point): Add
	awareness of LC_GEN maps.
	* c-format.cc (get_corrected_substring): Use the new expanded_location
	overload of location_get_source_line(), so as to support LC_GEN maps.
	* c-indentation.cc (get_visual_column): Likewise.
	(get_first_nws_vis_column): Likewise.
	(detect_intervening_unindent): Likewise.
	(should_warn_for_misleading_indentation): Likewise.
	(assert_get_visual_column_succeeds): Zero-initialize the exploc to
	cover all fields including those newly added.
	(assert_get_visual_column_fails): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Use the new
	overload of location_get_source_line.
---
 gcc/c-family/c-common.cc                      |   5 +-
 gcc/c-family/c-format.cc                      |   2 +-
 gcc/c-family/c-indentation.cc                 |  28 +-
 gcc/diagnostic-show-locus.cc                  | 237 +++++++++--------
 gcc/diagnostic.cc                             |  15 +-
 gcc/gcc-rich-location.cc                      |   2 +-
 gcc/input.cc                                  | 240 ++++++++++++------
 gcc/input.h                                   |  11 +-
 gcc/selftest.cc                               |  49 +++-
 gcc/selftest.h                                |  20 +-
 .../diagnostic_plugin_test_show_locus.c       |   4 +-
 libcpp/include/line-map.h                     |  21 +-
 libcpp/line-map.cc                            |  51 +++-
 13 files changed, 443 insertions(+), 242 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 5890c18bdc3..2935d7fb236 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
       const line_map_ordinary *ord_map
 	= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
+      if (ord_map->reason == LC_GEN)
+	continue;
+
       if (const line_map_ordinary *from
 	  = linemap_included_from_linemap (line_table, ord_map))
 	/* We cannot use pointer equality, because with preprocessed
 	   input all filename strings are unique.  */
-	if (0 == strcmp (from->to_file, file))
+	if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
 	  {
 	    last_include_ord_map = from;
 	    last_ord_map_after_include = NULL;
diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index 01adea4ff41..87ad3e74b72 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4538,7 +4538,7 @@ get_corrected_substring (const substring_loc &fmt_loc,
   if (caret.column > finish.column)
     return NULL;
 
-  char_span line = location_get_source_line (start.file, start.line);
+  char_span line = location_get_source_line (start);
   if (!line)
     return NULL;
 
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index 85a3ae1b303..42738dd4d13 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
 		   unsigned int *first_nws,
 		   unsigned int tab_width)
 {
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   if ((size_t)exploc.column > line.length ())
@@ -87,13 +87,13 @@ get_visual_column (expanded_location exploc,
    Otherwise, return false, leaving *FIRST_NWS untouched.  */
 
 static bool
-get_first_nws_vis_column (const char *file, int line_num,
+get_first_nws_vis_column (expanded_location exploc,
 			  unsigned int *first_nws,
 			  unsigned int tab_width)
 {
   gcc_assert (first_nws);
 
-  char_span line = location_get_source_line (file, line_num);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   unsigned int vis_column = 0;
@@ -158,19 +158,18 @@ get_first_nws_vis_column (const char *file, int line_num,
    Return true if such an unindent/outdent is detected.  */
 
 static bool
-detect_intervening_unindent (const char *file,
-			     int body_line,
+detect_intervening_unindent (expanded_location exploc,
 			     int next_stmt_line,
 			     unsigned int vis_column,
 			     unsigned int tab_width)
 {
-  gcc_assert (file);
-  gcc_assert (next_stmt_line > body_line);
+  gcc_assert (exploc.file);
+  gcc_assert (next_stmt_line > exploc.line);
 
-  for (int line = body_line + 1; line < next_stmt_line; line++)
+  while (++exploc.line < next_stmt_line)
     {
       unsigned int line_vis_column;
-      if (get_first_nws_vis_column (file, line, &line_vis_column, tab_width))
+      if (get_first_nws_vis_column (exploc, &line_vis_column, tab_width))
 	if (line_vis_column < vis_column)
 	  return true;
     }
@@ -528,8 +527,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
 
 	  /* Don't warn if there is an unindent between the two statements. */
 	  int vis_column = MIN (next_stmt_vis_column, body_vis_column);
-	  if (detect_intervening_unindent (body_exploc.file, body_exploc.line,
-					   next_stmt_exploc.line,
+	  if (detect_intervening_unindent (body_exploc, next_stmt_exploc.line,
 					   vis_column, tab_width))
 	    return false;
 
@@ -691,12 +689,10 @@ assert_get_visual_column_succeeds (const location &loc,
 				   unsigned int expected_visual_column,
 				   unsigned int expected_first_nws)
 {
-  expanded_location exploc;
+  expanded_location exploc = {};
   exploc.file = file;
   exploc.line = line;
   exploc.column = column;
-  exploc.data = NULL;
-  exploc.sysp = false;
   unsigned int actual_visual_column;
   unsigned int actual_first_nws;
   bool result = get_visual_column (exploc,
@@ -729,12 +725,10 @@ assert_get_visual_column_fails (const location &loc,
 				const char *file, int line, int column,
 				const unsigned int tab_width)
 {
-  expanded_location exploc;
+  expanded_location exploc = {};
   exploc.file = file;
   exploc.line = line;
   exploc.column = column;
-  exploc.data = NULL;
-  exploc.sysp = false;
   unsigned int actual_visual_column;
   unsigned int actual_first_nws;
   bool result = get_visual_column (exploc,
diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
index 9d430b5189c..442744f5e0b 100644
--- a/gcc/diagnostic-show-locus.cc
+++ b/gcc/diagnostic-show-locus.cc
@@ -709,9 +709,9 @@ static layout_range
 make_range (int start_line, int start_col, int end_line, int end_col)
 {
   const expanded_location start_exploc
-    = {"", start_line, start_col, NULL, false};
+    = {"", start_line, start_col, NULL, false, 0, NULL};
   const expanded_location finish_exploc
-    = {"", end_line, end_col, NULL, false};
+    = {"", end_line, end_col, NULL, false, 0, NULL};
   return layout_range (exploc_with_display_col (start_exploc, def_policy (),
 						LOCATION_ASPECT_START),
 		       exploc_with_display_col (finish_exploc, def_policy (),
@@ -1614,8 +1614,7 @@ layout::calculate_x_offset_display ()
       return;
     }
 
-  const char_span line = location_get_source_line (m_exploc.file,
-						   m_exploc.line);
+  const char_span line = location_get_source_line (m_exploc);
   if (!line)
     {
       /* Nothing to do, we couldn't find the source line.  */
@@ -2398,17 +2397,18 @@ class line_corrections
 {
 public:
   line_corrections (const char_display_policy &policy,
-		    const char *filename,
-		    linenum_type row)
-  : m_policy (policy), m_filename (filename), m_row (row)
-  {}
+		    expanded_location exploc, linenum_type row = 0)
+  : m_policy (policy), m_exploc (exploc)
+  {
+    if (row)
+      m_exploc.line = row;
+  }
   ~line_corrections ();
 
   void add_hint (const fixit_hint *hint);
 
   const char_display_policy &m_policy;
-  const char *m_filename;
-  linenum_type m_row;
+  expanded_location m_exploc;
   auto_vec <correction *> m_corrections;
 };
 
@@ -2428,7 +2428,7 @@ line_corrections::~line_corrections ()
 class source_line
 {
 public:
-  source_line (const char *filename, int line);
+  explicit source_line (expanded_location xloc);
 
   char_span as_span () { return char_span (chars, width); }
 
@@ -2438,9 +2438,9 @@ public:
 
 /* source_line's ctor.  */
 
-source_line::source_line (const char *filename, int line)
+source_line::source_line (expanded_location exploc)
 {
-  char_span span = location_get_source_line (filename, line);
+  char_span span = location_get_source_line (exploc);
   chars = span.get_buffer ();
   width = span.length ();
 }
@@ -2482,7 +2482,7 @@ line_corrections::add_hint (const fixit_hint *hint)
 				affected_bytes.start - 1);
 
 	  /* Try to read the source.  */
-	  source_line line (m_filename, m_row);
+	  source_line line (m_exploc);
 	  if (line.chars && between.finish < line.width)
 	    {
 	      /* Consolidate into the last correction:
@@ -2538,7 +2538,7 @@ layout::print_trailing_fixits (linenum_type row)
 {
   /* Build a list of correction instances for the line,
      potentially consolidating hints (for the sake of readability).  */
-  line_corrections corrections (m_policy, m_exploc.file, row);
+  line_corrections corrections (m_policy, m_exploc, row);
   for (unsigned int i = 0; i < m_fixit_hints.length (); i++)
     {
       const fixit_hint *hint = m_fixit_hints[i];
@@ -2776,7 +2776,7 @@ layout::show_ruler (int max_column) const
 void
 layout::print_line (linenum_type row)
 {
-  char_span line = location_get_source_line (m_exploc.file, row);
+  char_span line = location_get_source_line (m_exploc, row);
   if (!line)
     return;
 
@@ -2985,10 +2985,10 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
      no multibyte characters earlier on the line.  */
   const int emoji_col = 102;
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, 1 + line_bytes,
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, line_bytes);
 
@@ -2996,17 +2996,23 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (line_bytes, LOCATION_COLUMN (line_end));
 
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  const expanded_location xloc = expand_location (line_end);
+  char_span lspan = location_get_source_line (xloc);
   ASSERT_EQ (line_display_cols,
 	     cpp_display_width (lspan.get_buffer (), lspan.length (),
 				def_policy ()));
   ASSERT_EQ (line_display_cols,
-	     location_compute_display_column (expand_location (line_end),
-					      def_policy ()));
+	     location_compute_display_column (xloc, def_policy ()));
   ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
 			"\xf0\x9f\x98\x82\xf0\x9f\x98\x82", 8));
 
@@ -3138,10 +3144,10 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
      a space would have taken up.  */
   ASSERT_EQ (7, extra_width[10]);
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, line_bytes + 1,
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, line_bytes);
 
@@ -3150,7 +3156,8 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
     return;
 
   /* Check that cpp_display_width handles the tabs as expected.  */
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  const expanded_location xloc = expand_location (line_end);
+  char_span lspan = location_get_source_line (xloc);
   ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
   for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
     {
@@ -3159,8 +3166,7 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
 		 cpp_display_width (lspan.get_buffer (), lspan.length (),
 				    policy));
       ASSERT_EQ (line_bytes + extra_width[tabstop],
-		 location_compute_display_column (expand_location (line_end),
-						  policy));
+		 location_compute_display_column (xloc, policy));
     }
 
   /* Check that the tab is expanded to the expected number of spaces.  */
@@ -3784,10 +3790,10 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
      ....................0000000001111111.
      ....................1234567890123456.  */
   const char *content = "foo = bar.field;\n";
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, 16);
 
@@ -3795,7 +3801,14 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (16, LOCATION_COLUMN (line_end));
 
@@ -4366,10 +4379,10 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
     /* 0000000000000000000001111111111111111111222222222222222222222233333
        1111222233334444567890122223333456789999000011112222345678999900001
        Byte columns.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, 31);
 
@@ -4377,11 +4390,18 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (31, LOCATION_COLUMN (line_end));
 
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  char_span lspan = location_get_source_line (expand_location (line_end));
   ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
 				    def_policy ()));
   ASSERT_EQ (25, location_compute_display_column (expand_location (line_end),
@@ -4418,12 +4438,10 @@ test_add_location_if_nearby (const line_table_case &case_)
        "  double x;\n"                              /* line 4.  */
        "  double y;\n"                              /* line 5.  */
        ";\n");                                      /* line 6.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
 
   linemap_line_start (line_table, 1, 100);
 
@@ -4482,12 +4500,10 @@ test_diagnostic_show_locus_fixit_lines (const line_table_case &case_)
        "\n"                                      /* line 4.  */
        "\n"                                      /* line 5.  */
        "                        : 0.0};\n");     /* line 6.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
 
   linemap_line_start (line_table, 1, 100);
 
@@ -4578,8 +4594,10 @@ static void
 test_fixit_consolidation (const line_table_case &case_)
 {
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, "test.c", 1);
+  if (ltt.m_generated_data)
+    linemap_add (line_table, LC_GEN, false, "some content", 1, 13);
+  else
+    linemap_add (line_table, LC_ENTER, false, "test.c", 1);
 
   const location_t c10 = linemap_position_for_column (line_table, 10);
   const location_t c15 = linemap_position_for_column (line_table, 15);
@@ -4725,13 +4743,11 @@ test_overlapped_fixit_printing (const line_table_case &case_)
      ...123456789012345678901234567890123456789.  */
   const char *content
     = ("  foo *f = (foo *)ptr->field;\n");
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
   line_table_test ltt (case_);
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
 
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
-
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -4752,6 +4768,8 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     = linemap_position_for_line_and_column (line_table, ord_map, 1, 28);
   const location_t expr = make_location (expr_start, expr_start, expr_finish);
 
+  const expanded_location xloc = expand_location (expr);
+
   /* Various examples of fix-it hints that aren't themselves consolidated,
      but for which the *printing* may need consolidation.  */
 
@@ -4795,7 +4813,7 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (policy, tmp.get_filename (), 1);
+    line_corrections lc (policy, xloc);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4936,13 +4954,10 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
        12344445555666677778901234566667777888899990123456789012333344445
        Byte columns.  */
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
-
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -4963,6 +4978,8 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     = linemap_position_for_line_and_column (line_table, ord_map, 1, 34);
   const location_t expr = make_location (expr_start, expr_start, expr_finish);
 
+  const expanded_location xloc = expand_location (expr);
+
   /* Various examples of fix-it hints that aren't themselves consolidated,
      but for which the *printing* may need consolidation.  */
 
@@ -5011,7 +5028,7 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (policy, tmp.get_filename (), 1);
+    line_corrections lc (policy, xloc);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -5169,13 +5186,11 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
      ...123456789012345678901234567890123456789.  */
   const char *content
     = ("int a5[][0][0] = { 1, 2 };\n");
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
-  line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
 
+  line_table_test ltt (case_);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -5260,10 +5275,10 @@ test_fixit_insert_containing_newline (const line_table_case &case_)
 			     "      x = a;\n"  /* line 2. */
 			     "    case 'b':\n" /* line 3. */
 			     "      x = b;\n");/* line 4. */
-
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 3);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), false);
+  tmp.do_linemap_add (3);
 
   location_t case_start = linemap_position_for_column (line_table, 5);
   location_t case_finish = linemap_position_for_column (line_table, 13);
@@ -5331,12 +5346,11 @@ test_fixit_insert_containing_newline_2 (const line_table_case &case_)
 			     "{\n"              /* line 2. */
 			     " putchar (ch);\n" /* line 3. */
 			     "}\n");            /* line 4. */
-
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
 
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* The primary range is the "putchar" token.  */
@@ -5395,9 +5409,10 @@ test_fixit_replace_containing_newline (const line_table_case &case_)
     .........................1234567890123.  */
   const char *old_content = "foo = bar ();\n";
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   /* Replace the " = " with "\n  = ", as if we were reformatting an
      overly long line.  */
@@ -5435,10 +5450,10 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
   const char *old_content = ("foo = bar (\n"
 			     "      );\n");
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* Attempt to delete the " (\n...)".  */
@@ -5487,9 +5502,10 @@ test_tab_expansion (const line_table_case &case_)
   const int last_byte_col = 25;
   ASSERT_EQ (35, cpp_display_width (content, last_byte_col, policy));
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   /* Don't attempt to run the tests if column data might be unavailable.  */
   location_t line_end = linemap_position_for_column (line_table, last_byte_col);
@@ -5536,15 +5552,14 @@ test_escaping_bytes_1 (const line_table_case &case_)
 {
   const char content[] = "before\0\1\2\3\v\x80\xff""after\n";
   const size_t sz = sizeof (content);
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz,
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   location_t finish
-    = linemap_position_for_line_and_column (line_table, ord_map, 1,
-					    strlen (content));
+    = linemap_position_for_line_and_column (line_table, ord_map, 1, sz);
 
   if (finish > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
@@ -5592,15 +5607,14 @@ test_escaping_bytes_2 (const line_table_case &case_)
 {
   const char content[]  = "\0after\n";
   const size_t sz = sizeof (content);
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz,
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   location_t finish
-    = linemap_position_for_line_and_column (line_table, ord_map, 1,
-					    strlen (content));
+    = linemap_position_for_line_and_column (line_table, ord_map, 1, sz);
 
   if (finish > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
@@ -5652,8 +5666,7 @@ test_line_numbers_multiline_range ()
   temp_source_file tmp (SELFTEST_LOCATION, ".txt", pp_formatted_text (&pp));
   line_table_test ltt;
 
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* Create a multi-line location, starting at the "line" of line 9, with
@@ -5694,28 +5707,28 @@ diagnostic_show_locus_cc_tests ()
 
   test_display_widths ();
 
-  for_each_line_table_case (test_layout_x_offset_display_utf8);
-  for_each_line_table_case (test_layout_x_offset_display_tab);
+  for_each_line_table_case (test_layout_x_offset_display_utf8, true);
+  for_each_line_table_case (test_layout_x_offset_display_tab, true);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
   test_diagnostic_show_locus_unknown_location ();
 
-  for_each_line_table_case (test_diagnostic_show_locus_one_liner);
-  for_each_line_table_case (test_diagnostic_show_locus_one_liner_utf8);
-  for_each_line_table_case (test_add_location_if_nearby);
-  for_each_line_table_case (test_diagnostic_show_locus_fixit_lines);
-  for_each_line_table_case (test_fixit_consolidation);
-  for_each_line_table_case (test_overlapped_fixit_printing);
-  for_each_line_table_case (test_overlapped_fixit_printing_utf8);
-  for_each_line_table_case (test_overlapped_fixit_printing_2);
-  for_each_line_table_case (test_fixit_insert_containing_newline);
-  for_each_line_table_case (test_fixit_insert_containing_newline_2);
-  for_each_line_table_case (test_fixit_replace_containing_newline);
-  for_each_line_table_case (test_fixit_deletion_affecting_newline);
-  for_each_line_table_case (test_tab_expansion);
-  for_each_line_table_case (test_escaping_bytes_1);
-  for_each_line_table_case (test_escaping_bytes_2);
+  for_each_line_table_case (test_diagnostic_show_locus_one_liner, true);
+  for_each_line_table_case (test_diagnostic_show_locus_one_liner_utf8, true);
+  for_each_line_table_case (test_add_location_if_nearby, true);
+  for_each_line_table_case (test_diagnostic_show_locus_fixit_lines, true);
+  for_each_line_table_case (test_fixit_consolidation, true);
+  for_each_line_table_case (test_overlapped_fixit_printing, true);
+  for_each_line_table_case (test_overlapped_fixit_printing_utf8, true);
+  for_each_line_table_case (test_overlapped_fixit_printing_2, true);
+  for_each_line_table_case (test_fixit_insert_containing_newline, true);
+  for_each_line_table_case (test_fixit_insert_containing_newline_2, true);
+  for_each_line_table_case (test_fixit_replace_containing_newline, true);
+  for_each_line_table_case (test_fixit_deletion_affecting_newline, true);
+  for_each_line_table_case (test_tab_expansion, true);
+  for_each_line_table_case (test_escaping_bytes_1, true);
+  for_each_line_table_case (test_escaping_bytes_2, true);
 
   test_line_numbers_multiline_range ();
 }
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 7c7ee6da746..164b4206b41 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -771,13 +771,15 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (!includes_seen (context, map))
 	{
 	  bool first = true, need_inc = true, was_module = MAP_MODULE_P (map);
+	  const bool was_gen = (map->reason == LC_GEN);
 	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
 	      bool is_module = MAP_MODULE_P (map);
-	      s.file = LINEMAP_FILE (map);
+	      s.file = (map->reason == LC_GEN
+			? special_fname_generated () : LINEMAP_FILE (map));
 	      s.line = SOURCE_LINE (map, where);
 	      int col = -1;
 	      if (first && context->show_column)
@@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 		 N_("of module"),
 		 N_("In module imported at"),	/* 6 */
 		 N_("imported at"),
+		 N_("In buffer generated from"),   /* 8 */
 		};
 
-	      unsigned index = (was_module ? 6 : is_module ? 4
-				: need_inc ? 2 : 0) + !first;
+	      const unsigned index
+		= was_gen ? 8
+		: ((was_module ? 6 : is_module ? 4 : need_inc ? 2 : 0)
+		   + !first);
 
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : was_module ? ", " : ",\n",
@@ -2573,12 +2578,10 @@ assert_location_text (const char *expected_loc_text,
   dc.column_unit = column_unit;
   dc.column_origin = origin;
 
-  expanded_location xloc;
+  expanded_location xloc = {};
   xloc.file = filename;
   xloc.line = line;
   xloc.column = column;
-  xloc.data = NULL;
-  xloc.sysp = false;
 
   char *actual_loc_text = diagnostic_get_location_text (&dc, xloc);
   ASSERT_STREQ (expected_loc_text, actual_loc_text);
diff --git a/gcc/gcc-rich-location.cc b/gcc/gcc-rich-location.cc
index 0fa4239bd29..4836624f03c 100644
--- a/gcc/gcc-rich-location.cc
+++ b/gcc/gcc-rich-location.cc
@@ -78,7 +78,7 @@ static bool
 blank_line_before_p (location_t loc)
 {
   expanded_location exploc = expand_location (loc);
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   if (line.length () < (size_t)exploc.column)
diff --git a/gcc/input.cc b/gcc/input.cc
index 483cb6e940d..3cf5480551d 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -35,6 +35,12 @@ special_fname_builtin ()
   return _("<built-in>");
 }
 
+const char *
+special_fname_generated ()
+{
+  return _("<generated>");
+}
+
 /* Input charset configuration.  */
 static const char *default_charset_callback (const char *)
 {
@@ -58,7 +64,7 @@ public:
   ~file_cache_slot ();
 
   bool read_line_num (size_t line_num,
-		      char ** line, ssize_t *line_len);
+		      const char **line, ssize_t *line_len);
 
   /* Accessors.  */
   const char *get_file_path () const { return m_file_path; }
@@ -71,7 +77,8 @@ public:
   void inc_use_count () { m_use_count++; }
 
   bool create (const file_cache::input_context &in_context,
-	       const char *file_path, FILE *fp, unsigned highest_use_count);
+	       const char *file_path, FILE *fp, unsigned highest_use_count,
+	       unsigned int generated_data_len);
   void evict ();
 
  private:
@@ -106,8 +113,8 @@ public:
   void maybe_grow ();
   bool read_data ();
   bool maybe_read_data ();
-  bool get_next_line (char **line, ssize_t *line_len);
-  bool read_next_line (char ** line, ssize_t *line_len);
+  bool get_next_line (const char **line, ssize_t *line_len);
+  bool read_next_line (const char **line, ssize_t *line_len);
   bool goto_next_line ();
 
   static const size_t buffer_size = 4 * 1024;
@@ -126,10 +133,16 @@ public:
 
   FILE *m_fp;
 
-  /* This points to the content of the file that we've read so
-     far.  */
+  /* This is a buffer owned by this object, holding the content of the file
+     that we've read so far.  */
   char *m_data;
 
+  /* This is the current buffer from which to obtain data.  It is usually
+     equal to m_data, except when we are handling internally generated
+     content that already lives in memory and does not require a separate
+     buffer here.  */
+  const char *m_data_active;
+
   /* The allocated buffer to be freed may start a little earlier than DATA,
      e.g. if a UTF8 BOM was skipped at the beginning.  */
   int m_alloc_offset;
@@ -179,6 +192,7 @@ public:
     gcc_assert (m_data);
     m_alloc_offset += offset;
     m_data += offset;
+    m_data_active = m_data;
     m_size -= offset;
   }
 
@@ -282,6 +296,8 @@ expand_location_1 (location_t loc,
   xloc.data = block;
   if (loc <= BUILTINS_LOCATION)
     xloc.file = loc == UNKNOWN_LOCATION ? NULL : special_fname_builtin ();
+  else if (xloc.generated_data_len)
+    xloc.file = special_fname_generated ();
 
   return xloc;
 }
@@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
    num_file_slots files are cached.  */
 
 file_cache_slot*
-file_cache::add_file (const char *file_path)
+file_cache::add_file (const char *file_path, unsigned int generated_data_len)
 {
 
-  FILE *fp = fopen (file_path, "r");
-  if (fp == NULL)
-    return NULL;
+  FILE *fp;
+  if (generated_data_len)
+    fp = NULL;
+  else
+    {
+      fp = fopen (file_path, "r");
+      if (fp == NULL)
+	return NULL;
+    }
 
   unsigned highest_use_count = 0;
   file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
-  if (!r->create (in_context, file_path, fp, highest_use_count))
+  if (!r->create (in_context, file_path, fp, highest_use_count,
+		  generated_data_len))
     return NULL;
   return r;
 }
@@ -465,14 +488,13 @@ file_cache::add_file (const char *file_path)
 bool
 file_cache_slot::create (const file_cache::input_context &in_context,
 			 const char *file_path, FILE *fp,
-			 unsigned highest_use_count)
+			 unsigned highest_use_count,
+			 unsigned int generated_data_len)
 {
   m_file_path = file_path;
   if (m_fp)
     fclose (m_fp);
   m_fp = fp;
-  if (m_alloc_offset)
-    offset_buffer (-m_alloc_offset);
   m_nb_read = 0;
   m_line_start_idx = 0;
   m_line_num = 0;
@@ -480,9 +502,23 @@ file_cache_slot::create (const file_cache::input_context &in_context,
   /* Ensure that this cache entry doesn't get evicted next time
      add_file_to_cache_tab is called.  */
   m_use_count = ++highest_use_count;
-  m_total_lines = total_lines_num (file_path);
   m_missing_trailing_newline = true;
 
+  /* If this is generated data, then file_path points to it and we temporarily
+     source from there rather than from our own m_data buffer.  */
+  if (!m_fp)
+    {
+      gcc_assert (generated_data_len);
+      m_data_active = file_path;
+      m_nb_read = generated_data_len;
+      m_total_lines = 0;
+      return true;
+    }
+
+  m_data_active = m_data;
+  m_total_lines = total_lines_num (file_path);
+  if (m_alloc_offset)
+    offset_buffer (-m_alloc_offset);
 
   /* Check the input configuration to determine if we need to do any
      transformations, such as charset conversion or BOM skipping.  */
@@ -497,7 +533,7 @@ file_cache_slot::create (const file_cache::input_context &in_context,
 	return false;
       if (m_data)
 	XDELETEVEC (m_data);
-      m_data = cs.data;
+      m_data_active = m_data = cs.data;
       m_nb_read = m_size = cs.len;
       m_alloc_offset = cs.data - cs.to_free;
     }
@@ -535,11 +571,12 @@ file_cache::~file_cache ()
    it.  */
 
 file_cache_slot*
-file_cache::lookup_or_add_file (const char *file_path)
+file_cache::lookup_or_add_file (const char *file_path,
+				unsigned int generated_data_len)
 {
   file_cache_slot *r = lookup_file (file_path);
   if (r == NULL)
-    r = add_file (file_path);
+    r = add_file (file_path, generated_data_len);
   return r;
 }
 
@@ -547,7 +584,8 @@ file_cache::lookup_or_add_file (const char *file_path)
    diagnostic.  */
 
 file_cache_slot::file_cache_slot ()
-: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
+: m_use_count (0), m_file_path (NULL), m_fp (NULL),
+  m_data (0), m_data_active (0),
   m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
   m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
 {
@@ -599,6 +637,8 @@ file_cache_slot::needs_grow_p () const
 void
 file_cache_slot::maybe_grow ()
 {
+  gcc_checking_assert (m_data_active == m_data);
+
   if (!needs_grow_p ())
     return;
 
@@ -616,6 +656,8 @@ file_cache_slot::maybe_grow ()
       m_data = XRESIZEVEC (char, m_data, m_size);
       offset_buffer (offset);
     }
+
+  m_data_active = m_data;
 }
 
 /*  Read more data into the cache.  Extends the cache if need be.
@@ -624,6 +666,8 @@ file_cache_slot::maybe_grow ()
 bool
 file_cache_slot::read_data ()
 {
+  gcc_checking_assert (m_data_active == m_data);
+
   if (feof (m_fp) || ferror (m_fp))
     return false;
 
@@ -657,8 +701,8 @@ file_cache_slot::maybe_read_data ()
    terminator was not found.  We need to determine line endings in the same
    manner that libcpp does: any of \n, \r\n, or \r is a line ending.  */
 
-static char *
-find_end_of_line (char *s, size_t len)
+static const char *
+find_end_of_line (const char *s, size_t len)
 {
   for (const auto end = s + len; s != end; ++s)
     {
@@ -694,7 +738,7 @@ find_end_of_line (char *s, size_t len)
    make the content of *LINE invalid.  */
 
 bool
-file_cache_slot::get_next_line (char **line, ssize_t *line_len)
+file_cache_slot::get_next_line (const char **line, ssize_t *line_len)
 {
   /* Fill the cache with data to process.  */
   maybe_read_data ();
@@ -704,18 +748,18 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
     /* There is no more data to process.  */
     return false;
 
-  char *line_start = m_data + m_line_start_idx;
+  const char *line_start = m_data_active + m_line_start_idx;
 
-  char *next_line_start = NULL;
+  const char *next_line_start = NULL;
   size_t len = 0;
-  char *line_end = find_end_of_line (line_start, remaining_size);
+  const char *line_end = find_end_of_line (line_start, remaining_size);
   if (line_end == NULL)
     {
       /* We haven't found an end-of-line delimiter in the cache.
 	 Fill the cache with more data from the file and look again.  */
       while (maybe_read_data ())
 	{
-	  line_start = m_data + m_line_start_idx;
+	  line_start = m_data_active + m_line_start_idx;
 	  remaining_size = m_nb_read - m_line_start_idx;
 	  line_end = find_end_of_line (line_start, remaining_size);
 	  if (line_end != NULL)
@@ -734,7 +778,7 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 
 	     If the file ends in a \r, we didn't identify it as a line
 	     terminator above, so do that now instead.  */
-	  line_end = m_data + m_nb_read;
+	  line_end = m_data_active + m_nb_read;
 	  if (m_nb_read && line_end[-1] == '\r')
 	    {
 	      --line_end;
@@ -785,7 +829,7 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 	m_line_record.safe_push
 	  (file_cache_slot::line_info (m_line_num,
 				       m_line_start_idx,
-				       line_end - m_data));
+				       line_end - m_data_active));
       else if (m_total_lines > line_record_size)
 	{
 	  /* ... otherwise, we just scale total_lines down to
@@ -796,14 +840,14 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 	    m_line_record.safe_push
 	      (file_cache_slot::line_info (m_line_num,
 					   m_line_start_idx,
-					   line_end - m_data));
+					   line_end - m_data_active));
 	}
     }
 
   /* Update m_line_start_idx so that it points to the next line to be
      read.  */
   if (next_line_start)
-    m_line_start_idx = next_line_start - m_data;
+    m_line_start_idx = next_line_start - m_data_active;
   else
     /* We didn't find any terminal '\n'.  Let's consider that the end
        of line is the end of the data in the cache.  The next
@@ -826,7 +870,7 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 bool
 file_cache_slot::goto_next_line ()
 {
-  char *l;
+  const char *l;
   ssize_t len;
 
   return get_next_line (&l, &len);
@@ -841,7 +885,7 @@ file_cache_slot::goto_next_line ()
 
 bool
 file_cache_slot::read_line_num (size_t line_num,
-		       char ** line, ssize_t *line_len)
+				const char **line, ssize_t *line_len)
 {
   gcc_assert (line_num > 0);
 
@@ -894,7 +938,7 @@ file_cache_slot::read_line_num (size_t line_num,
 	  if (i && i->line_num == line_num)
 	    {
 	      /* We have the start/end of the line.  */
-	      *line = m_data + i->start_pos;
+	      *line = m_data_active + i->start_pos;
 	      *line_len = i->end_pos - i->start_pos;
 	      return true;
 	    }
@@ -931,30 +975,49 @@ file_cache_slot::read_line_num (size_t line_num,
    If the function fails, a NULL char_span is returned.  */
 
 char_span
-location_get_source_line (const char *file_path, int line)
+location_get_source_line (expanded_location xloc, int line)
 {
-  char *buffer = NULL;
-  ssize_t len;
+  xloc.line = line;
 
-  if (line == 0)
+  if (xloc.line == 0)
     return char_span (NULL, 0);
 
-  if (file_path == NULL)
+  if (xloc.file == NULL)
     return char_span (NULL, 0);
 
   diagnostic_file_cache_init ();
 
-  file_cache_slot *c = global_dc->m_file_cache->lookup_or_add_file (file_path);
+  file_cache_slot *c = global_dc->m_file_cache->lookup_or_add_file
+    (xloc.generated_data ? xloc.generated_data : xloc.file,
+     xloc.generated_data_len);
+
   if (c == NULL)
     return char_span (NULL, 0);
 
-  bool read = c->read_line_num (line, &buffer, &len);
+  const char *buffer = NULL;
+  ssize_t len;
+  bool read = c->read_line_num (xloc.line, &buffer, &len);
   if (!read)
     return char_span (NULL, 0);
 
   return char_span (buffer, len);
 }
 
+char_span
+location_get_source_line (expanded_location xloc)
+{
+  return location_get_source_line (xloc, xloc.line);
+}
+
+char_span
+location_get_source_line (const char *file_path, int line)
+{
+  expanded_location xloc = {};
+  xloc.file = file_path;
+  xloc.line = line;
+  return location_get_source_line (xloc);
+}
+
 /* Determine if FILE_PATH missing a trailing newline on its final line.
    Only valid to call once all of the file has been loaded, by
    requesting a line number beyond the end of the file.  */
@@ -964,7 +1027,8 @@ location_missing_trailing_newline (const char *file_path)
 {
   diagnostic_file_cache_init ();
 
-  file_cache_slot *c = global_dc->m_file_cache->lookup_or_add_file (file_path);
+  file_cache_slot *c
+    = global_dc->m_file_cache->lookup_or_add_file (file_path, 0);
   if (c == NULL)
     return false;
 
@@ -1110,9 +1174,10 @@ int
 location_compute_display_column (expanded_location exploc,
 				 const cpp_char_column_policy &policy)
 {
-  if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
+  if (!(exploc.file && (exploc.generated_data_len || *exploc.file)
+	&& exploc.line && exploc.column))
     return exploc.column;
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
@@ -1298,6 +1363,9 @@ dump_location_info (FILE *stream)
       case LC_ENTER_MACRO:
 	reason = "LC_RENAME_MACRO";
 	break;
+      case LC_GEN:
+	reason = "LC_GEN";
+	break;
       default:
 	reason = "Unknown";
       }
@@ -1327,8 +1395,7 @@ dump_location_info (FILE *stream)
 	    {
 	      /* Beginning of a new source line: draw the line.  */
 
-	      char_span line_text = location_get_source_line (exploc.file,
-							      exploc.line);
+	      char_span line_text = location_get_source_line (exploc);
 	      if (!line_text)
 		break;
 	      fprintf (stream,
@@ -1655,7 +1722,7 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
       if (start.column > finish.column)
 	return "range endpoints are reversed";
 
-      char_span line = location_get_source_line (start.file, start.line);
+      char_span line = location_get_source_line (start);
       if (!line)
 	return "unable to read source line";
 
@@ -1871,6 +1938,20 @@ get_num_source_ranges_for_substring (cpp_reader *pfile,
 
 /* Selftests of location handling.  */
 
+/* Wrapper around linemap_add to handle transparently adding either a tmp file,
+   or in-memory generated content.  */
+const line_map_ordinary *
+temp_source_file::do_linemap_add (int line)
+{
+  const line_map *map;
+  if (content_buf)
+    map = linemap_add (line_table, LC_GEN, false, content_buf,
+		       line, content_len);
+  else
+    map = linemap_add (line_table, LC_ENTER, false, get_filename (), line);
+  return linemap_check_ordinary (map);
+}
+
 /* Verify that compare() on linenum_type handles comparisons over the full
    range of the type.  */
 
@@ -1949,13 +2030,16 @@ assert_loceq (const char *exp_filename, int exp_linenum, int exp_colnum,
 class line_table_case
 {
 public:
-  line_table_case (int default_range_bits, int base_location)
+  line_table_case (int default_range_bits, int base_location,
+		   bool generated_data)
   : m_default_range_bits (default_range_bits),
-    m_base_location (base_location)
+    m_base_location (base_location),
+    m_generated_data (generated_data)
   {}
 
   int m_default_range_bits;
   int m_base_location;
+  bool m_generated_data;
 };
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -1972,6 +2056,7 @@ line_table_test::line_table_test ()
   gcc_assert (saved_line_table->round_alloc_size);
   line_table->round_alloc_size = saved_line_table->round_alloc_size;
   line_table->default_range_bits = 0;
+  m_generated_data = false;
 }
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -1993,6 +2078,7 @@ line_table_test::line_table_test (const line_table_case &case_)
       line_table->highest_location = case_.m_base_location;
       line_table->highest_line = case_.m_base_location;
     }
+  m_generated_data = case_.m_generated_data;
 }
 
 /* Destructor.  Restore the old value of line_table.  */
@@ -2012,7 +2098,10 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   line_table_test ltt (case_);
 
   /* Build a simple linemap describing some locations. */
-  linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
+  if (ltt.m_generated_data)
+    linemap_add (line_table, LC_GEN, false, "some data", 0, 10);
+  else
+    linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
 
   linemap_line_start (line_table, 1, 100);
   location_t loc_a = linemap_position_for_column (line_table, 1);
@@ -2062,21 +2151,23 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   linemap_add (line_table, LC_LEAVE, false, NULL, 0);
 
   /* Verify that we can recover the location info.  */
-  assert_loceq ("foo.c", 1, 1, loc_a);
-  assert_loceq ("foo.c", 1, 23, loc_b);
-  assert_loceq ("foo.c", 2, 1, loc_c);
-  assert_loceq ("foo.c", 2, 17, loc_d);
-  assert_loceq ("foo.c", 3, 700, loc_e);
-  assert_loceq ("foo.c", 4, 100, loc_back_to_short);
+  const auto fname
+    = (ltt.m_generated_data ? special_fname_generated () : "foo.c");
+  assert_loceq (fname, 1, 1, loc_a);
+  assert_loceq (fname, 1, 23, loc_b);
+  assert_loceq (fname, 2, 1, loc_c);
+  assert_loceq (fname, 2, 17, loc_d);
+  assert_loceq (fname, 3, 700, loc_e);
+  assert_loceq (fname, 4, 100, loc_back_to_short);
 
   /* In the very wide line, the initial location should be fully tracked.  */
-  assert_loceq ("foo.c", 5, 2000, loc_start_of_very_long_line);
+  assert_loceq (fname, 5, 2000, loc_start_of_very_long_line);
   /* ...but once we exceed LINE_MAP_MAX_COLUMN_NUMBER column-tracking should
      be disabled.  */
-  assert_loceq ("foo.c", 5, 0, loc_too_wide);
-  assert_loceq ("foo.c", 5, 0, loc_too_wide_2);
+  assert_loceq (fname, 5, 0, loc_too_wide);
+  assert_loceq (fname, 5, 0, loc_too_wide_2);
   /*...and column-tracking should be re-enabled for subsequent lines.  */
-  assert_loceq ("foo.c", 6, 10, loc_sane_again);
+  assert_loceq (fname, 6, 10, loc_sane_again);
 
   assert_loceq ("bar.c", 1, 150, loc_f);
 
@@ -2123,10 +2214,11 @@ test_make_location_nonpure_range_endpoints (const line_table_case &case_)
      with C++ frontend.
      ....................0000000001111111111222.
      ....................1234567890123456789012.  */
-  const char *content = "     r += !aaa == bbb;\n";
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  const char *content = "     r += !aaa == bbb;\n";
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   const location_t c11 = linemap_position_for_column (line_table, 11);
   const location_t c12 = linemap_position_for_column (line_table, 12);
@@ -3783,7 +3875,8 @@ static const location_t boundary_locations[] = {
 /* Run TESTCASE multiple times, once for each case in our test matrix.  */
 
 void
-for_each_line_table_case (void (*testcase) (const line_table_case &))
+for_each_line_table_case (void (*testcase) (const line_table_case &),
+			  bool test_generated_data)
 {
   /* As noted above in the description of struct line_table_case,
      we want to explore a test matrix of interesting line_table
@@ -3802,16 +3895,19 @@ for_each_line_table_case (void (*testcase) (const line_table_case &))
       const int num_boundary_locations = ARRAY_SIZE (boundary_locations);
       for (int loc_idx = 0; loc_idx < num_boundary_locations; loc_idx++)
 	{
-	  line_table_case c (default_range_bits, boundary_locations[loc_idx]);
-
-	  testcase (c);
-
-	  num_cases_tested++;
+	  /* ...and try both normal files, and internally generated data.  */
+	  for (int gen = 0; gen != 1+test_generated_data; ++gen)
+	    {
+	      line_table_case c (default_range_bits,
+				 boundary_locations[loc_idx], gen);
+	      testcase (c);
+	      num_cases_tested++;
+	    }
 	}
     }
 
   /* Verify that we fully covered the test matrix.  */
-  ASSERT_EQ (num_cases_tested, 2 * 12);
+  ASSERT_EQ (num_cases_tested, 2 * 12 * (1+test_generated_data));
 }
 
 /* Verify that when presented with a consecutive pair of locations with
@@ -3822,7 +3918,7 @@ for_each_line_table_case (void (*testcase) (const line_table_case &))
 static void
 test_line_offset_overflow ()
 {
-  line_table_test ltt (line_table_case (5, 0));
+  line_table_test ltt (line_table_case (5, 0, false));
 
   linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
   linemap_line_start (line_table, 1, 100);
@@ -3965,9 +4061,9 @@ input_cc_tests ()
   test_should_have_column_data_p ();
   test_unknown_location ();
   test_builtins ();
-  for_each_line_table_case (test_make_location_nonpure_range_endpoints);
+  for_each_line_table_case (test_make_location_nonpure_range_endpoints, true);
 
-  for_each_line_table_case (test_accessing_ordinary_linemaps);
+  for_each_line_table_case (test_accessing_ordinary_linemaps, true);
   for_each_line_table_case (test_lexer);
   for_each_line_table_case (test_lexer_string_locations_simple);
   for_each_line_table_case (test_lexer_string_locations_ebcdic);
diff --git a/gcc/input.h b/gcc/input.h
index 0b23e66e53b..c5a7b69400f 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -34,6 +34,7 @@ extern GTY(()) class line_maps *saved_line_table;
 
 /* Returns the translated string referring to the special location.  */
 const char *special_fname_builtin ();
+const char *special_fname_generated ();
 
 /* line-map.cc reserves RESERVED_LOCATION_COUNT to the user.  Ensure
    both UNKNOWN_LOCATION and BUILTINS_LOCATION fit into that.  */
@@ -114,6 +115,10 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+/* The version taking an exploc handles generated source too, and should be used
+   whenever possible.  */
+extern char_span location_get_source_line (expanded_location exploc);
+extern char_span location_get_source_line (expanded_location exploc, int line);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
@@ -136,7 +141,8 @@ class file_cache
   file_cache ();
   ~file_cache ();
 
-  file_cache_slot *lookup_or_add_file (const char *file_path);
+  file_cache_slot *lookup_or_add_file (const char *file_path,
+				       unsigned int generated_data_len);
   void forcibly_evict_file (const char *file_path);
 
   /* See comments in diagnostic.h about the input conversion context.  */
@@ -150,7 +156,8 @@ class file_cache
 
  private:
   file_cache_slot *evicted_cache_tab_entry (unsigned *highest_use_count);
-  file_cache_slot *add_file (const char *file_path);
+  file_cache_slot *add_file (const char *file_path,
+			     unsigned int generated_data_len);
   file_cache_slot *lookup_file (const char *file_path);
 
  private:
diff --git a/gcc/selftest.cc b/gcc/selftest.cc
index 89abfba5e80..179a41bde1c 100644
--- a/gcc/selftest.cc
+++ b/gcc/selftest.cc
@@ -163,14 +163,21 @@ assert_str_startswith (const location &loc,
 
 named_temp_file::named_temp_file (const char *suffix)
 {
-  m_filename = make_temp_file (suffix);
-  ASSERT_NE (m_filename, NULL);
+  if (suffix)
+    {
+      m_filename = make_temp_file (suffix);
+      ASSERT_NE (m_filename, NULL);
+    }
+  else
+    m_filename = nullptr;
 }
 
 /* Destructor.  Delete the tempfile.  */
 
 named_temp_file::~named_temp_file ()
 {
+  if (!m_filename)
+    return;
   unlink (m_filename);
   diagnostics_file_cache_forcibly_evict_file (m_filename);
   free (m_filename);
@@ -183,7 +190,9 @@ named_temp_file::~named_temp_file ()
 temp_source_file::temp_source_file (const location &loc,
 				    const char *suffix,
 				    const char *content)
-: named_temp_file (suffix)
+: named_temp_file (suffix),
+  content_buf (nullptr),
+  content_len (0)
 {
   FILE *out = fopen (get_filename (), "w");
   if (!out)
@@ -192,19 +201,37 @@ temp_source_file::temp_source_file (const location &loc,
   fclose (out);
 }
 
-/* As above, but with a size, to allow for NUL bytes in CONTENT.  */
+/* As above, but with a size, to allow for NUL bytes in CONTENT.  When
+   IS_GENERATED==true, the data is kept in memory instead, for testing LC_GEN
+   maps.  */
 
 temp_source_file::temp_source_file (const location &loc,
 				    const char *suffix,
 				    const char *content,
-				    size_t sz)
-: named_temp_file (suffix)
+				    size_t sz,
+				    bool is_generated)
+: named_temp_file (is_generated ? nullptr : suffix),
+  content_buf (is_generated ? XNEWVEC (char, sz) : nullptr),
+  content_len (is_generated ? sz : 0)
 {
-  FILE *out = fopen (get_filename (), "w");
-  if (!out)
-    fail_formatted (loc, "unable to open tempfile: %s", get_filename ());
-  fwrite (content, sz, 1, out);
-  fclose (out);
+  if (is_generated)
+    {
+      gcc_assert (sz); /* Empty generated content is not supported.  */
+      memcpy (content_buf, content, sz);
+    }
+  else
+    {
+      FILE *out = fopen (get_filename (), "w");
+      if (!out)
+	fail_formatted (loc, "unable to open tempfile: %s", get_filename ());
+      fwrite (content, sz, 1, out);
+      fclose (out);
+    }
+}
+
+temp_source_file::~temp_source_file ()
+{
+  XDELETEVEC (content_buf);
 }
 
 /* Avoid introducing locale-specific differences in the results
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 7568a6d24d4..ab1c9025349 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #if CHECKING_P
 
+struct line_map_ordinary;
+
 namespace selftest {
 
 /* A struct describing the source-location of a selftest, to make it
@@ -96,10 +98,9 @@ extern void assert_str_startswith (const location &loc,
 class named_temp_file
 {
  public:
-  named_temp_file (const char *suffix);
+  explicit named_temp_file (const char *suffix);
   ~named_temp_file ();
   const char *get_filename () const { return m_filename; }
-
  private:
   char *m_filename;
 };
@@ -113,7 +114,13 @@ class temp_source_file : public named_temp_file
   temp_source_file (const location &loc, const char *suffix,
 		    const char *content);
   temp_source_file (const location &loc, const char *suffix,
-		    const char *content, size_t sz);
+		    const char *content, size_t sz,
+		    bool is_generated = false);
+  ~temp_source_file ();
+
+  char *const content_buf;
+  const size_t content_len;
+  const line_map_ordinary *do_linemap_add (int line); /* In input.cc */
 };
 
 /* RAII-style class for avoiding introducing locale-specific differences
@@ -171,6 +178,10 @@ class line_table_test
 
   /* Destructor.  Restore the saved line_table.  */
   ~line_table_test ();
+
+  /* When this is enabled in the line_table_case, test storing all the data
+     in memory rather than a file.  */
+  bool m_generated_data;
 };
 
 /* Helper function for selftests that need a function decl.  */
@@ -183,7 +194,8 @@ extern tree make_fndecl (tree return_type,
 /* Run TESTCASE multiple times, once for each case in our test matrix.  */
 
 extern void
-for_each_line_table_case (void (*testcase) (const line_table_case &));
+for_each_line_table_case (void (*testcase) (const line_table_case &),
+			  bool test_generated_data = false);
 
 /* Read the contents of PATH into memory, returning a 0-terminated buffer
    that must be freed by the caller.
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index baa6b629b83..29e653625f8 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -430,7 +430,7 @@ test_show_locus (function *fun)
      to upper case.  Give all of the ranges labels (sharing one label).  */
   if (0 == strcmp (fnname, "test_many_nested_locations"))
     {
-      const char *file = LOCATION_FILE (fnstart);
+      const expanded_location xloc = expand_location (fnstart);
       const int start_line = fnstart_line + 2;
       const int finish_line = start_line + 7;
       location_t loc = get_loc (start_line - 1, 2);
@@ -438,7 +438,7 @@ test_show_locus (function *fun)
       rich_location richloc (line_table, loc);
       for (int line = start_line; line <= finish_line; line++)
 	{
-	  char_span content = location_get_source_line (file, line);
+	  char_span content = location_get_source_line (xloc, line);
 	  gcc_assert (content);
 	  /* Split line up into words.  */
 	  for (int idx = 0; idx < content.length (); idx++)
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 50207cacc12..eb281809cbd 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -75,6 +75,8 @@ enum lc_reason
   LC_RENAME_VERBATIM,	/* Likewise, but "" != stdin.  */
   LC_ENTER_MACRO,	/* Begin macro expansion.  */
   LC_MODULE,		/* A (C++) Module.  */
+  LC_GEN,		/* Internally generated source.  */
+
   /* FIXME: add support for stringize and paste.  */
   LC_HWM /* High Water Mark.  */
 };
@@ -437,7 +439,11 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
 
   /* Pointer alignment boundary on both 32 and 64-bit systems.  */
 
-  const char *to_file;
+  /* This GTY markup is in case this is an LC_GEN map, in which case
+     to_file actually points to the generated data, which we do not
+     want to require to be free of null bytes.  */
+  const char * GTY((string_length ("%h.to_file_len"))) to_file;
+  unsigned int to_file_len;
   linenum_type to_line;
 
   /* Location from whence this line map was included.  For regular
@@ -1101,13 +1107,15 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
    at least as long as the lifetime of SET.  An empty
    TO_FILE means standard input.  If reason is LC_LEAVE, and
    TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
-   natural values considering the file we are returning to.
+   natural values considering the file we are returning to.  If reason
+   is LC_GEN, then TO_FILE is not a file name, but rather the actual
+   content, and TO_FILE_LEN>0 is the length of it.
 
    A call to this function can relocate the previous set of
    maps, so any stored line_map pointers should not be used.  */
 extern const line_map *linemap_add
   (class line_maps *, enum lc_reason, unsigned int sysp,
-   const char *to_file, linenum_type to_line);
+   const char *to_file, linenum_type to_line, unsigned int to_file_len = 0);
 
 /* Create a macro map.  A macro map encodes source locations of tokens
    that are part of a macro replacement-list, at a macro expansion
@@ -1304,7 +1312,8 @@ linemap_location_before_p (class line_maps *set,
 
 typedef struct
 {
-  /* The name of the source file involved.  */
+  /* The name of the source file involved, or NULL if
+     generated_data is non-NULL.  */
   const char *file;
 
   /* The line-location in the source file.  */
@@ -1316,6 +1325,10 @@ typedef struct
 
   /* In a system header?. */
   bool sysp;
+
+  /* If generated data, the data and its length.  */
+  unsigned int generated_data_len;
+  const char *generated_data;
 } expanded_location;
 
 class range_label;
diff --git a/libcpp/line-map.cc b/libcpp/line-map.cc
index 50e8043255e..2838d1103b0 100644
--- a/libcpp/line-map.cc
+++ b/libcpp/line-map.cc
@@ -513,13 +513,18 @@ LAST_SOURCE_LINE_LOCATION (const line_map_ordinary *map)
    TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
    natural values considering the file we are returning to.
 
+   If reason is LC_GEN, then TO_FILE is not a file name, but is
+   rather the actual generated content, and TO_FILE_LEN > 0 is the number
+   of bytes it contains.  Otherwise TO_FILE_LEN should be set to 0.
+
    FROM_LINE should be monotonic increasing across calls to this
    function.  A call to this function can relocate the previous set of
    maps, so any stored line_map pointers should not be used.  */
 
 const struct line_map *
 linemap_add (line_maps *set, enum lc_reason reason,
-	     unsigned int sysp, const char *to_file, linenum_type to_line)
+	     unsigned int sysp, const char *to_file, linenum_type to_line,
+	     unsigned int to_file_len)
 {
   /* Generate a start_location above the current highest_location.
      If possible, make the low range bits be zero.  */
@@ -535,8 +540,20 @@ linemap_add (line_maps *set, enum lc_reason reason,
 		      >= MAP_START_LOCATION (LINEMAPS_LAST_ORDINARY_MAP (set))));
 
   /* When we enter the file for the first time reason cannot be
-     LC_RENAME.  */
-  linemap_assert (!(set->depth == 0 && reason == LC_RENAME));
+     LC_RENAME.  To keep things simple, don't track LC_RENAME for
+     LC_GEN maps, but just keep their reason as always LC_GEN.  */
+  if (reason == LC_RENAME)
+    {
+      linemap_assert (set->depth != 0);
+      const auto prev = LINEMAPS_LAST_ORDINARY_MAP (set);
+      linemap_assert (prev);
+      if (prev->reason == LC_GEN)
+	{
+	  reason = LC_GEN;
+	  to_file = prev->to_file;
+	  to_file_len = prev->to_file_len;
+	}
+    }
 
   /* If we are leaving the main file, return a NULL map.  */
   if (reason == LC_LEAVE
@@ -557,7 +574,8 @@ linemap_add (line_maps *set, enum lc_reason reason,
     = linemap_check_ordinary (new_linemap (set, start_location));
   map->reason = reason;
 
-  if (to_file && *to_file == '\0' && reason != LC_RENAME_VERBATIM)
+  if (to_file && *to_file == '\0' && reason != LC_RENAME_VERBATIM
+      && reason != LC_GEN)
     to_file = "<stdin>";
 
   if (reason == LC_RENAME_VERBATIM)
@@ -591,6 +609,15 @@ linemap_add (line_maps *set, enum lc_reason reason,
 
   map->sysp = sysp;
   map->to_file = to_file;
+
+  if (reason == LC_GEN)
+    {
+      gcc_assert (to_file_len);
+      map->to_file_len = to_file_len;
+    }
+  else
+    map->to_file_len = (to_file_len ? to_file_len : strlen (to_file) + 1);
+
   map->to_line = to_line;
   LINEMAPS_ORDINARY_CACHE (set) = LINEMAPS_ORDINARY_USED (set) - 1;
   /* Do not store range_bits here.  That's readjusted in
@@ -606,7 +633,7 @@ linemap_add (line_maps *set, enum lc_reason reason,
      pure_location_p.  */
   linemap_assert (pure_location_p (set, start_location));
 
-  if (reason == LC_ENTER)
+  if (reason == LC_ENTER || reason == LC_GEN)
     {
       if (set->depth == 0)
 	map->included_from = 0;
@@ -617,7 +644,7 @@ linemap_add (line_maps *set, enum lc_reason reason,
 	      & ~((1 << map[-1].m_column_and_range_bits) - 1))
 	     + map[-1].start_location);
       set->depth++;
-      if (set->trace_includes)
+      if (set->trace_includes && reason == LC_ENTER)
 	trace_include (set, map);
     }
   else if (reason == LC_RENAME)
@@ -864,7 +891,7 @@ linemap_line_start (line_maps *set, linenum_type to_line,
 		  (linemap_add (set, LC_RENAME,
 				ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
 				ORDINARY_MAP_FILE_NAME (map),
-				to_line)));
+				to_line, map->to_file_len)));
       map->m_column_and_range_bits = column_bits;
       map->m_range_bits = range_bits;
       r = (MAP_START_LOCATION (map)
@@ -1853,8 +1880,14 @@ linemap_expand_location (line_maps *set,
 	abort ();
 
       const line_map_ordinary *ord_map = linemap_check_ordinary (map);
+      if (ord_map->reason == LC_GEN)
+	{
+	  xloc.generated_data = ord_map->to_file;
+	  xloc.generated_data_len = ord_map->to_file_len;
+	}
+      else
+	xloc.file = LINEMAP_FILE (ord_map);
 
-      xloc.file = LINEMAP_FILE (ord_map);
       xloc.line = SOURCE_LINE (ord_map, loc);
       xloc.column = SOURCE_COLUMN (ord_map, loc);
       xloc.sysp = LINEMAP_SYSP (ord_map) != 0;
@@ -1873,7 +1906,7 @@ linemap_dump (FILE *stream, class line_maps *set, unsigned ix, bool is_macro)
 {
   const char *const lc_reasons_v[LC_HWM]
       = { "LC_ENTER", "LC_LEAVE", "LC_RENAME", "LC_RENAME_VERBATIM",
-	  "LC_ENTER_MACRO", "LC_MODULE" };
+	  "LC_ENTER_MACRO", "LC_MODULE", "LC_GEN" };
   const line_map *map;
   unsigned reason;
 

* [PATCH 5/6] diagnostics: Support generated data in additional contexts
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
                   ` (3 preceding siblings ...)
  2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  2022-11-04 16:42   ` David Malcolm
  2022-11-04 13:44 ` [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
  5 siblings, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

Teach the remaining parts of the diagnostics code that care about source
text that a diagnostic location may refer to a generated in-memory buffer
rather than an actual file. The most notable case is SARIF output, which
needs to obtain its own snapshots of the code involved. For edit context
output, which outputs fixit hints as diffs, for now just make sure we ignore
generated data buffers; at the moment, a fixit hint cannot be generated in
such a buffer.
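
As a rough illustration of the file-or-buffer distinction described above,
here is a minimal sketch. It is not the code from this patch: xloc_sketch is
a hypothetical, cut-down stand-in for expanded_location, and
xloc_to_fb_sketch and describe_source are made-up names. It assumes the same
convention that lookup_or_add_file uses in this series, namely that a length
of 0 means the pointer is a file name and a nonzero length means it is
generated data.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for libcpp's expanded_location; only
// the fields relevant to this sketch are modelled.
struct xloc_sketch
{
  const char *file;                 // file name, or nullptr for generated data
  const char *generated_data;       // in-memory buffer, or nullptr
  unsigned int generated_data_len;  // nonzero only for generated data
};

// Same shape as the filename_or_buffer typedef added to sarif_builder.
// Assumed convention: second == 0 means first is a file name, otherwise
// first points at second bytes of generated data.
typedef std::pair<const char *, unsigned int> filename_or_buffer;

static filename_or_buffer
xloc_to_fb_sketch (const xloc_sketch &xloc)
{
  if (xloc.generated_data)
    return filename_or_buffer (xloc.generated_data, xloc.generated_data_len);
  return filename_or_buffer (xloc.file, 0);
}

// Hypothetical consumer: describe where the source text comes from without
// ever treating a generated buffer as a NUL-terminated path.
static std::string
describe_source (const filename_or_buffer &fb)
{
  if (fb.second)
    return "<generated: " + std::to_string (fb.second) + " bytes>";
  return fb.first ? std::string (fb.first) : std::string ("<unknown>");
}
```

Carrying the length alongside the pointer means a consumer only has to test
the length to know which interpretation applies.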

Because SARIF output is emitted as JSON, also let the json::string class
handle a buffer with NUL bytes in the middle (since we place no restriction
on LC_GEN content) by providing the option to specify the data length.
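
To show why an explicit length matters here, the following is a minimal
sketch of quoting a length-delimited buffer as a JSON string. It is not the
json::string implementation from this patch; json_quote is a hypothetical
standalone helper. Anything that relies on strlen would stop at the first
NUL byte, whereas iterating over an explicit length lets the NUL be emitted
as an escape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not the json::string API): quote LEN bytes of BUF
// as a JSON string literal.  BUF need not be NUL-terminated and may
// contain embedded NUL bytes.
static std::string
json_quote (const char *buf, size_t len)
{
  std::string out = "\"";
  for (size_t i = 0; i < len; i++)
    {
      const unsigned char ch = buf[i];
      switch (ch)
	{
	case '"':  out += "\\\""; break;
	case '\\': out += "\\\\"; break;
	case '\n': out += "\\n"; break;
	case '\r': out += "\\r"; break;
	case '\t': out += "\\t"; break;
	default:
	  if (ch < 0x20)
	    {
	      /* Other control characters, including NUL, must be written
		 as \uXXXX escapes in JSON.  */
	      char esc[7];
	      snprintf (esc, sizeof esc, "\\u%04x", (unsigned) ch);
	      out += esc;
	    }
	  else
	    out += (char) ch;
	}
    }
  out += '"';
  return out;
}
```

Bytes at or above 0x80 are passed through unchanged in this sketch, so it
assumes the buffer is already valid UTF-8.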

gcc/ChangeLog:

	* diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New function.
	(sarif_builder::maybe_make_physical_location_object): Support
	generated data locations.
	(sarif_builder::make_artifact_location_object): Likewise.
	(sarif_builder::maybe_make_region_object_for_context): Likewise.
	(sarif_builder::make_artifact_object): Likewise.
	(sarif_builder::maybe_make_artifact_content_object): Likewise.
	(get_source_lines): Likewise.
	* edit-context.cc (edit_context::apply_fixit): Ignore generated
	locations if one should make its way this far.
	* json.cc (string::string): Support non-null-terminated string.
	(string::print): Likewise.
	* json.h (class string): Likewise.
---
 gcc/diagnostic-format-sarif.cc | 86 +++++++++++++++++++++-------------
 gcc/edit-context.cc            |  4 ++
 gcc/json.cc                    | 17 +++++--
 gcc/json.h                     |  5 +-
 4 files changed, 75 insertions(+), 37 deletions(-)

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 7110db4edd6..c2d18a1a16e 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -125,7 +125,10 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+
+  typedef std::pair<const char *, unsigned int> filename_or_buffer;
+  json::object *make_artifact_location_object (filename_or_buffer fb);
+
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -146,16 +149,17 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
-						    int start_line,
+  json::object *make_artifact_object (filename_or_buffer fb);
+  json::object *
+  maybe_make_artifact_content_object (filename_or_buffer fb) const;
+  json::object *maybe_make_artifact_content_object (expanded_location xloc,
 						    int end_line) const;
   json::object *make_fix_object (const rich_location &rich_loc);
   json::object *make_artifact_change_object (const rich_location &richloc);
   json::object *make_replacement_object (const fixit_hint &hint) const;
   json::object *make_artifact_content_object (const char *text) const;
   int get_sarif_column (expanded_location exploc) const;
+  static filename_or_buffer xloc_to_fb (expanded_location xloc);
 
   diagnostic_context *m_context;
 
@@ -166,7 +170,11 @@ private:
      diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set <const char *> m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+     with that length, not a filename.  */
+  hash_set <pair_hash <nofree_ptr_hash <const char>,
+		       int_hash <unsigned int, -1U> >
+	    > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set <free_string_hash> m_rule_id_set;
   json::array *m_rules_arr;
@@ -588,6 +596,15 @@ sarif_builder::make_location_object (const diagnostic_event &event)
   return location_obj;
 }
 
+/* Populate a filename_or_buffer pair from an expanded location.  */
+sarif_builder::filename_or_buffer
+sarif_builder::xloc_to_fb (expanded_location xloc)
+{
+  if (xloc.generated_data_len)
+    return filename_or_buffer (xloc.generated_data, xloc.generated_data_len);
+  return filename_or_buffer (xloc.file, 0);
+}
+
 /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC,
    or return NULL;
    Add any filename to the m_artifacts.  */
@@ -603,7 +620,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  m_filenames.add (xloc_to_fb (expand_location (loc)));
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -627,7 +644,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (xloc_to_fb (expand_location (loc)));
 }
 
 /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0 section 3.4.4)
@@ -639,10 +656,12 @@ sarif_builder::make_artifact_location_object (location_t loc)
    or return NULL.  */
 
 json::object *
-sarif_builder::make_artifact_location_object (const char *filename)
+sarif_builder::make_artifact_location_object (filename_or_buffer fb)
 {
   json::object *artifact_loc_obj = new json::object ();
 
+  const auto filename = (fb.second ? special_fname_generated () : fb.first);
+
   /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
   artifact_loc_obj->set ("uri", new json::string (filename));
 
@@ -795,9 +814,7 @@ sarif_builder::maybe_make_region_object_for_context (location_t loc) const
 
   /* "snippet" property (SARIF v2.1.0 section 3.30.13).  */
   if (json::object *artifact_content_obj
-	 = maybe_make_artifact_content_object (exploc_start.file,
-					       exploc_start.line,
-					       exploc_finish.line))
+	= maybe_make_artifact_content_object (exploc_start, exploc_finish.line))
     region_obj->set ("snippet", artifact_content_obj);
 
   return region_obj;
@@ -1248,24 +1265,24 @@ sarif_builder::maybe_make_cwe_taxonomy_object () const
 /* Make an artifact object (SARIF v2.1.0 section 3.24).  */
 
 json::object *
-sarif_builder::make_artifact_object (const char *filename)
+sarif_builder::make_artifact_object (filename_or_buffer fb)
 {
   json::object *artifact_obj = new json::object ();
 
   /* "location" property (SARIF v2.1.0 section 3.24.2).  */
-  json::object *artifact_loc_obj = make_artifact_location_object (filename);
+  json::object *artifact_loc_obj = make_artifact_location_object (fb);
   artifact_obj->set ("location", artifact_loc_obj);
 
   /* "contents" property (SARIF v2.1.0 section 3.24.8).  */
   if (json::object *artifact_content_obj
-	= maybe_make_artifact_content_object (filename))
+	= maybe_make_artifact_content_object (fb))
     artifact_obj->set ("contents", artifact_content_obj);
 
   /* "sourceLanguage" property (SARIF v2.1.0 section 3.24.10).  */
   if (m_context->m_client_data_hooks)
     if (const char *source_lang
 	= m_context->m_client_data_hooks->maybe_get_sarif_source_language
-	    (filename))
+	    (fb.first))
       artifact_obj->set ("sourceLanguage", new json::string (source_lang));
 
   return artifact_obj;
@@ -1331,16 +1348,21 @@ maybe_read_file (const char *filename)
    full contents of FILENAME.  */
 
 json::object *
-sarif_builder::maybe_make_artifact_content_object (const char *filename) const
+sarif_builder::maybe_make_artifact_content_object (filename_or_buffer fb) const
 {
-  char *text_utf8 = maybe_read_file (filename);
-  if (!text_utf8)
-    return NULL;
-
-  json::object *artifact_content_obj = new json::object ();
-  artifact_content_obj->set ("text", new json::string (text_utf8));
-  free (text_utf8);
-
+  json::object *artifact_content_obj = nullptr;
+  if (fb.second)
+    {
+      artifact_content_obj = new json::object ();
+      artifact_content_obj->set ("text", new json::string (fb.first,
+							   fb.second));
+    }
+  else if (char *text_utf8 = maybe_read_file (fb.first))
+    {
+      artifact_content_obj = new json::object ();
+      artifact_content_obj->set ("text", new json::string (text_utf8));
+      free (text_utf8);
+    }
   return artifact_content_obj;
 }
 
@@ -1348,15 +1370,14 @@ sarif_builder::maybe_make_artifact_content_object (const char *filename) const
    a freshly-allocated 0-terminated buffer containing them, or NULL.  */
 
 static char *
-get_source_lines (const char *filename,
-		  int start_line,
+get_source_lines (expanded_location xloc,
 		  int end_line)
 {
   auto_vec<char> result;
 
-  for (int line = start_line; line <= end_line; line++)
+  for (int line = xloc.line; line <= end_line; line++)
     {
-      char_span line_content = location_get_source_line (filename, line);
+      char_span line_content = location_get_source_line (xloc, line);
       if (!line_content.get_buffer ())
 	return NULL;
       result.reserve (line_content.length () + 1);
@@ -1370,14 +1391,13 @@ get_source_lines (const char *filename,
 }
 
 /* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the given
-   run of lines within FILENAME (including the endpoints).  */
+   run of lines starting at XLOC (including the endpoints).  */
 
 json::object *
-sarif_builder::maybe_make_artifact_content_object (const char *filename,
-						   int start_line,
+sarif_builder::maybe_make_artifact_content_object (expanded_location xloc,
 						   int end_line) const
 {
-  char *text_utf8 = get_source_lines (filename, start_line, end_line);
+  char *text_utf8 = get_source_lines (xloc, end_line);
 
   if (!text_utf8)
     return NULL;
diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
index 6879ddd41b4..aa95bc0834f 100644
--- a/gcc/edit-context.cc
+++ b/gcc/edit-context.cc
@@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint *hint)
     return false;
   if (start.column == 0)
     return false;
+  if (start.generated_data)
+    return false;
   if (next_loc.column == 0)
     return false;
+  if (next_loc.generated_data)
+    return false;
 
   edited_file &file = get_or_insert_file (start.file);
   if (!m_valid)
diff --git a/gcc/json.cc b/gcc/json.cc
index 974f8c36825..3ebe8495e96 100644
--- a/gcc/json.cc
+++ b/gcc/json.cc
@@ -190,6 +190,15 @@ string::string (const char *utf8)
 {
   gcc_assert (utf8);
   m_utf8 = xstrdup (utf8);
+  m_len = strlen (utf8);
+}
+
+string::string (const char *utf8, size_t len)
+{
+  gcc_assert (utf8);
+  m_utf8 = XNEWVEC (char, len);
+  m_len = len;
+  memcpy (m_utf8, utf8, len);
 }
 
 /* Implementation of json::value::print for json::string.  */
@@ -198,9 +207,9 @@ void
 string::print (pretty_printer *pp) const
 {
   pp_character (pp, '"');
-  for (const char *ptr = m_utf8; *ptr; ptr++)
+  for (size_t i = 0; i != m_len; ++i)
     {
-      char ch = *ptr;
+      char ch = m_utf8[i];
       switch (ch)
 	{
 	case '"':
@@ -224,7 +233,9 @@ string::print (pretty_printer *pp) const
 	case '\t':
 	  pp_string (pp, "\\t");
 	  break;
-
+	case '\0':
+	  pp_string (pp, "\\u0000");
+	  break;
 	default:
 	  pp_character (pp, ch);
 	}
diff --git a/gcc/json.h b/gcc/json.h
index f272981259b..f7afd843dc5 100644
--- a/gcc/json.h
+++ b/gcc/json.h
@@ -156,16 +156,19 @@ class integer_number : public value
 class string : public value
 {
  public:
-  string (const char *utf8);
+  explicit string (const char *utf8);
+  string (const char *utf8, size_t len);
   ~string () { free (m_utf8); }
 
   enum kind get_kind () const final override { return JSON_STRING; }
   void print (pretty_printer *pp) const final override;
 
   const char *get_string () const { return m_utf8; }
+  size_t get_length () const { return m_len; }
 
  private:
   char *m_utf8;
+  size_t m_len;
 };
 
 /* Subclass of value for the three JSON literals "true", "false",

* [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings
  2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
                   ` (4 preceding siblings ...)
  2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
@ 2022-11-04 13:44 ` Lewis Hyatt
  5 siblings, 0 replies; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 13:44 UTC (permalink / raw)
  To: gcc-patches; +Cc: Lewis Hyatt

Currently, the tokens obtained from a destringized _Pragma string do not get
assigned proper locations while they are being lexed.  After the tokens have
been obtained, they are reassigned the same location as the _Pragma token,
which is sufficient to make things like _Pragma("GCC diagnostic ignored...")
operate correctly, but this still results in inferior diagnostics, since the
diagnostics do not point to the problematic tokens.  Further, if a diagnostic
is issued by libcpp during the lexing of the tokens, as opposed to being
issued by the frontend during the processing of the pragma, then the
patched-up location is not yet in place, and the user instead sees an invalid
location that may be near the _Pragma string in some cases, or potentially
very far away, depending on the macro expansion history.  For example:

=====
_Pragma("GCC diagnostic ignored \"oops")
=====

produces the diagnostic:

file.cpp:1:24: warning: missing terminating " character
    1 | _Pragma("GCC diagnostic ignored \"oops")
      |                        ^

with the caret in a nonsensical location, while this one:

=====
 #define S "GCC diagnostic ignored \"oops"
_Pragma(S)
=====

produces:

file.cpp:2:24: warning: missing terminating " character
    2 | _Pragma(S)
      |                        ^

with the caret in a nonsensical location and the relevant context completely
absent.

Fix this by assigning proper locations using the new LC_GEN type of linemap.
Now the tokens are given locations inside a generated content buffer, and the
macro expansion stack is modified to be aware that these tokens logically
belong to the "expansion" of the _Pragma directive. For the above examples we
now output:

======
In buffer generated from file.cpp:1:
<generated>:1:24: warning: missing terminating " character
    1 | GCC diagnostic ignored "oops
      |                        ^
file.cpp:1:1: note: in <_Pragma directive>
    1 | _Pragma("GCC diagnostic ignored \"oops")
      | ^~~~~~~
======

and

======
<generated>:1:24: warning: missing terminating " character
    1 | GCC diagnostic ignored "oops
      |                        ^
file.cpp:2:1: note: in <_Pragma directive>
    2 | _Pragma(S)
      | ^~~~~~~
======

Now the carets point to something meaningful and all relevant context
appears in the diagnostic.  For the second example, it would be nice if the
macro expansion trace also output "in expansion of macro S"; however, doing
that for the general case of macro expansions makes the logic very
complicated, since it has to be done after the fact, when the macro maps have
already been constructed.  It doesn't seem worth it for this case, given that
the _Pragma string has already been output once on the first line.
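
To see why column 24 is the meaningful position in the examples above, here is
a small standalone sketch of the destringizing step described for _Pragma in
C11 6.10.9: the enclosing quotes are dropped, \" becomes " and \\ becomes \.
The destringize and column_of functions are hypothetical helpers written for
this example, not libcpp's destringize_and_run, and they handle only plain
string literals (no raw strings or encoding prefixes).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: destringize a plain string literal LIT, which must
// include its enclosing double quotes.  Returns an empty string if LIT is
// not of that form.
static std::string
destringize (const std::string &lit)
{
  std::string out;
  if (lit.size () < 2 || lit.front () != '"' || lit.back () != '"')
    return out;
  for (std::size_t i = 1; i + 1 < lit.size (); ++i)
    {
      // Skip the backslash of \" or \\ and keep the escaped character.
      if (lit[i] == '\\' && i + 2 < lit.size ()
	  && (lit[i + 1] == '"' || lit[i + 1] == '\\'))
	++i;
      out += lit[i];
    }
  return out;
}

// Hypothetical helper: 1-based column of the first occurrence of CH in
// BUF, or 0 if CH does not occur.
static std::size_t
column_of (const std::string &buf, char ch)
{
  const std::size_t pos = buf.find (ch);
  return pos == std::string::npos ? 0 : pos + 1;
}
```

Applied to the string from the first example, the result is the text shown in
the generated buffer, and the unterminated quote sits at column 24 of that
buffer, which is where the new diagnostic places the caret.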

gcc/ChangeLog:

	* tree-diagnostic.cc (maybe_unwind_expanded_macro_loc): Add awareness
	of _Pragma directive to the macro expansion trace.

libcpp/ChangeLog:

	* directives.cc (get_token_no_padding): Add argument to receive the
	virtual location of the token.
	(get__Pragma_string): Likewise.
	(do_pragma): Set pfile->directive_result->src_loc properly; it should
	not be a virtual location.
	(destringize_and_run): Update to provide proper locations for the
	_Pragma string tokens.  Support raw strings.
	(_cpp_do__Pragma): Adapt to changes to the helper functions.
	* errors.cc (cpp_diagnostic_at): Support
	cpp_reader::diagnostic_rebase_loc.
	(cpp_diagnostic_with_line): Likewise.
	* include/line-map.h (class rich_location): Add new member
	forget_cached_expanded_locations().
	* internal.h (struct _cpp__Pragma_state): Define new struct.
	(_cpp_rebase_diagnostic_location): Declare new function.
	(struct cpp_reader): Add diagnostic_rebase_loc member.
	(_cpp_push__Pragma_token_context): Declare new function.
	(_cpp_do__Pragma): Adjust prototype.
	* macro.cc (pragma_str): New static var.
	(builtin_macro): Adapt to new implementation of _Pragma processing.
	(_cpp_pop_context): Fix the logic for resetting
	pfile->top_most_macro_node, which previously was never triggered,
	although the error seems to have been harmless.
	(_cpp_push__Pragma_token_context): New function.
	(_cpp_rebase_diagnostic_location): New function.

gcc/c-family/ChangeLog:

	* c-ppoutput.cc (token_streamer::stream): Pass the virtual location of
	the _Pragma token to maybe_print_line(), not the spelling location.

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adjust for new
	macro tracking output for _Pragma directives.
	* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/diagnostic-pragma-1.c: Adjust for new macro
	tracking output for _Pragma directives.
	* c-c++-common/cpp/pr57580.c: Likewise.
	* c-c++-common/gomp/pragma-3.c: Likewise.
	* c-c++-common/gomp/pragma-5.c: Likewise.
	* g++.dg/pch/operator-1.C: Likewise.
	* gcc.dg/cpp/pr28165.c: Likewise.
	* gcc.dg/cpp/pr35322.c: Likewise.
	* gcc.dg/dfp/pragma-float-const-decimal64-4.c: Likewise.
	* gcc.dg/dfp/pragma-float-const-decimal64-5.c: Likewise.
	* gcc.dg/dfp/pragma-float-const-decimal64-6.c: Likewise.
	* gcc.dg/gomp/macro-4.c: Likewise.
	* gcc.dg/pragma-message.c: Likewise.
	* c-c++-common/pragma-diag-17.c: New test.
	* c-c++-common/pragma-diag-18.c: New test.
	* g++.dg/cpp/pragma-raw-string.C: New test.
	* g++.dg/pch/LC_GEN-maps.C: New test.
	* g++.dg/pch/LC_GEN-maps.Hs: New test.
	* lib/prune.exp: Support pruning new _Pragma include trace.
---
 gcc/c-family/c-ppoutput.cc                    |   2 +-
 .../c-c++-common/cpp/diagnostic-pragma-1.c    |   1 +
 gcc/testsuite/c-c++-common/cpp/pr57580.c      |   2 +-
 gcc/testsuite/c-c++-common/gomp/pragma-3.c    |   3 +-
 gcc/testsuite/c-c++-common/gomp/pragma-5.c    |   3 +-
 gcc/testsuite/c-c++-common/pragma-diag-17.c   |  35 +++
 gcc/testsuite/c-c++-common/pragma-diag-18.c   |  18 ++
 gcc/testsuite/g++.dg/cpp/pragma-raw-string.C  |  16 +
 gcc/testsuite/g++.dg/pch/LC_GEN-maps.C        |  20 ++
 gcc/testsuite/g++.dg/pch/LC_GEN-maps.Hs       |   5 +
 gcc/testsuite/g++.dg/pch/operator-1.C         |   1 +
 gcc/testsuite/gcc.dg/cpp/pr28165.c            |   1 +
 gcc/testsuite/gcc.dg/cpp/pr35322.c            |   1 +
 .../dfp/pragma-float-const-decimal64-4.c      |   1 +
 .../dfp/pragma-float-const-decimal64-5.c      |   2 +-
 .../dfp/pragma-float-const-decimal64-6.c      |   2 +-
 gcc/testsuite/gcc.dg/gomp/macro-4.c           |   2 +-
 gcc/testsuite/gcc.dg/pragma-message.c         |   3 +-
 gcc/testsuite/lib/prune.exp                   |   1 +
 gcc/tree-diagnostic.cc                        |  18 +-
 libcpp/directives.cc                          | 279 ++++++++++++------
 libcpp/errors.cc                              |  16 +-
 libcpp/include/line-map.h                     |   1 +
 libcpp/internal.h                             |  32 +-
 libcpp/macro.cc                               | 126 +++++++-
 .../libgomp.oacc-c-c++-common/reduction-5.c   |   3 +-
 .../libgomp.oacc-c-c++-common/vred2d-128.c    |  40 ++-
 27 files changed, 492 insertions(+), 142 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pragma-diag-17.c
 create mode 100644 gcc/testsuite/c-c++-common/pragma-diag-18.c
 create mode 100644 gcc/testsuite/g++.dg/cpp/pragma-raw-string.C
 create mode 100644 gcc/testsuite/g++.dg/pch/LC_GEN-maps.C
 create mode 100644 gcc/testsuite/g++.dg/pch/LC_GEN-maps.Hs

diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index a99d9e9c5ca..75388ce5737 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -280,7 +280,7 @@ token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
 	  const char *space;
 	  const char *name;
 
-	  line_marker_emitted = maybe_print_line (token->src_loc);
+	  line_marker_emitted = maybe_print_line (loc);
 	  fputs ("#pragma ", print.outf);
 	  c_pp_lookup_pragma (token->val.pragma, &space, &name);
 	  if (space)
diff --git a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c
index 9867c94a8dd..801c93935b8 100644
--- a/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c
+++ b/gcc/testsuite/c-c++-common/cpp/diagnostic-pragma-1.c
@@ -1,4 +1,5 @@
 // { dg-do compile }
+// { dg-additional-options "-ftrack-macro-expansion=0" }
 
 #pragma GCC warning "warn-a" // { dg-warning warn-a }
 #pragma GCC error "err-b" // { dg-error err-b }
diff --git a/gcc/testsuite/c-c++-common/cpp/pr57580.c b/gcc/testsuite/c-c++-common/cpp/pr57580.c
index e77462b20de..b0e54d876d6 100644
--- a/gcc/testsuite/c-c++-common/cpp/pr57580.c
+++ b/gcc/testsuite/c-c++-common/cpp/pr57580.c
@@ -1,6 +1,6 @@
 /* PR preprocessor/57580 */
 /* { dg-do compile } */
-/* { dg-options "-save-temps" } */
+/* { dg-options "-save-temps -ftrack-macro-expansion=0" } */
 
 #define MSG 	\
   _Pragma("message(\"message0\")")	\
diff --git a/gcc/testsuite/c-c++-common/gomp/pragma-3.c b/gcc/testsuite/c-c++-common/gomp/pragma-3.c
index 3e1b2111c3d..e0cffb8aeea 100644
--- a/gcc/testsuite/c-c++-common/gomp/pragma-3.c
+++ b/gcc/testsuite/c-c++-common/gomp/pragma-3.c
@@ -8,7 +8,8 @@ void
 f (void)
 {
   const char *str = outer(inner(1,2)); /* { dg-line str_location } */
-  /* { dg-warning "35:'pragma omp error' encountered: Test" "" { target *-*-* } inner_location }
+  /* { dg-warning "1:'pragma omp error' encountered: Test" "" { target *-*-* } 1 }
+     { dg-note "35: in <_Pragma directive>" "" { target *-*-* } inner_location }
      { dg-note "20:in expansion of macro 'inner'" "" { target *-*-* } outer_location }
      { dg-note "21:in expansion of macro 'outer'" "" { target *-*-* } str_location } */
 }
diff --git a/gcc/testsuite/c-c++-common/gomp/pragma-5.c b/gcc/testsuite/c-c++-common/gomp/pragma-5.c
index 173c25e803a..787a334882d 100644
--- a/gcc/testsuite/c-c++-common/gomp/pragma-5.c
+++ b/gcc/testsuite/c-c++-common/gomp/pragma-5.c
@@ -8,7 +8,8 @@ void
 f (void)
 {
   const char *str = outer(inner(1,2)); /* { dg-line str_location } */
-  /* { dg-warning "35:'pragma omp error' encountered: Test" "" { target *-*-* } inner_location }
+  /* { dg-warning "4:'pragma omp error' encountered: Test" "" { target *-*-* } 1 }
+     { dg-note "35:in <_Pragma directive>" "" { target *-*-*} inner_location }
      { dg-note "20:in expansion of macro 'inner'" "" { target *-*-* } outer_location }
      { dg-note "21:in expansion of macro 'outer'" "" { target *-*-* } str_location } */
 }
diff --git a/gcc/testsuite/c-c++-common/pragma-diag-17.c b/gcc/testsuite/c-c++-common/pragma-diag-17.c
new file mode 100644
index 00000000000..b9539c9598b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pragma-diag-17.c
@@ -0,0 +1,35 @@
+/* Test virtual location aspects of _Pragmas, when an error is reported after
+   lexing the tokens from the _Pragma string.  */
+/* { dg-additional-options "-Wpragmas -Wunknown-pragmas" } */
+
+_Pragma("GCC diagnostic ignored \"oops1\"") /* { dg-note {1:in <_Pragma directive>} } */
+/* { dg-warning {24:'oops1' is not an option} "" { target *-*-* } 1 } */
+
+#define S2 "GCC diagnostic ignored \"oops2\""
+_Pragma(S2) /* { dg-note {1:in <_Pragma directive>} } */
+/* { dg-warning {24:'oops2' is not an option} "" { target *-*-* } 1 } */
+
+#define PP(x) _Pragma(x) /* { dg-note {15:in <_Pragma directive>} } */
+PP("GCC diagnostic ignored \"oops3\"") /* { dg-note {1:in expansion of macro 'PP'} } */
+/* { dg-warning {24:'oops3' is not an option} "" { target *-*-* } 1 } */
+
+#define X4 _Pragma("GCC diagnostic ignored \"oops4\"") /* { dg-note {12:in <_Pragma directive>} } */
+#define Y4 X4 /* { dg-note {12:in expansion of macro 'X4'} } */
+Y4 /* { dg-note {1:in expansion of macro 'Y4'} } */
+/* { dg-warning {24:'oops4' is not an option} "" { target *-*-* } 1 } */
+
+#define P5 _Pragma /* { dg-note {12:in <_Pragma directive>} } */
+#define S5 "GCC diagnostic ignored \"oops5\""
+#define Y5 P5(S5) /* { dg-note {12:in expansion of macro 'P5'} } */
+Y5 /* { dg-note {1:in expansion of macro 'Y5'} } */
+/* { dg-warning {24:'oops5' is not an option} "" { target *-*-* } 1 } */
+
+#define P6 _Pragma /* { dg-note {12:in <_Pragma directive>} } */
+#define X6 P6("GCC diagnostic ignored \"oops6\"") /* { dg-note {12:in expansion of macro 'P6'} } */
+X6 /* { dg-note {1:in expansion of macro 'X6'} } */
+/* { dg-warning {24:'oops6' is not an option} "" { target *-*-* } 1 } */
+
+_Pragma(__DATE__) /* { dg-warning {-:[-Wunknown-pragmas]} } */
+
+_Pragma("once") /* { dg-note {1:in <_Pragma directive>} } */
+/* { dg-warning {#pragma once in main file} "" { target *-*-*} 1 } */
diff --git a/gcc/testsuite/c-c++-common/pragma-diag-18.c b/gcc/testsuite/c-c++-common/pragma-diag-18.c
new file mode 100644
index 00000000000..5de0fbcb8f1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pragma-diag-18.c
@@ -0,0 +1,18 @@
+/* Test virtual location aspects of _Pragmas, when an error is reported during
+   lexing of the _Pragma string itself or of the tokens within it.  */
+/* { dg-additional-options "-Wpragmas" } */
+
+#define X1 "\""
+_Pragma(X1) /* { dg-note {1:in <_Pragma directive>} } */
+/* { dg-warning {1:missing terminating " character} "" { target *-*-* } 1 } */
+
+#define X2a _Pragma("GCC warning \"hello\"") ( /* { dg-note {13:in <_Pragma directive>} } */
+#define X2b "GCC warning \"goodbye\"" )
+_Pragma X2a X2b /* { dg-note {9:in expansion of macro 'X2a'} } */
+/* { dg-note {1:in <_Pragma directive>} "" { target *-*-* } .-1 } */
+/* { dg-warning {13:hello} "" { target *-*-* } 1 } */
+/* { dg-warning {13:goodbye} "" { target *-*-* } 1 } */
+
+_Pragma() /* { dg-error {9:_Pragma takes a parenthesized string literal} } */
+/* { dg-note {1:in <_Pragma directive>} "" { target *-*-* } .-1 } */
+/* { dg-error {at end of input|'_Pragma' does not name a type} "" { target *-*-* } .-2 } */
diff --git a/gcc/testsuite/g++.dg/cpp/pragma-raw-string.C b/gcc/testsuite/g++.dg/cpp/pragma-raw-string.C
new file mode 100644
index 00000000000..5a495aadeec
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp/pragma-raw-string.C
@@ -0,0 +1,16 @@
+/* Test that _Pragma with a raw string works correctly.  */
+/* { dg-do compile { target c++11 } } */
+/* { dg-additional-options "-Wunused-variable -Wpragmas" } */
+
+_Pragma(R"delim(GCC diagnostic push)delim")
+_Pragma(R"(GCC diagnostic ignored "-Wunused-variable")")
+void f1 () { int i; }
+_Pragma(R"(GCC diagnostic pop)")
+void f2 () { int i; } /* { dg-warning {18:-Wunused-variable} } */
+
+/* Make sure lines stay in sync if there is an embedded newline too.  */
+_Pragma(R"xyz(GCC diagnostic ignored R"(two
+line option?)")xyz")
+/* { dg-note {1:in <_Pragma directive>} "" { target *-*-* } .-2 } */
+/* { dg-warning {24:unknown option} "" { target *-*-* } 1 } */
+void f3 () { int i; } /* { dg-warning {18:-Wunused-variable} } */
diff --git a/gcc/testsuite/g++.dg/pch/LC_GEN-maps.C b/gcc/testsuite/g++.dg/pch/LC_GEN-maps.C
new file mode 100644
index 00000000000..7b7e7880986
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/LC_GEN-maps.C
@@ -0,0 +1,20 @@
+#include "LC_GEN-maps.H"
+
+/* The LC_GEN map was written to the PCH, but there is not currently a way to
+   observe that fact in normal user code.  Let's try to test it anyway, using
+   -fdump-internal-locations to inspect the line_maps object we received from
+   the PCH.  */
+
+/* { dg-additional-options -fdump-internal-locations } */
+/* { dg-allow-blank-lines-in-output "" } */
+
+/* These regexps themselves will also appear in the output of
+   -fdump-internal-locations, so we need to make sure they contain at least
+   some regexp special characters, even if not strictly necessary, so they
+   match the intended text only, and not themselves.  Also, we make the second
+   one intentionally match the whole output if it matches anything.  We could
+   use dg-excess-errors instead, but that outputs XFAILS which are not really
+   helpful for this test.  */
+
+/* { dg-regexp {reason: . \(LC_GEN\)} } */
+/* { dg-regexp {(.|[\n\r])*file: this string should end up in the "PCH"(.|[\n\r])*} } */
diff --git a/gcc/testsuite/g++.dg/pch/LC_GEN-maps.Hs b/gcc/testsuite/g++.dg/pch/LC_GEN-maps.Hs
new file mode 100644
index 00000000000..76eefa7d1ae
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/LC_GEN-maps.Hs
@@ -0,0 +1,5 @@
+/* Evaluating the _Pragma directive here creates an LC_GEN map in the
+   line_maps object that will be stored in the PCH.  The test will make sure
+   that the buffer holding the de-stringified _Pragma string contents makes
+   its way there.  */
+_Pragma("this string should end up in the \"PCH\"")
diff --git a/gcc/testsuite/g++.dg/pch/operator-1.C b/gcc/testsuite/g++.dg/pch/operator-1.C
index 290b5f7ab21..bf1c8b07bdb 100644
--- a/gcc/testsuite/g++.dg/pch/operator-1.C
+++ b/gcc/testsuite/g++.dg/pch/operator-1.C
@@ -1,2 +1,3 @@
+/* { dg-additional-options "-ftrack-macro-expansion=0" } */
 #include "operator-1.H"
 int main(void){ major(0);} /* { dg-warning "Did not Work" } */
diff --git a/gcc/testsuite/gcc.dg/cpp/pr28165.c b/gcc/testsuite/gcc.dg/cpp/pr28165.c
index 71c7c1dba46..3e5e49ffa01 100644
--- a/gcc/testsuite/gcc.dg/cpp/pr28165.c
+++ b/gcc/testsuite/gcc.dg/cpp/pr28165.c
@@ -2,5 +2,6 @@
 /* PR preprocessor/28165 */
 
 /* { dg-do preprocess } */
+/* { dg-additional-options "-ftrack-macro-expansion=0" } */
 #pragma GCC system_header   /* { dg-warning "system_header" "ignored" } */
 _Pragma ("GCC system_header")   /* { dg-warning "system_header" "ignored" } */
diff --git a/gcc/testsuite/gcc.dg/cpp/pr35322.c b/gcc/testsuite/gcc.dg/cpp/pr35322.c
index 1af9605eac6..5bd5f69b73d 100644
--- a/gcc/testsuite/gcc.dg/cpp/pr35322.c
+++ b/gcc/testsuite/gcc.dg/cpp/pr35322.c
@@ -1,4 +1,5 @@
 /* Test case for PR 35322 -- _Pragma ICE.  */
 
 /* { dg-do preprocess } */
+/* { dg-additional-options "-ftrack-macro-expansion=0" } */
 _Pragma("GCC dependency") /* { dg-error "#pragma dependency expects" } */
diff --git a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-4.c b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-4.c
index af0398daf79..42fc28a4384 100644
--- a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-4.c
+++ b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-4.c
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-additional-options -ftrack-macro-expansion=0 } */
 
 /* N1312 7.1.1: The FLOAT_CONST_DECIMAL64 pragma.
    C99 6.4.4.2a (New).
diff --git a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-5.c b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-5.c
index 75e9525dda0..3aefede7b5d 100644
--- a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-5.c
+++ b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-std=c99 -pedantic" } */
+/* { dg-options "-std=c99 -pedantic -ftrack-macro-expansion=0" } */
 
 /* N1312 7.1.1: The FLOAT_CONST_DECIMAL64 pragma.
    C99 6.4.4.2a (New).
diff --git a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-6.c b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-6.c
index 03c1715bee6..6d70ce2bb8d 100644
--- a/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-6.c
+++ b/gcc/testsuite/gcc.dg/dfp/pragma-float-const-decimal64-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-std=c99 -pedantic-errors" } */
+/* { dg-options "-std=c99 -pedantic-errors -ftrack-macro-expansion=0" } */
 
 /* N1312 7.1.1: The FLOAT_CONST_DECIMAL64 pragma.
    C99 6.4.4.2a (New).
diff --git a/gcc/testsuite/gcc.dg/gomp/macro-4.c b/gcc/testsuite/gcc.dg/gomp/macro-4.c
index a4ed9a3980a..c6817d40125 100644
--- a/gcc/testsuite/gcc.dg/gomp/macro-4.c
+++ b/gcc/testsuite/gcc.dg/gomp/macro-4.c
@@ -1,6 +1,6 @@
 /* PR preprocessor/27746 */
 /* { dg-do compile } */
-/* { dg-options "-fopenmp -Wunknown-pragmas" } */
+/* { dg-options "-fopenmp -Wunknown-pragmas -ftrack-macro-expansion=0" } */
 
 #define p		_Pragma ("omp parallel")
 #define omp_p		_Pragma ("omp p")
diff --git a/gcc/testsuite/gcc.dg/pragma-message.c b/gcc/testsuite/gcc.dg/pragma-message.c
index 1b7cf09de0a..72fb0da6f44 100644
--- a/gcc/testsuite/gcc.dg/pragma-message.c
+++ b/gcc/testsuite/gcc.dg/pragma-message.c
@@ -45,8 +45,9 @@
 #define DO_PRAGMA(x) _Pragma (#x) /* { dg-line pragma_loc1 } */
 #define TODO(x) DO_PRAGMA(message ("TODO - " #x)) /* { dg-line pragma_loc2 } */
 TODO(Okay 4) /* { dg-message "in expansion of macro 'TODO'" } */
-/* { dg-message "TODO - Okay 4" "test4.1" { target *-*-* } pragma_loc1 } */
+/* { dg-message "1:TODO - Okay 4" "test4.1" { target *-*-* } 1 } */
 /* { dg-message "in expansion of macro 'DO_PRAGMA'" "test4.2" { target *-*-* } pragma_loc2 } */
+/* { dg-note {in <_Pragma directive>} "test4.3" { target *-*-* } pragma_loc1 } */
 
 #if 0
 #pragma message ("Not printed")
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 04c6a1dd7a1..a5788842652 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -54,6 +54,7 @@ proc prune_gcc_output { text } {
 
     # Diagnostic inclusion stack
     regsub -all "(^|\n)(In file)?\[ \]+included from \[^\n\]*" $text "" text
+    regsub -all "(^|\n)In buffer generated from \[^\n\]*" $text "" text
     regsub -all "(^|\n)\[ \]+from \[^\n\]*" $text "" text
     regsub -all "(^|\n)(In|of) module( \[^\n \]*,)? imported at \[^\n\]*" $text "" text
 
diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
index 5cf3a1c17d2..beedf576471 100644
--- a/gcc/tree-diagnostic.cc
+++ b/gcc/tree-diagnostic.cc
@@ -203,9 +203,12 @@ maybe_unwind_expanded_macro_loc (diagnostic_context *context,
 	const int resolved_def_loc_line = SOURCE_LINE (m, l0);
         if (ix == 0 && saved_location_line != resolved_def_loc_line)
           {
-            diagnostic_append_note (context, resolved_def_loc, 
-                                    "in definition of macro %qs",
-                                    linemap_map_get_macro_name (iter->map));
+	    const char *name = linemap_map_get_macro_name (iter->map);
+	    if (*name == '<')
+	      diagnostic_append_note (context, resolved_def_loc, "in %s", name);
+	    else
+	      diagnostic_append_note (context, resolved_def_loc,
+				      "in definition of macro %qs", name);
             /* At this step, as we've printed the context of the macro
                definition, we don't want to print the context of its
                expansion, otherwise, it'd be redundant.  */
@@ -220,9 +223,12 @@ maybe_unwind_expanded_macro_loc (diagnostic_context *context,
                                     MACRO_MAP_EXPANSION_POINT_LOCATION (iter->map),
                                     LRK_MACRO_DEFINITION_LOCATION, NULL);
 
-        diagnostic_append_note (context, resolved_exp_loc, 
-                                "in expansion of macro %qs",
-                                linemap_map_get_macro_name (iter->map));
+	const char *name = linemap_map_get_macro_name (iter->map);
+	if (*name == '<')
+	  diagnostic_append_note (context, resolved_exp_loc, "in %s", name);
+	else
+	  diagnostic_append_note (context, resolved_exp_loc,
+				  "in expansion of macro %qs", name);
       }
 }
 
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 9dc4363c65a..2b733dbc2f7 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -127,10 +127,10 @@ static void do_pragma_warning_or_error (cpp_reader *, bool error);
 static void do_pragma_warning (cpp_reader *);
 static void do_pragma_error (cpp_reader *);
 static void do_linemarker (cpp_reader *);
-static const cpp_token *get_token_no_padding (cpp_reader *);
-static const cpp_token *get__Pragma_string (cpp_reader *);
-static void destringize_and_run (cpp_reader *, const cpp_string *,
-				 location_t);
+static const cpp_token *get_token_no_padding (cpp_reader *,
+					      location_t * = nullptr);
+static const cpp_token *get__Pragma_string (cpp_reader *,
+					    location_t * = nullptr);
 static bool parse_answer (cpp_reader *, int, location_t, cpp_macro **);
 static cpp_hashnode *parse_assertion (cpp_reader *, int, cpp_macro **);
 static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
@@ -1501,14 +1501,12 @@ do_pragma (cpp_reader *pfile)
 {
   const struct pragma_entry *p = NULL;
   const cpp_token *token, *pragma_token;
-  location_t pragma_token_virt_loc = 0;
   cpp_token ns_token;
   unsigned int count = 1;
 
   pfile->state.prevent_expansion++;
 
-  pragma_token = token = cpp_get_token_with_location (pfile,
-						      &pragma_token_virt_loc);
+  pragma_token = token = cpp_get_token (pfile);
   ns_token = *token;
   if (token->type == CPP_NAME)
     {
@@ -1534,7 +1532,7 @@ do_pragma (cpp_reader *pfile)
     {
       if (p->is_deferred)
 	{
-	  pfile->directive_result.src_loc = pragma_token_virt_loc;
+	  pfile->directive_result.src_loc = pragma_token->src_loc;
 	  pfile->directive_result.type = CPP_PRAGMA;
 	  pfile->directive_result.flags = pragma_token->flags;
 	  pfile->directive_result.val.pragma = p->u.ident;
@@ -1827,11 +1825,11 @@ do_pragma_error (cpp_reader *pfile)
 
 /* Get a token but skip padding.  */
 static const cpp_token *
-get_token_no_padding (cpp_reader *pfile)
+get_token_no_padding (cpp_reader *pfile, location_t *virt_loc)
 {
   for (;;)
     {
-      const cpp_token *result = cpp_get_token (pfile);
+      const cpp_token *result = cpp_get_token_with_location (pfile, virt_loc);
       if (result->type != CPP_PADDING)
 	return result;
     }
@@ -1840,7 +1838,7 @@ get_token_no_padding (cpp_reader *pfile)
 /* Check syntax is "(string-literal)".  Returns the string on success,
    or NULL on failure.  */
 static const cpp_token *
-get__Pragma_string (cpp_reader *pfile)
+get__Pragma_string (cpp_reader *pfile, location_t *string_virt_loc)
 {
   const cpp_token *string;
   const cpp_token *paren;
@@ -1851,7 +1849,7 @@ get__Pragma_string (cpp_reader *pfile)
   if (paren->type != CPP_OPEN_PAREN)
     return NULL;
 
-  string = get_token_no_padding (pfile);
+  string = get_token_no_padding (pfile, string_virt_loc);
   if (string->type == CPP_EOF)
     _cpp_backup_tokens (pfile, 1);
   if (string->type != CPP_STRING && string->type != CPP_WSTRING
@@ -1871,55 +1869,105 @@ get__Pragma_string (cpp_reader *pfile)
 /* Destringize IN into a temporary buffer, by removing the first \ of
    \" and \\ sequences, and process the result as a #pragma directive.  */
 static void
-destringize_and_run (cpp_reader *pfile, const cpp_string *in,
-		     location_t expansion_loc)
-{
-  const unsigned char *src, *limit;
-  char *dest, *result;
-  cpp_context *saved_context;
-  cpp_token *saved_cur_token;
-  tokenrun *saved_cur_run;
-  cpp_token *toks;
-  int count;
-  const struct directive *save_directive;
-
-  dest = result = (char *) alloca (in->len - 1);
-  src = in->text + 1 + (in->text[0] == 'L');
-  limit = in->text + in->len - 1;
-  while (src < limit)
+destringize_and_run (cpp_reader *pfile, _cpp__Pragma_state *pstate)
+{
+  uchar *dest, *result;
+
+  /* Determine where the data starts, and what kind of string it is.  */
+  const cpp_string *const in = &pstate->string_tok->val.str;
+  const uchar *src = in->text;
+  bool is_raw_string = false;
+  for (;;)
     {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-	src++;
-      *dest++ = *src++;
+      switch (*src++)
+	{
+	case '\"': break;
+	case 'R': is_raw_string = true; continue;
+	case '\0': gcc_assert (false);
+	default: continue;
+	}
+      break;
     }
-  *dest = '\n';
 
-  /* Ugh; an awful kludge.  We are really not set up to be lexing
-     tokens when in the middle of a macro expansion.  Use a new
-     context to force cpp_get_token to lex, and so skip_rest_of_line
-     doesn't go beyond the end of the text.  Also, remember the
-     current lexing position so we can return to it later.
+  /* If we were given a raw string literal, we don't need to destringize it,
+     but we do need to strip off the prefix and the suffix.  */
+  if (is_raw_string)
+    {
+      cpp_string buf;
+      const bool ok
+	= cpp_interpret_string_notranslate (pfile, in, 1, &buf, CPP_STRING);
+      gcc_assert (ok);
 
-     Something like line-at-a-time lexing should remove the need for
-     this.  */
-  saved_context = pfile->context;
-  saved_cur_token = pfile->cur_token;
-  saved_cur_run = pfile->cur_run;
+      /* BUF.TEXT ends with a terminating null (which is counted in BUF.LEN).
+	 We want to end with a newline as required by cpp_push_buffer.  While it
+	 is not strictly necessary to null terminate our buffer, it is useful to
+	 do so for safety, so we reserve one extra byte.  The \n\0 sequence is
+	 appended after the else block.  */
+      result = _cpp_unaligned_alloc (pfile, buf.len + 1);
+      memcpy (result, buf.text, buf.len - 1);
+      dest = result + (buf.len - 1);
+      XDELETEVEC (buf.text);
+    }
+  else
+    {
+      const auto last_ptr = in->text + in->len - 1;
+      /* +2 for the trailing \n\0 as above.  */
+      dest = result = _cpp_unaligned_alloc (pfile, last_ptr - src + 1 + 2);
+      while (src < last_ptr)
+	{
+	  /* We know there is a character following the backslash.  */
+	  if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
+	    src++;
+	  *dest++ = *src++;
+	}
+    }
+  *dest++ = '\n';
+  *dest++ = '\0';
 
-  pfile->context = XCNEW (cpp_context);
+  /* We will now ask PFILE to interrupt what it was doing (obtaining tokens
+     either from the main context via lexing, or from a macro context), and get
+     tokens from the string argument instead.  We create a new isolated
+     cpp_context so that cpp_get_token will think it is working on the main
+     buffer and call cpp_lex_token accordingly.  Save all the relevant state so
+     we can return to the previous task once that is completed.
 
-  /* Inline run_directive, since we need to delay the _cpp_pop_buffer
-     until we've read all of the tokens that we want.  */
-  cpp_push_buffer (pfile, (const uchar *) result, dest - result,
-		   /* from_stage3 */ true);
-  /* ??? Antique Disgusting Hack.  What does this do?  */
-  if (pfile->buffer->prev)
-    pfile->buffer->file = pfile->buffer->prev->file;
+     Doing things this way is a bit of a kludge, but the alternative would be
+     to create a new context type to support lexing from a string, and that
+     would add overhead to every token parse, while _Pragma is relatively rarely
+     needed.  */
 
+  const auto saved_context = pfile->context;
+  const auto saved_cur_token = pfile->cur_token;
+  const auto saved_cur_run = pfile->cur_run;
+  pfile->context = XCNEW (cpp_context);
   start_directive (pfile);
+
+  /* Set up an LC_GEN line map to get valid locations for the tokens we are
+     about to lex.  We need to do this after calling start_directive, because
+     historically pfile->directive_line is what's been passed to
+     pfile->cb.def_pragma, and we are not proposing to change that now.  To
+     decide if we are in a system header or not, look at the location of the
+     _Pragma token.  So for instance if we have _Pragma(S) in the main file,
+     where S is a macro defined in a system header, we will decide we are not in
+     a system location.  */
+  const unsigned int buf_len = dest - result;
+  const int sysp = linemap_location_in_system_header_p (pfile->line_table,
+							pstate->pragma_loc);
+  linemap_add (pfile->line_table, LC_GEN, sysp, (const char *)result, 1,
+	       buf_len);
+  const auto col_hint = (uchar *) memchr (result, '\n', buf_len) - result;
+  linemap_line_start (pfile->line_table, 1, col_hint);
+
+  /* Push the buffer.  */
+  cpp_push_buffer (pfile, result, buf_len - 2, true);
+
+  /* This is needed to make _Pragma("once") work correctly, as it needs
+     pfile->buffer->file to be set to the current source file.  */
+  pfile->buffer->file = pfile->buffer->prev->file;
+
+  /* We are ready to start handling the directive as normal.  */
   _cpp_clean_line (pfile);
-  save_directive = pfile->directive;
+  const auto save_directive = pfile->directive;
   pfile->directive = &dtable[T_PRAGMA];
   do_pragma (pfile);
   if (pfile->directive_result.type == CPP_PRAGMA)
@@ -1928,80 +1976,123 @@ destringize_and_run (cpp_reader *pfile, const cpp_string *in,
   pfile->directive = save_directive;
 
   /* We always insert at least one token, the directive result.  It'll
-     either be a CPP_PADDING or a CPP_PRAGMA.  In the later case, we 
+     either be a CPP_PADDING or a CPP_PRAGMA.  In the latter case, we
      need to insert *all* of the tokens, including the CPP_PRAGMA_EOL.  */
 
   /* If we're not handling the pragma internally, read all of the tokens from
-     the string buffer now, while the string buffer is still installed.  */
-  /* ??? Note that the token buffer allocated here is leaked.  It's not clear
-     to me what the true lifespan of the tokens are.  It would appear that
-     the lifespan is the entire parse of the main input stream, in which case
-     this may not be wrong.  */
-  if (pfile->directive_result.type == CPP_PRAGMA)
-    {
-      int maxcount;
-
-      count = 1;
-      maxcount = 50;
-      toks = XNEWVEC (cpp_token, maxcount);
-      toks[0] = pfile->directive_result;
-      toks[0].src_loc = expansion_loc;
-
-      do
+     the string buffer now, while the string buffer is still installed, and then
+     push them as a new token context after.  This way, we can clean up the
+     temporarily modified state of the lexer now.  */
+
+  const bool is_deferred = (pfile->directive_result.type == CPP_PRAGMA);
+  if (is_deferred)
+    {
+      /* Using _cpp_buff allows us to arrange for this buffer to be freed when
+	 the new token context is popped, without adding any additional space
+	 overhead to the cpp_context structure.  In order to support
+	 track_macro_expansion==0, we need to store the cpp_token objects
+	 contiguously, and the virt locs separately.  (Note that these tokens
+	 may acquire a virtual loc here, in case the pragma allows macro
+	 expansion.  But they will not yet have virtual locs representing them
+	 as part of the expansion of the _Pragma directive; this will be handled
+	 later in _cpp_push__Pragma_token_context.)  */
+      const size_t init_count = 50;
+      _cpp_buff *tok_buff
+	= _cpp_get_buff (pfile, init_count * sizeof (cpp_token));
+      _cpp_buff *loc_buff
+	= _cpp_get_buff (pfile, init_count * sizeof (location_t));
+
+      /* Remember the base buffs so we can chain the final loc buff after it
+	 once we are done collecting tokens.  */
+      const auto tok_buff0 = tok_buff;
+      pstate->buff_chain = &loc_buff->next;
+
+      /* DIRECTIVE_RESULT is the first token we return (a CPP_PRAGMA).  This
+	 location cannot result from macro expansion, so there is no virtual
+	 location to worry about.  */
+      auto tok_out = (cpp_token *) tok_buff->base;
+      *tok_out++ = pfile->directive_result;
+      auto loc_out = (location_t *) loc_buff->base;
+      *loc_out++ = pfile->directive_result.src_loc;
+      unsigned int ntoks = 1;
+
+      /* Finally get all the tokens.  */
+      for (;;)
 	{
-	  if (count == maxcount)
+	  if (tok_buff->limit - (uchar *)tok_out < (int)sizeof (cpp_token))
 	    {
-	      maxcount = maxcount * 3 / 2;
-	      toks = XRESIZEVEC (cpp_token, toks, maxcount);
+	      _cpp_extend_buff (pfile, &tok_buff,
+				tok_buff->limit - tok_buff->base);
+	      tok_out = ((cpp_token *)tok_buff->base) + ntoks;
 	    }
-	  toks[count] = *cpp_get_token (pfile);
-	  /* _Pragma is a builtin, so we're not within a macro-map, and so
-	     the token locations are set to bogus ordinary locations
-	     near to, but after that of the "_Pragma".
-	     Paper over this by setting them equal to the location of the
-	     _Pragma itself (PR preprocessor/69126).  */
-	  toks[count].src_loc = expansion_loc;
+
+	  if (loc_buff->limit - (uchar *)loc_out < (int)sizeof (location_t))
+	    {
+	      _cpp_extend_buff (pfile, &loc_buff,
+				loc_buff->limit - loc_buff->base);
+	      loc_out = ((location_t *)loc_buff->base) + ntoks;
+	    }
+
+	  const auto this_tok = tok_out;
+	  *tok_out++ = *cpp_get_token_with_location (pfile, loc_out++);
+	  ++ntoks;
+
 	  /* Macros have been already expanded by cpp_get_token
 	     if the pragma allowed expansion.  */
-	  toks[count++].flags |= NO_EXPAND;
+	  this_tok->flags |= NO_EXPAND;
+	  if (this_tok->type == CPP_PRAGMA_EOL)
+	    break;
 	}
-      while (toks[count-1].type != CPP_PRAGMA_EOL);
+
+      /* Finalize the buffers so they can be stored as one chain in a
+	 cpp_context and freed when that context is popped.  */
+      tok_buff0->next = loc_buff;
+      pstate->ntoks = ntoks;
+      pstate->tok_buff = tok_buff;
+      pstate->loc_buff = loc_buff;
     }
   else
     {
-      count = 1;
-      toks = &pfile->avoid_paste;
-
       /* If we handled the entire pragma internally, make sure we get the
 	 line number correct for the next token.  */
       if (pfile->cb.line_change)
 	pfile->cb.line_change (pfile, pfile->cur_token, false);
     }
 
-  /* Finish inlining run_directive.  */
+  /* Reset the old state before...  */
+  const auto map = linemap_add (pfile->line_table, LC_LEAVE, 0, nullptr, 0);
+  linemap_line_start
+    (pfile->line_table,
+     ORDINARY_MAP_STARTING_LINE_NUMBER (linemap_check_ordinary (map)),
+     127);
   pfile->buffer->file = NULL;
   _cpp_pop_buffer (pfile);
-
-  /* Reset the old macro state before ...  */
   XDELETE (pfile->context);
   pfile->context = saved_context;
   pfile->cur_token = saved_cur_token;
   pfile->cur_run = saved_cur_run;
 
-  /* ... inserting the new tokens we collected.  */
-  _cpp_push_token_context (pfile, NULL, toks, count);
+  /* ...inserting the new tokens we collected.  This is not a simple call to
+     _cpp_push_token_context, because we need to create virtual locations
+     for the tokens and push an extended token context to return them.  */
+  if (is_deferred)
+    _cpp_push__Pragma_token_context (pfile, pstate);
+  else
+    _cpp_push_token_context (pfile, nullptr, &pfile->avoid_paste, 1);
 }
 
+
 /* Handle the _Pragma operator.  Return 0 on error, 1 if ok.  */
+
 int
-_cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc)
+_cpp_do__Pragma (cpp_reader *pfile, _cpp__Pragma_state *pstate)
 {
-  const cpp_token *string = get__Pragma_string (pfile);
-  pfile->directive_result.type = CPP_PADDING;
+  pstate->string_tok = get__Pragma_string (pfile, &pstate->string_loc);
 
-  if (string)
+  pfile->directive_result.type = CPP_PADDING;
+  if (pstate->string_tok)
     {
-      destringize_and_run (pfile, &string->val.str, expansion_loc);
+      destringize_and_run (pfile, pstate);
       return 1;
     }
   cpp_error (pfile, CPP_DL_ERROR,
diff --git a/libcpp/errors.cc b/libcpp/errors.cc
index df5f8d6fa32..351c7bc17e8 100644
--- a/libcpp/errors.cc
+++ b/libcpp/errors.cc
@@ -60,13 +60,11 @@ cpp_diagnostic_at (cpp_reader * pfile, enum cpp_diagnostic_level level,
 		   enum cpp_warning_reason reason, rich_location *richloc,
 		   const char *msgid, va_list *ap)
 {
-  bool ret;
-
   if (!pfile->cb.diagnostic)
     abort ();
-  ret = pfile->cb.diagnostic (pfile, level, reason, richloc, _(msgid), ap);
-
-  return ret;
+  if (pfile->diagnostic_rebase_loc)
+    _cpp_rebase_diagnostic_location (pfile, richloc);
+  return pfile->cb.diagnostic (pfile, level, reason, richloc, _(msgid), ap);
 }
 
 /* Print a diagnostic at the location of the previously lexed token.  */
@@ -197,16 +195,14 @@ cpp_diagnostic_with_line (cpp_reader * pfile, enum cpp_diagnostic_level level,
 			  location_t src_loc, unsigned int column,
 			  const char *msgid, va_list *ap)
 {
-  bool ret;
-  
   if (!pfile->cb.diagnostic)
     abort ();
   rich_location richloc (pfile->line_table, src_loc);
   if (column)
     richloc.override_column (column);
-  ret = pfile->cb.diagnostic (pfile, level, reason, &richloc, _(msgid), ap);
-
-  return ret;
+  if (pfile->diagnostic_rebase_loc)
+    _cpp_rebase_diagnostic_location (pfile, &richloc);
+  return pfile->cb.diagnostic (pfile, level, reason, &richloc, _(msgid), ap);
 }
 
 /* Print a warning or error, depending on the value of LEVEL.  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index eb281809cbd..08ac243a6ce 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -1715,6 +1715,7 @@ class rich_location
   location_range *get_range (unsigned int idx);
 
   expanded_location get_expanded_location (unsigned int idx);
+  void forget_cached_expanded_location () { m_have_expanded_location = false; }
 
   void
   override_column (int column);
diff --git a/libcpp/internal.h b/libcpp/internal.h
index badfd1b40da..fd0de49e302 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -292,6 +292,28 @@ struct lexer_state
   unsigned char ignore__Pragma;
 };
 
+/* Because handling of _Pragma bounces back and forth between macro.cc and
+   directives.cc, it is useful to keep the needed state in one place.  */
+struct _cpp__Pragma_state
+{
+  const cpp_token *string_tok; /* The token for the argument string.  */
+
+  /* These locations are the virtual locations returned by
+     cpp_get_token_with_location, if the relevant tokens came from macro
+     expansions.  */
+  location_t pragma_loc; /* Location of the _Pragma token.  */
+  location_t string_loc; /* Location of the string arg.  */
+
+  /* The tokens lexed from the _Pragma string.  */
+  unsigned int ntoks;
+  _cpp_buff *tok_buff;
+  _cpp_buff *loc_buff;
+  _cpp_buff **buff_chain;
+};
+
+/* In macro.cc, implements pfile->diagnostic_rebase_loc handling.  */
+void _cpp_rebase_diagnostic_location (cpp_reader *, rich_location *);
+
 /* Special nodes - identifiers with predefined significance.  */
 struct spec_nodes
 {
@@ -601,6 +623,12 @@ struct cpp_reader
      zero of said file.  */
   location_t main_loc;
 
+  /* Location from which we would like to pretend a given token was
+     macro-expanded, if a diagnostic is issued.  Useful for improving
+     _Pragma diagnostics.  */
+  location_t diagnostic_rebase_loc;
+  cpp_hashnode *diagnostic_rebase_node;
+
   /* Returns true iff we should warn about UTF-8 bidirectional control
      characters.  */
   bool warn_bidi_p () const
@@ -701,6 +729,8 @@ extern const unsigned char *_cpp_builtin_macro_text (cpp_reader *,
 extern int _cpp_warn_if_unused_macro (cpp_reader *, cpp_hashnode *, void *);
 extern void _cpp_push_token_context (cpp_reader *, cpp_hashnode *,
 				     const cpp_token *, unsigned int);
+extern void _cpp_push__Pragma_token_context (cpp_reader *,
+					     _cpp__Pragma_state *);
 extern void _cpp_backup_tokens_direct (cpp_reader *, unsigned int);
 
 /* In identifiers.cc */
@@ -772,7 +802,7 @@ extern int _cpp_handle_directive (cpp_reader *, bool);
 extern void _cpp_define_builtin (cpp_reader *, const char *);
 extern char ** _cpp_save_pragma_names (cpp_reader *);
 extern void _cpp_restore_pragma_names (cpp_reader *, char **);
-extern int _cpp_do__Pragma (cpp_reader *, location_t);
+extern int _cpp_do__Pragma (cpp_reader *, _cpp__Pragma_state *);
 extern void _cpp_init_directives (cpp_reader *);
 extern void _cpp_init_internal_pragmas (cpp_reader *);
 extern void _cpp_do_file_change (cpp_reader *, enum lc_reason, const char *,
diff --git a/libcpp/macro.cc b/libcpp/macro.cc
index 8ebf360c03c..e1091badf25 100644
--- a/libcpp/macro.cc
+++ b/libcpp/macro.cc
@@ -93,6 +93,8 @@ struct macro_arg_saved_data {
 static const char *vaopt_paste_error =
   N_("'##' cannot appear at either end of __VA_OPT__");
 
+static const uchar pragma_str[] = N_("<_Pragma directive>");
+
 static void expand_arg (cpp_reader *, macro_arg *);
 
 /* A class for tracking __VA_OPT__ state while iterating over a
@@ -756,7 +758,31 @@ builtin_macro (cpp_reader *pfile, cpp_hashnode *node,
       if (pfile->state.in_directive || pfile->state.ignore__Pragma)
 	return 0;
 
-      return _cpp_do__Pragma (pfile, loc);
+      _cpp__Pragma_state pstate = {};
+      pstate.pragma_loc = loc;
+
+      /* The diagnostic_rebase stuff arranges that any diagnostics issued during
+	 lexing will point the user back to the _Pragma location.  */
+      const auto prev_rloc = pfile->diagnostic_rebase_loc;
+      const auto prev_rnode = pfile->diagnostic_rebase_node;
+      pfile->diagnostic_rebase_loc = loc;
+      pfile->diagnostic_rebase_node
+	= cpp_lookup (pfile, pragma_str, (sizeof pragma_str) - 1);
+
+      /* While lexing tokens, if we end up expanding some macros, we would
+	 like not to override top_most_macro_node; preserving it pointing
+	 to the _Pragma helps out the case of -ftrack-macro-expansion=0.
+	 Setting this flag causes in_macro_expansion_p to return TRUE,
+	 even though we are not technically in a macro context.  */
+      const bool prev_expand = pfile->about_to_expand_macro_p;
+      pfile->about_to_expand_macro_p = true;
+
+      /* Get the tokens, then reset everything back how it was.  */
+      const int res = _cpp_do__Pragma (pfile, &pstate);
+      pfile->about_to_expand_macro_p = prev_expand;
+      pfile->diagnostic_rebase_loc = prev_rloc;
+      pfile->diagnostic_rebase_node = prev_rnode;
+      return res;
     }
 
   buf = _cpp_builtin_macro_text (pfile, node, expand_loc);
@@ -2802,7 +2828,8 @@ _cpp_pop_context (cpp_reader *pfile)
 	  && macro_of_context (context->prev) != macro)
 	macro->flags &= ~NODE_DISABLED;
 
-      if (macro == pfile->top_most_macro_node && context->prev == NULL)
+      if (!pfile->about_to_expand_macro_p
+	  && context->prev == &pfile->base_context)
 	/* We are popping the context of the top-most macro node.  */
 	pfile->top_most_macro_node = NULL;
     }
@@ -2836,10 +2863,10 @@ reached_end_of_context (cpp_context *context)
 
 /* Consume the next token contained in the current context of PFILE,
    and return it in *TOKEN. It's "full location" is returned in
-   *LOCATION. If -ftrack-macro-location is in effeect, fFull location"
-   means the location encoding the locus of the token across macro
-   expansion; otherwise it's just is the "normal" location of the
-   token which (*TOKEN)->src_loc.  */
+   *LOCATION.  If -ftrack-macro-expansion is in effect, "full location"
+   means the virtual location encoding the locus of the token across macro
+   expansion; otherwise it's just the "normal" (spelling) location of the
+   token, which is (*TOKEN)->src_loc.  */
 static inline void
 consume_next_token_from_context (cpp_reader *pfile,
 				 const cpp_token ** token,
@@ -4129,3 +4156,90 @@ cpp_macro_definition (cpp_reader *pfile, cpp_hashnode *node,
   *buffer = '\0';
   return pfile->macro_buffer;
 }
+
+/* Handle the list of tokens lexed from a _Pragma string.  We need to create
+   virtual locations (reflecting the fact that these tokens are logically
+   within the expansion of the _Pragma string), and push an extended token
+   context.  */
+
+void
+_cpp_push__Pragma_token_context (cpp_reader *pfile,
+				 _cpp__Pragma_state *pstate)
+{
+  const auto node = cpp_lookup (pfile, pragma_str, (sizeof pragma_str) - 1);
+  const auto toks = (const cpp_token *) pstate->tok_buff->base;
+
+  /* If not tracking macro expansions, then just push a normal token context.
+     cpp_get_token () will return to the user the location of the _Pragma
+     directive, so they will have a valid location for the _Pragma which is
+     outside the LC_GEN map.  */
+  if (!CPP_OPTION (pfile, track_macro_expansion))
+    {
+      _cpp_push_token_context (pfile, node, toks, pstate->ntoks);
+      /* Arrange to free the buffers when the context is popped.  */
+      pfile->context->buff = pstate->tok_buff;
+      return;
+    }
+
+  location_t *virt_locs = nullptr;
+  _cpp_buff *const macro_tokens = tokens_buff_new (pfile, pstate->ntoks,
+						   &virt_locs);
+  const auto map = linemap_enter_macro (pfile->line_table, node,
+					pstate->pragma_loc, pstate->ntoks);
+  const auto locs = (location_t *)pstate->loc_buff->base;
+  for (unsigned int i = 0; i != pstate->ntoks; ++i)
+    {
+      tokens_buff_add_token (macro_tokens, virt_locs, toks + i,
+			     locs[i], locs[i], map, i);
+    }
+
+  /* Chain tok_buff ahead of macro_tokens so both are freed together
+     when the context is popped.  pstate->buff_chain is the NEXT pointer
+     of the last buffer in the LOC_BUFF chain, so it looks like:
+     TOK_BUFF_1 -> ... -> TOK_BUFF_N -> ... -> LOC_BUFF_1 -> ... ->
+     LOC_BUFF_N -> MACRO_TOKENS_1 -> ... -> MACRO_TOKENS_N.  */
+  *pstate->buff_chain = macro_tokens;
+  push_extended_tokens_context (pfile, node, pstate->tok_buff, virt_locs,
+				(const cpp_token **) macro_tokens->base,
+				pstate->ntoks);
+}
+
+void
+_cpp_rebase_diagnostic_location (cpp_reader *pfile, rich_location *richloc)
+{
+  /* If we are here, it means a diagnostic is being generated while lexing
+     tokens outside a macro context, but pfile->diagnostic_rebase_loc indicates
+     a location from which we would like to pretend we are actually expanding a
+     macro.  This works around the fact that a macro map can only be generated
+     once we know how many tokens it will contain, but the number of tokens to
+     be lexed from, say, a _Pragma string, is not known ahead of time.  In the
+     case of _Pragma, _cpp_push__Pragma_token_context above handles creating the
+     proper macro map once all the tokens are available.  This function runs
+     earlier than that, while in the middle of lexing tokens, so it creates a
+     temporary macro map which serves only to improve the information content of
+     the diagnostic that's about to be generated.  */
+
+  const int nlocs = richloc->get_num_locations ();
+
+  if (CPP_OPTION (pfile, track_macro_expansion))
+    {
+      const auto map
+	= linemap_enter_macro (pfile->line_table, pfile->diagnostic_rebase_node,
+			       pfile->diagnostic_rebase_loc, nlocs);
+      for (int i = 0; i != nlocs; ++i)
+	{
+	  location_range& r = *richloc->get_range (i);
+	  r.m_loc = linemap_add_macro_token (map, i, r.m_loc, r.m_loc);
+	}
+    }
+  else
+    {
+      /* When not tracking macro expansion, set the location to the
+	 expansion point for all tokens, which is what would be returned
+	 by cpp_get_token in the normal case.  */
+      for (int i = 0; i != nlocs; ++i)
+	richloc->get_range (i)->m_loc = pfile->invocation_location;
+    }
+
+  richloc->forget_cached_expanded_location ();
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c
index ddccfe89e73..f518915492d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-5.c
@@ -46,7 +46,8 @@ main (void)
   /* Nvptx targets require a vector_length or 32 in to allow spinlocks with
      gangs.  */
   check_reduction (num_workers (nw) vector_length (vl), worker); /* { dg-line check_reduction_loc } */
-  /* { dg-warning "22:region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } pragma_loc }
+  /* { dg-warning "1:region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } 1 }
+     { dg-note "22:in <_Pragma directive>" "" { target *-*-* xfail offloading_enabled} pragma_loc }
      { dg-note "1:in expansion of macro 'DO_PRAGMA'" "" { target *-*-* xfail offloading_enabled } DO_PRAGMA_loc }
      { dg-note "3:in expansion of macro 'check_reduction'" "" { target *-*-* xfail offloading_enabled } check_reduction_loc }
      TODO See PR101551 for 'offloading_enabled' XFAILs.  */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vred2d-128.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vred2d-128.c
index 84e6d51670b..bd2567d96f8 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vred2d-128.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vred2d-128.c
@@ -40,46 +40,54 @@ int a1[n], a2[n];
 
 gentest (test1, "acc parallel loop gang vector_length (128) firstprivate (t1, t2)",
 	 "acc loop vector reduction(+:t1) reduction(-:t2)")
-/* { dg-warning {'t1' is used uninitialized} {} { target *-*-* } outer }
+/* { dg-warning {'t1' is used uninitialized} {} { target *-*-* } 1 }
+   { dg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { dg-note {'t1' was declared here} {} { target *-*-* } vars }
-   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-4 }
+   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-5 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
-/* { dg-warning {'t2' is used uninitialized} {} { target *-*-* } outer }
+/* { dg-warning {'t2' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { dg-note {'t2' was declared here} {} { target *-*-* } vars }
-   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-8 }
+   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-10 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
 
 gentest (test2, "acc parallel loop gang vector_length (128) firstprivate (t1, t2)",
 	 "acc loop worker vector reduction(+:t1) reduction(-:t2)")
-/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t1' was declared here} {} { target *-*-* } vars }
-   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-4 }
+   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-5 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
-/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t2' was declared here} {} { target *-*-* } vars }
-   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-8 }
+   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-10 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
 
 gentest (test3, "acc parallel loop gang worker vector_length (128) firstprivate (t1, t2)",
 	 "acc loop vector reduction(+:t1) reduction(-:t2)")
-/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t1' was declared here} {} { target *-*-* } vars }
-   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-4 }
+   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-5 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
-/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t2' was declared here} {} { target *-*-* } vars }
-   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-8 }
+   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-10 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
 
 gentest (test4, "acc parallel loop firstprivate (t1, t2)",
 	 "acc loop reduction(+:t1) reduction(-:t2)")
-/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t1' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t1' was declared here} {} { target *-*-* } vars }
-   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-4 }
+   { dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-5 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
-/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } outer }
+/* { DUPdg-warning {'t2' is used uninitialized} {} { target *-*-* } 1 }
+   { DUPdg-note {in <_Pragma directive>} {} { target { ! offloading_enabled } } outer }
    { DUP_dg-note {'t2' was declared here} {} { target *-*-* } vars }
-   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-8 }
+   { DUP_dg-note {in expansion of macro 'gentest'} {} { target { ! offloading_enabled } } .-10 }
      TODO See PR101551 for 'offloading_enabled' differences.  */
 
 


* Re: [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations
  2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
@ 2022-11-04 15:53   ` David Malcolm
  0 siblings, 0 replies; 18+ messages in thread
From: David Malcolm @ 2022-11-04 15:53 UTC (permalink / raw)
  To: Lewis Hyatt, gcc-patches

On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> The result of linemap_resolve_location() can be an ad-hoc location, if that
> is what was stored in a relevant macro map. maybe_unwind_expanded_macro_loc()
> did not previously handle this case, causing it to print the wrong tracking
> information for an example such as the new testcase macro-trace-1.c. Fix that
> by checking for ad-hoc locations where needed.

Thanks; looks good to me.
Dave
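
For readers following along, here is a toy model of the kind of check the
patch adjusts. It is only a sketch and not GCC's line-map code: toy_location,
TOY_ADHOC_BIT, TOY_RESERVED_COUNT and toy_adhoc_table are all invented for
illustration. The point is that a wrapped ("ad-hoc") location has to be
unwrapped before it is compared against a small reserved range; the patch
applies the same unwrapped value when computing the source line.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-ins only; none of these are GCC's real definitions.
using toy_location = std::uint32_t;
constexpr toy_location TOY_ADHOC_BIT = 0x80000000u;  // marks a wrapped location
constexpr toy_location TOY_RESERVED_COUNT = 2;       // "special" locations 0 and 1

struct toy_adhoc_table
{
  std::vector<toy_location> plain;  // index -> underlying plain location

  // Wrap LOC so extra data could be attached to it; returns the wrapper.
  toy_location make_adhoc (toy_location loc)
  {
    plain.push_back (loc);
    return TOY_ADHOC_BIT | static_cast<toy_location> (plain.size () - 1);
  }

  bool is_adhoc (toy_location loc) const
  {
    return (loc & TOY_ADHOC_BIT) != 0;
  }

  // Return the plain location behind LOC (LOC itself if it is not wrapped).
  toy_location strip (toy_location loc) const
  {
    if (!is_adhoc (loc))
      return loc;
    const toy_location idx = loc & ~TOY_ADHOC_BIT;
    assert (idx < plain.size ());
    return plain[idx];
  }
};

// Naive check: looks at the raw value, so a wrapper around a reserved
// location is not recognized as reserved.
bool is_reserved_naive (toy_location loc)
{
  return loc < TOY_RESERVED_COUNT;
}

// Adjusted check: unwrap first, then compare.
bool is_reserved_fixed (const toy_adhoc_table &t, toy_location loc)
{
  return t.strip (loc) < TOY_RESERVED_COUNT;
}
```

In this model, wrapping location 1 with make_adhoc() yields a value that
is_reserved_naive() misses and is_reserved_fixed() catches.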

> 
> gcc/ChangeLog:
> 
>         * tree-diagnostic.cc (maybe_unwind_expanded_macro_loc):
> Handle ad-hoc
>         location in return value of linemap_resolve_location().
> 
> gcc/testsuite/ChangeLog:
> 
>         * c-c++-common/cpp/macro-trace-1.c: New test.
> ---
>  gcc/testsuite/c-c++-common/cpp/macro-trace-1.c | 4 ++++
>  gcc/tree-diagnostic.cc                         | 7 +++++--
>  2 files changed, 9 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
> 
> diff --git a/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
> b/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
> new file mode 100644
> index 00000000000..34cfbb3dad3
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/macro-trace-1.c
> @@ -0,0 +1,4 @@
> +/* This token is long enough to require an ad-hoc location. Make
> sure that
> +   the macro trace still prints properly.  */
> +#define X "0123456789012345678901234567689" /* { dg-error {expected
> .* before string constant} } */
> +X /* { dg-note {in expansion of macro 'X'} } */
> diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
> index 0d79fe3c3c1..5cf3a1c17d2 100644
> --- a/gcc/tree-diagnostic.cc
> +++ b/gcc/tree-diagnostic.cc
> @@ -190,14 +190,17 @@ maybe_unwind_expanded_macro_loc
> (diagnostic_context *context,
>          location_t l = 
>            linemap_resolve_location (line_table, resolved_def_loc,
>                                      LRK_SPELLING_LOCATION,  &m);
> -        if (l < RESERVED_LOCATION_COUNT || LINEMAP_SYSP (m))
> +       location_t l0 = l;
> +       if (IS_ADHOC_LOC (l0))
> +         l0 = get_location_from_adhoc_loc (line_table, l0);
> +       if (l0 < RESERVED_LOCATION_COUNT || LINEMAP_SYSP (m))
>            continue;
>          
>         /* We need to print the context of the macro definition only
>            when the locus of the first displayed diagnostic
> (displayed
>            before this trace) was inside the definition of the
>            macro.  */
> -        int resolved_def_loc_line = SOURCE_LINE (m, l);
> +       const int resolved_def_loc_line = SOURCE_LINE (m, l0);
>          if (ix == 0 && saved_location_line != resolved_def_loc_line)
>            {
>              diagnostic_append_note (context, resolved_def_loc, 
> 



* Re: [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string
  2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
@ 2022-11-04 15:55   ` David Malcolm
  0 siblings, 0 replies; 18+ messages in thread
From: David Malcolm @ 2022-11-04 15:55 UTC (permalink / raw)
  To: Lewis Hyatt, gcc-patches

On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> The string "<built-in>" is hard-coded in several places throughout the
> diagnostics code, and in some of those places, it is used incorrectly with
> respect to internationalization. (Comparing a translated string to an
> untranslated string.) The error is not currently observable in any output GCC
> actually produces, hence no testcase added here, but it's worth fixing, and
> also, I am shortly going to add a new such string and want to avoid
> hardcoding that one in similar places.

Thanks; looks good to me.

Dave
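
To illustrate the translated-versus-untranslated mismatch described in the
commit message, here is a small standalone sketch. It does not use gettext or
any GCC code: toy_translate, toy_catalog, the "<eingebaut>" translation and
the toy_ function names are all made up. It only shows why routing every
producer and consumer of the string through one accessor keeps comparisons
consistent.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <string>

// Hypothetical message catalog standing in for a real translation layer.
// The "translation" below is invented purely for this example.
static const std::map<std::string, std::string> toy_catalog = {
  {"<built-in>", "<eingebaut>"},
};

// Return the translated form of MSGID, or MSGID itself if there is none.
const char *toy_translate (const char *msgid)
{
  auto it = toy_catalog.find (msgid);
  return it == toy_catalog.end () ? msgid : it->second.c_str ();
}

// Single accessor, in the spirit of special_fname_builtin () in the patch.
const char *toy_special_fname_builtin ()
{
  return toy_translate ("<built-in>");
}

// Mismatched check: FILE was stored translated but is compared against
// the untranslated literal.
bool is_builtin_mismatched (const char *file)
{
  assert (file);
  return std::strcmp (file, "<built-in>") == 0;
}

// Consistent check: both sides come from the same accessor.
bool is_builtin_consistent (const char *file)
{
  assert (file);
  return std::strcmp (file, toy_special_fname_builtin ()) == 0;
}
```

With the toy catalog active, a name obtained from
toy_special_fname_builtin () fails the mismatched check and passes the
consistent one. With no translation installed both checks agree, which fits
the commit message's remark that the error is not currently observable.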

> 
> gcc/c-family/ChangeLog:
> 
>         * c-opts.cc (c_finish_options): Use special_fname_builtin ()
> rather
>         than a hard-coded string.
> 
> gcc/ChangeLog:
> 
>         * diagnostic.cc (diagnostic_get_location_text): Use
>         special_fname_builtin () rather than a hardcoded string
> (which was
>         also incorrectly left untranslated previously.)
>         * input.cc (special_fname_builtin): New function.
>         (expand_location_1): Use special_fname_builtin () rather than
> a
>         hard-coded string.
>         (test_builtins): Likewise.
>         * input.h (special_fname_builtin): Declare.
> 
> gcc/fortran/ChangeLog:
> 
>         * cpp.cc (gfc_cpp_init): Use special_fname_builtin () rather
> than a
>         hardcoded string (which was also incorrectly left
> untranslated
>         previously.)
>         * error.cc (gfc_diagnostic_build_locus_prefix): Likewise.
>         * f95-lang.cc (gfc_init): Likewise.
> ---
>  gcc/c-family/c-opts.cc  |  2 +-
>  gcc/diagnostic.cc       |  2 +-
>  gcc/fortran/cpp.cc      |  2 +-
>  gcc/fortran/error.cc    |  4 ++--
>  gcc/fortran/f95-lang.cc |  2 +-
>  gcc/input.cc            | 10 ++++++++--
>  gcc/input.h             |  3 +++
>  7 files changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index 32b929e3ece..521797fb7eb 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -1476,7 +1476,7 @@ c_finish_options (void)
>      {
>        const line_map_ordinary *bltin_map
>         = linemap_check_ordinary (linemap_add (line_table, LC_RENAME,
> 0,
> -                                              _("<built-in>"), 0));
> +                                              special_fname_builtin
> (), 0));
>        cb_file_change (parse_in, bltin_map);
>        linemap_line_start (line_table, 0, 1);
>  
> diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
> index 22f7b0b6d6e..7c7ee6da746 100644
> --- a/gcc/diagnostic.cc
> +++ b/gcc/diagnostic.cc
> @@ -470,7 +470,7 @@ diagnostic_get_location_text (diagnostic_context
> *context,
>    const char *file = s.file ? s.file : progname;
>    int line = 0;
>    int col = -1;
> -  if (strcmp (file, N_("<built-in>")))
> +  if (strcmp (file, special_fname_builtin ()))
>      {
>        line = s.line;
>        if (context->show_column)
> diff --git a/gcc/fortran/cpp.cc b/gcc/fortran/cpp.cc
> index 364bd0d2a85..0b5755edbb4 100644
> --- a/gcc/fortran/cpp.cc
> +++ b/gcc/fortran/cpp.cc
> @@ -605,7 +605,7 @@ gfc_cpp_init (void)
>    if (gfc_option.flag_preprocessed)
>      return;
>  
> -  cpp_change_file (cpp_in, LC_RENAME, _("<built-in>"));
> +  cpp_change_file (cpp_in, LC_RENAME, special_fname_builtin ());
>    if (!gfc_cpp_option.no_predefined)
>      {
>        /* Make sure all of the builtins about to be declared have
> diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
> index c9d6edbb923..214fb78ba7b 100644
> --- a/gcc/fortran/error.cc
> +++ b/gcc/fortran/error.cc
> @@ -1147,7 +1147,7 @@ gfc_diagnostic_build_locus_prefix
> (diagnostic_context *context,
>    const char *locus_ce = colorize_stop (pp_show_color (pp));
>    return (s.file == NULL
>           ? build_message_string ("%s%s:%s", locus_cs, progname,
> locus_ce )
> -         : !strcmp (s.file, N_("<built-in>"))
> +         : !strcmp (s.file, special_fname_builtin ())
>           ? build_message_string ("%s%s:%s", locus_cs, s.file,
> locus_ce)
>           : context->show_column
>           ? build_message_string ("%s%s:%d:%d:%s", locus_cs, s.file,
> s.line,
> @@ -1167,7 +1167,7 @@ gfc_diagnostic_build_locus_prefix
> (diagnostic_context *context,
>  
>    return (s.file == NULL
>           ? build_message_string ("%s%s:%s", locus_cs, progname,
> locus_ce )
> -         : !strcmp (s.file, N_("<built-in>"))
> +         : !strcmp (s.file, special_fname_builtin ())
>           ? build_message_string ("%s%s:%s", locus_cs, s.file,
> locus_ce)
>           : context->show_column
>           ? build_message_string ("%s%s:%d:%d-%d:%s", locus_cs,
> s.file, s.line,
> diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
> index a6750bea787..0d83f3f8b69 100644
> --- a/gcc/fortran/f95-lang.cc
> +++ b/gcc/fortran/f95-lang.cc
> @@ -259,7 +259,7 @@ gfc_init (void)
>    if (!gfc_cpp_enabled ())
>      {
>        linemap_add (line_table, LC_ENTER, false, gfc_source_file, 1);
> -      linemap_add (line_table, LC_RENAME, false, "<built-in>", 0);
> +      linemap_add (line_table, LC_RENAME, false,
> special_fname_builtin (), 0);
>      }
>    else
>      gfc_cpp_init_0 ();
> diff --git a/gcc/input.cc b/gcc/input.cc
> index a28abfac5ac..483cb6e940d 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -29,6 +29,12 @@ along with GCC; see the file COPYING3.  If not see
>  #define HAVE_ICONV 0
>  #endif
>  
> +const char *
> +special_fname_builtin ()
> +{
> +  return _("<built-in>");
> +}
> +
>  /* Input charset configuration.  */
>  static const char *default_charset_callback (const char *)
>  {
> @@ -275,7 +281,7 @@ expand_location_1 (location_t loc,
>  
>    xloc.data = block;
>    if (loc <= BUILTINS_LOCATION)
> -    xloc.file = loc == UNKNOWN_LOCATION ? NULL : _("<built-in>");
> +    xloc.file = loc == UNKNOWN_LOCATION ? NULL :
> special_fname_builtin ();
>  
>    return xloc;
>  }
> @@ -2102,7 +2108,7 @@ test_unknown_location ()
>  static void
>  test_builtins ()
>  {
> -  assert_loceq (_("<built-in>"), 0, 0, BUILTINS_LOCATION);
> +  assert_loceq (special_fname_builtin (), 0, 0, BUILTINS_LOCATION);
>    ASSERT_PRED1 (is_location_from_builtin_token, BUILTINS_LOCATION);
>  }
>  
> diff --git a/gcc/input.h b/gcc/input.h
> index 11c571d076f..0b23e66e53b 100644
> --- a/gcc/input.h
> +++ b/gcc/input.h
> @@ -32,6 +32,9 @@ extern GTY(()) class line_maps *saved_line_table;
>  /* The location for declarations in "<built-in>" */
>  #define BUILTINS_LOCATION ((location_t) 1)
>  
> +/* Returns the translated string referring to the special location. 
> */
> +const char *special_fname_builtin ();
> +
>  /* line-map.cc reserves RESERVED_LOCATION_COUNT to the user.  Ensure
>     both UNKNOWN_LOCATION and BUILTINS_LOCATION fit into that.  */
>  STATIC_ASSERT (BUILTINS_LOCATION < RESERVED_LOCATION_COUNT);
> 



* Re: [PATCH 5/6] diagnostics: Support generated data in additional contexts
  2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
@ 2022-11-04 16:42   ` David Malcolm
  2022-11-04 21:05     ` Lewis Hyatt
  0 siblings, 1 reply; 18+ messages in thread
From: David Malcolm @ 2022-11-04 16:42 UTC (permalink / raw)
  To: Lewis Hyatt, gcc-patches

On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> Add awareness that diagnostic locations may be in generated buffers rather
> than an actual file to other places in the diagnostics code that may care,
> most notably SARIF output (which needs to obtain its own snapshots of the
> code involved). For edit context output, which outputs fixit hints as diffs,
> for now just make sure we ignore generated data buffers. At the moment, there
> is no ability for a fixit hint to be generated in such a buffer.
> 
> Because SARIF uses JSON as well, also add the ability to the json::string
> class to handle a buffer with nulls in the middle (since we place no
> restriction on LC_GEN content) by providing the option to specify the data
> length.

Please can you split this patch into three parts:
- the SARIF part
- the json changes
- the edit-context.cc changes (I think this at least counts as an
"obvious" change with respect to the other changes in the kit, though
I'm still working my way through patch 4 in the kit).

Please add a DejaGnu testcase to the SARIF part, with a diagnostic that
references a generated data buffer; see
  gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-*.c 
for examples of SARIF testcases.

Please add a selftest to the json change so that we have a unit test of
constructing a json::string with an embedded NUL, and how we serialize
such a string (probably to json.cc's test_writing_strings)

Thanks
Dave
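
As a rough idea of what such a unit test would exercise, here is a standalone
sketch of length-aware string escaping. It is not json.cc: toy_json_escape is
a hypothetical helper that uses std::string instead of pretty_printer. One
assumption worth flagging: JSON (RFC 8259) defines no "\0" escape, so this
sketch writes an embedded NUL, and any other control character without a
short form, as a \u00XX escape. The patch as posted below emits "\0" instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Escape LEN bytes starting at UTF8 as a quoted JSON string literal.
// The buffer need not be NUL-terminated and may contain NUL bytes.
std::string toy_json_escape (const char *utf8, std::size_t len)
{
  assert (utf8);
  std::string out = "\"";
  for (std::size_t i = 0; i != len; ++i)
    {
      const unsigned char ch = static_cast<unsigned char> (utf8[i]);
      switch (ch)
	{
	case '"':  out += "\\\""; break;
	case '\\': out += "\\\\"; break;
	case '\b': out += "\\b"; break;
	case '\f': out += "\\f"; break;
	case '\n': out += "\\n"; break;
	case '\r': out += "\\r"; break;
	case '\t': out += "\\t"; break;
	default:
	  if (ch < 0x20)
	    {
	      /* Remaining control characters, including NUL: \u00XX.  */
	      char buf[7];
	      std::snprintf (buf, sizeof buf, "\\u%04x",
			     static_cast<unsigned> (ch));
	      out += buf;
	    }
	  else
	    out += static_cast<char> (ch);
	}
    }
  out += '"';
  return out;
}
```

A selftest along these lines would pass the three bytes 'a', NUL, 'b' with an
explicit length of 3 and check that all three survive serialization, rather
than the output stopping at the first NUL.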

> 
> gcc/ChangeLog:
> 
>         * diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New
> function.
>         (sarif_builder::maybe_make_physical_location_object): Support
>         generated data locations.
>         (sarif_builder::make_artifact_location_object): Likewise.
>         (sarif_builder::maybe_make_region_object_for_context):
> Likewise.
>         (sarif_builder::make_artifact_object): Likewise.
>         (sarif_builder::maybe_make_artifact_content_object):
> Likewise.
>         (get_source_lines): Likewise.
>         * edit-context.cc (edit_context::apply_fixit): Ignore
> generated
>         locations if one should make its way this far.
>         * json.cc (string::string): Support non-null-terminated
> string.
>         (string::print): Likewise.
>         * json.h (class string): Likewise.
> ---
>  gcc/diagnostic-format-sarif.cc | 86 +++++++++++++++++++++-----------
> --
>  gcc/edit-context.cc            |  4 ++
>  gcc/json.cc                    | 17 +++++--
>  gcc/json.h                     |  5 +-
>  4 files changed, 75 insertions(+), 37 deletions(-)
> 
> diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-
> sarif.cc
> index 7110db4edd6..c2d18a1a16e 100644
> --- a/gcc/diagnostic-format-sarif.cc
> +++ b/gcc/diagnostic-format-sarif.cc
> @@ -125,7 +125,10 @@ private:
>    json::array *maybe_make_kinds_array (diagnostic_event::meaning m)
> const;
>    json::object *maybe_make_physical_location_object (location_t
> loc);
>    json::object *make_artifact_location_object (location_t loc);
> -  json::object *make_artifact_location_object (const char
> *filename);
> +
> +  typedef std::pair<const char *, unsigned int> filename_or_buffer;
> +  json::object *make_artifact_location_object (filename_or_buffer
> fb);
> +
>    json::object *make_artifact_location_object_for_pwd () const;
>    json::object *maybe_make_region_object (location_t loc) const;
>    json::object *maybe_make_region_object_for_context (location_t
> loc) const;
> @@ -146,16 +149,17 @@ private:
>    json::object *make_reporting_descriptor_object_for_cwe_id (int
> cwe_id) const;
>    json::object *
>    make_reporting_descriptor_reference_object_for_cwe_id (int
> cwe_id);
> -  json::object *make_artifact_object (const char *filename);
> -  json::object *maybe_make_artifact_content_object (const char
> *filename) const;
> -  json::object *maybe_make_artifact_content_object (const char
> *filename,
> -                                                   int start_line,
> +  json::object *make_artifact_object (filename_or_buffer fb);
> +  json::object *
> +  maybe_make_artifact_content_object (filename_or_buffer fb) const;
> +  json::object *maybe_make_artifact_content_object
> (expanded_location xloc,
>                                                     int end_line)
> const;
>    json::object *make_fix_object (const rich_location &rich_loc);
>    json::object *make_artifact_change_object (const rich_location
> &richloc);
>    json::object *make_replacement_object (const fixit_hint &hint)
> const;
>    json::object *make_artifact_content_object (const char *text)
> const;
>    int get_sarif_column (expanded_location exploc) const;
> +  static filename_or_buffer xloc_to_fb (expanded_location xloc);
>  
>    diagnostic_context *m_context;
>  
> @@ -166,7 +170,11 @@ private:
>       diagnostic group.  */
>    sarif_result *m_cur_group_result;
>  
> -  hash_set <const char *> m_filenames;
> +  /* If the second member is >0, then this is a buffer of generated
> content,
> +     with that length, not a filename.  */
> +  hash_set <pair_hash <nofree_ptr_hash <const char>,
> +                      int_hash <unsigned int, -1U> >
> +           > m_filenames;
>    bool m_seen_any_relative_paths;
>    hash_set <free_string_hash> m_rule_id_set;
>    json::array *m_rules_arr;
> @@ -588,6 +596,15 @@ sarif_builder::make_location_object (const
> diagnostic_event &event)
>    return location_obj;
>  }
>  
> +/* Populate a filename_or_buffer pair from an expanded location.  */
> +sarif_builder::filename_or_buffer
> +sarif_builder::xloc_to_fb (expanded_location xloc)
> +{
> +  if (xloc.generated_data_len)
> +    return filename_or_buffer (xloc.generated_data,
> xloc.generated_data_len);
> +  return filename_or_buffer (xloc.file, 0);
> +}
> +
>  /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for
> LOC,
>     or return NULL;
>     Add any filename to the m_artifacts.  */
> @@ -603,7 +620,7 @@
> sarif_builder::maybe_make_physical_location_object (location_t loc)
>    /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
>    json::object *artifact_loc_obj = make_artifact_location_object
> (loc);
>    phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
> -  m_filenames.add (LOCATION_FILE (loc));
> +  m_filenames.add (xloc_to_fb (expand_location (loc)));
>  
>    /* "region" property (SARIF v2.1.0 section 3.29.4).  */
>    if (json::object *region_obj = maybe_make_region_object (loc))
> @@ -627,7 +644,7 @@
> sarif_builder::maybe_make_physical_location_object (location_t loc)
>  json::object *
>  sarif_builder::make_artifact_location_object (location_t loc)
>  {
> -  return make_artifact_location_object (LOCATION_FILE (loc));
> +  return make_artifact_location_object (xloc_to_fb (expand_location
> (loc)));
>  }
>  
>  /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0
> section 3.4.4)
> @@ -639,10 +656,12 @@ sarif_builder::make_artifact_location_object
> (location_t loc)
>     or return NULL.  */
>  
>  json::object *
> -sarif_builder::make_artifact_location_object (const char *filename)
> +sarif_builder::make_artifact_location_object (filename_or_buffer fb)
>  {
>    json::object *artifact_loc_obj = new json::object ();
>  
> +  const auto filename = (fb.second ? special_fname_generated () :
> fb.first);
> +
>    /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
>    artifact_loc_obj->set ("uri", new json::string (filename));
>  
> @@ -795,9 +814,7 @@
> sarif_builder::maybe_make_region_object_for_context (location_t loc)
> const
>  
>    /* "snippet" property (SARIF v2.1.0 section 3.30.13).  */
>    if (json::object *artifact_content_obj
> -        = maybe_make_artifact_content_object (exploc_start.file,
> -                                              exploc_start.line,
> -                                              exploc_finish.line))
> +       = maybe_make_artifact_content_object (exploc_start,
> exploc_finish.line))
>      region_obj->set ("snippet", artifact_content_obj);
>  
>    return region_obj;
> @@ -1248,24 +1265,24 @@ sarif_builder::maybe_make_cwe_taxonomy_object
> () const
>  /* Make an artifact object (SARIF v2.1.0 section 3.24).  */
>  
>  json::object *
> -sarif_builder::make_artifact_object (const char *filename)
> +sarif_builder::make_artifact_object (filename_or_buffer fb)
>  {
>    json::object *artifact_obj = new json::object ();
>  
>    /* "location" property (SARIF v2.1.0 section 3.24.2).  */
> -  json::object *artifact_loc_obj = make_artifact_location_object
> (filename);
> +  json::object *artifact_loc_obj = make_artifact_location_object
> (fb);
>    artifact_obj->set ("location", artifact_loc_obj);
>  
>    /* "contents" property (SARIF v2.1.0 section 3.24.8).  */
>    if (json::object *artifact_content_obj
> -       = maybe_make_artifact_content_object (filename))
> +       = maybe_make_artifact_content_object (fb))
>      artifact_obj->set ("contents", artifact_content_obj);
>  
>    /* "sourceLanguage" property (SARIF v2.1.0 section 3.24.10).  */
>    if (m_context->m_client_data_hooks)
>      if (const char *source_lang
>         = m_context->m_client_data_hooks-
> >maybe_get_sarif_source_language
> -           (filename))
> +           (fb.first))
>        artifact_obj->set ("sourceLanguage", new json::string
> (source_lang));
>  
>    return artifact_obj;
> @@ -1331,16 +1348,21 @@ maybe_read_file (const char *filename)
>     full contents of FILENAME.  */
>  
>  json::object *
> -sarif_builder::maybe_make_artifact_content_object (const char
> *filename) const
> +sarif_builder::maybe_make_artifact_content_object
> (filename_or_buffer fb) const
>  {
> -  char *text_utf8 = maybe_read_file (filename);
> -  if (!text_utf8)
> -    return NULL;
> -
> -  json::object *artifact_content_obj = new json::object ();
> -  artifact_content_obj->set ("text", new json::string (text_utf8));
> -  free (text_utf8);
> -
> +  json::object *artifact_content_obj = nullptr;
> +  if (fb.second)
> +    {
> +      artifact_content_obj = new json::object ();
> +      artifact_content_obj->set ("text", new json::string (fb.first,
> +                                                         
> fb.second));
> +    }
> +  else if (char *text_utf8 = maybe_read_file (fb.first))
> +    {
> +      artifact_content_obj = new json::object ();
> +      artifact_content_obj->set ("text", new json::string
> (text_utf8));
> +      free (text_utf8);
> +    }
>    return artifact_content_obj;
>  }
>  
> @@ -1348,15 +1370,14 @@
> sarif_builder::maybe_make_artifact_content_object (const char
> *filename) const
>     a freshly-allocated 0-terminated buffer containing them, or
> NULL.  */
>  
>  static char *
> -get_source_lines (const char *filename,
> -                 int start_line,
> +get_source_lines (expanded_location xloc,
>                   int end_line)
>  {
>    auto_vec<char> result;
>  
> -  for (int line = start_line; line <= end_line; line++)
> +  for (int line = xloc.line; line <= end_line; line++)
>      {
> -      char_span line_content = location_get_source_line (filename,
> line);
> +      char_span line_content = location_get_source_line (xloc,
> line);
>        if (!line_content.get_buffer ())
>         return NULL;
>        result.reserve (line_content.length () + 1);
> @@ -1370,14 +1391,13 @@ get_source_lines (const char *filename,
>  }
>  
>  /* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the
> given
> -   run of lines within FILENAME (including the endpoints).  */
> +   run of lines starting at XLOC (including the endpoints).  */
>  
>  json::object *
> -sarif_builder::maybe_make_artifact_content_object (const char
> *filename,
> -                                                  int start_line,
> +sarif_builder::maybe_make_artifact_content_object (expanded_location
> xloc,
>                                                    int end_line)
> const
>  {
> -  char *text_utf8 = get_source_lines (filename, start_line,
> end_line);
> +  char *text_utf8 = get_source_lines (xloc, end_line);
>  
>    if (!text_utf8)
>      return NULL;
> diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
> index 6879ddd41b4..aa95bc0834f 100644
> --- a/gcc/edit-context.cc
> +++ b/gcc/edit-context.cc
> @@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint
> *hint)
>      return false;
>    if (start.column == 0)
>      return false;
> +  if (start.generated_data)
> +    return false;
>    if (next_loc.column == 0)
>      return false;
> +  if (next_loc.generated_data)
> +    return false;
>  
>    edited_file &file = get_or_insert_file (start.file);
>    if (!m_valid)
> diff --git a/gcc/json.cc b/gcc/json.cc
> index 974f8c36825..3ebe8495e96 100644
> --- a/gcc/json.cc
> +++ b/gcc/json.cc
> @@ -190,6 +190,15 @@ string::string (const char *utf8)
>  {
>    gcc_assert (utf8);
>    m_utf8 = xstrdup (utf8);
> +  m_len = strlen (utf8);
> +}
> +
> +string::string (const char *utf8, size_t len)
> +{
> +  gcc_assert (utf8);
> +  m_utf8 = XNEWVEC (char, len);
> +  m_len = len;
> +  memcpy (m_utf8, utf8, len);
>  }
>  
>  /* Implementation of json::value::print for json::string.  */
> @@ -198,9 +207,9 @@ void
>  string::print (pretty_printer *pp) const
>  {
>    pp_character (pp, '"');
> -  for (const char *ptr = m_utf8; *ptr; ptr++)
> +  for (size_t i = 0; i != m_len; ++i)
>      {
> -      char ch = *ptr;
> +      char ch = m_utf8[i];
>        switch (ch)
>         {
>         case '"':
> @@ -224,7 +233,9 @@ string::print (pretty_printer *pp) const
>         case '\t':
>           pp_string (pp, "\\t");
>           break;
> -
> +       case '\0':
> +         pp_string (pp, "\\0");
> +         break;
>         default:
>           pp_character (pp, ch);
>         }
> diff --git a/gcc/json.h b/gcc/json.h
> index f272981259b..f7afd843dc5 100644
> --- a/gcc/json.h
> +++ b/gcc/json.h
> @@ -156,16 +156,19 @@ class integer_number : public value
>  class string : public value
>  {
>   public:
> -  string (const char *utf8);
> +  explicit string (const char *utf8);
> +  string (const char *utf8, size_t len);
>    ~string () { free (m_utf8); }
>  
>    enum kind get_kind () const final override { return JSON_STRING; }
>    void print (pretty_printer *pp) const final override;
>  
>    const char *get_string () const { return m_utf8; }
> +  size_t get_length () const { return m_len; }
>  
>   private:
>    char *m_utf8;
> +  size_t m_len;
>  };
>  
>  /* Subclass of value for the three JSON literals "true", "false",
> 



* Re: [PATCH 5/6] diagnostics: Support generated data in additional contexts
  2022-11-04 16:42   ` David Malcolm
@ 2022-11-04 21:05     ` Lewis Hyatt
  2022-11-05  1:54       ` [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string David Malcolm
  2022-11-05  1:55       ` [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context David Malcolm
  0 siblings, 2 replies; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-04 21:05 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2765 bytes --]

On Fri, Nov 04, 2022 at 12:42:29PM -0400, David Malcolm wrote:
> On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> > Add awareness that diagnostic locations may be in generated buffers
> > rather than an actual file to other places in the diagnostics code that
> > may care, most notably SARIF output (which needs to obtain its own
> > snapshots of the code involved). For edit context output, which outputs
> > fixit hints as diffs, for now just make sure we ignore generated data
> > buffers. At the moment, there is no ability for a fixit hint to be
> > generated in such a buffer.
> > 
> > Because SARIF uses JSON as well, also add the ability to the json::string
> > class to handle a buffer with nulls in the middle (since we place no
> > restriction on LC_GEN content) by providing the option to specify the
> > data length.
> 
> Please can you split this patch into three parts:
> - the SARIF part
> - the json changes
> - the edit-context.cc changes (I think this at least counts as an
> "obvious" change with respect to the other changes in the kit, though
> I'm still working my way through patch 4 in the kit).
> 
> Please add a DejaGnu testcase to the SARIF part, with a diagnostic that
> references a generated data buffer; see
>   gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-*.c 
> for examples of SARIF testcases.
> 
> Please add a selftest to the json change so that we have a unit test of
> constructing a json::string with an embedded NUL, and how we serialize
> such a string (probably to json.cc's test_writing_strings)
> 
> Thanks
> Dave

Yes, certainly, sorry for not splitting it up more to start with. Regarding
the SARIF testcase, it's not that easy to get SARIF output to actually include
generated data, because as of now it can only appear in a _Pragma, and SARIF
does not output macro definitions currently. The only way I know to do it is
to make use of -fdump-internal-locations, which generates top-level inform()
calls inside the _Pragma that can end up in the SARIF output. So I wrote a
testcase that does this, but I am not sure how you will feel about having the
testsuite rely on this internal debugging option.

I wasn't sure of the best way to send the 3 split-up patches. I attached
them here as 5a/6, 5b/6, 5c/6, in case that's right, but I wasn't sure if I
should just resend the whole batch (minus perhaps the 2 you have already
acked), and/or if I should wait for feedback on the other patches first. Happy
to do whatever makes it easier for you, and thanks for your time! Note that the
new SARIF patch (5c/6) now needs to come last in the series, after the patch
6/6 that actually supports _Pragma, so that the new testcase can make use of
that.

-Lewis

[-- Attachment #2: Pragma_patch_5a.txt --]
[-- Type: text/plain, Size: 1119 bytes --]

[PATCH 5a/6] diagnostics: Handle generated data locations in edit_context

Class edit_context handles outputting fixit hints in diff form that could be
manually or automatically applied by the user. This will not make sense for
generated data locations, such as the contents of a _Pragma string, because
the text to be modified does not appear in the user's input files. We do not
currently ever generate fixit hints in such a context, but for future-proofing
purposes, ignore such locations in edit context now.

gcc/ChangeLog:

	* edit-context.cc (edit_context::apply_fixit): Ignore locations in
	generated data.

diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
index 6879ddd41b4..aa95bc0834f 100644
--- a/gcc/edit-context.cc
+++ b/gcc/edit-context.cc
@@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint *hint)
     return false;
   if (start.column == 0)
     return false;
+  if (start.generated_data)
+    return false;
   if (next_loc.column == 0)
     return false;
+  if (next_loc.generated_data)
+    return false;
 
   edited_file &file = get_or_insert_file (start.file);
   if (!m_valid)

[-- Attachment #3: Pragma_patch_5b.txt --]
[-- Type: text/plain, Size: 2940 bytes --]

[PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string

json::string currently handles null-terminated data and so can't work with
data that may contain embedded null bytes or that is not null-terminated.
Supporting such data will make json::string more robust in some contexts, such
as SARIF output, which uses it to output user source code that may contain
embedded null bytes.

gcc/ChangeLog:

	* json.h (class string): Add M_LEN member to store the length of
	the data.  Add constructor taking an explicit length.
	* json.cc (string::string): Implement the new constructor.
	(string::print): Support printing strings that are not null-terminated.
	Escape embedded null bytes on output.
	(test_writing_strings): Test the new null-byte-related features of
	json::string.

diff --git a/gcc/json.cc b/gcc/json.cc
index 974f8c36825..3a79cac02ac 100644
--- a/gcc/json.cc
+++ b/gcc/json.cc
@@ -190,6 +190,15 @@ string::string (const char *utf8)
 {
   gcc_assert (utf8);
   m_utf8 = xstrdup (utf8);
+  m_len = strlen (utf8);
+}
+
+string::string (const char *utf8, size_t len)
+{
+  gcc_assert (utf8);
+  m_utf8 = XNEWVEC (char, len);
+  m_len = len;
+  memcpy (m_utf8, utf8, len);
 }
 
 /* Implementation of json::value::print for json::string.  */
@@ -198,9 +207,9 @@ void
 string::print (pretty_printer *pp) const
 {
   pp_character (pp, '"');
-  for (const char *ptr = m_utf8; *ptr; ptr++)
+  for (size_t i = 0; i != m_len; ++i)
     {
-      char ch = *ptr;
+      char ch = m_utf8[i];
       switch (ch)
 	{
 	case '"':
@@ -224,7 +233,9 @@ string::print (pretty_printer *pp) const
 	case '\t':
 	  pp_string (pp, "\\t");
 	  break;
-
+	case '\0':
+	  pp_string (pp, "\\0");
+	  break;
 	default:
 	  pp_character (pp, ch);
 	}
@@ -341,6 +352,12 @@ test_writing_strings ()
 
   string contains_quotes ("before \"quoted\" after");
   assert_print_eq (contains_quotes, "\"before \\\"quoted\\\" after\"");
+
+  const char data[] = {'a', 'b', 'c', 'd', '\0', 'e', 'f'};
+  string not_terminated (data, 3);
+  assert_print_eq (not_terminated, "\"abc\"");
+  string embedded_null (data, sizeof data);
+  assert_print_eq (embedded_null, "\"abcd\\0ef\"");
 }
 
 /* Verify that JSON literals are written correctly.  */
diff --git a/gcc/json.h b/gcc/json.h
index f272981259b..f7afd843dc5 100644
--- a/gcc/json.h
+++ b/gcc/json.h
@@ -156,16 +156,19 @@ class integer_number : public value
 class string : public value
 {
  public:
-  string (const char *utf8);
+  explicit string (const char *utf8);
+  string (const char *utf8, size_t len);
   ~string () { free (m_utf8); }
 
   enum kind get_kind () const final override { return JSON_STRING; }
   void print (pretty_printer *pp) const final override;
 
   const char *get_string () const { return m_utf8; }
+  size_t get_length () const { return m_len; }
 
  private:
   char *m_utf8;
+  size_t m_len;
 };
 
 /* Subclass of value for the three JSON literals "true", "false",

[-- Attachment #4: Pragma_patch_5c.txt --]
[-- Type: text/plain, Size: 12168 bytes --]

[PATCH 5c/6] diagnostics: Support generated data locations in SARIF output

The diagnostics routines for SARIF output need to read the source code back
in, so that they can generate "snippet" and "content" records, so they need to
be able to cope with generated data locations.  Add support for that in
diagnostic-format-sarif.cc.

gcc/ChangeLog:

	* diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New function.
	(sarif_builder::maybe_make_physical_location_object): Support
	generated data locations.
	(sarif_builder::make_artifact_location_object): Likewise.
	(sarif_builder::maybe_make_region_object_for_context): Likewise.
	(sarif_builder::make_artifact_object): Likewise.
	(sarif_builder::maybe_make_artifact_content_object): Likewise.
	(get_source_lines): Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/diagnostic-format-sarif-file-5.c: New test.

diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 7110db4edd6..81141c9358f 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -125,7 +125,10 @@ private:
   json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const;
   json::object *maybe_make_physical_location_object (location_t loc);
   json::object *make_artifact_location_object (location_t loc);
-  json::object *make_artifact_location_object (const char *filename);
+
+  typedef std::pair<const char *, unsigned int> filename_or_buffer;
+  json::object *make_artifact_location_object (filename_or_buffer fb);
+
   json::object *make_artifact_location_object_for_pwd () const;
   json::object *maybe_make_region_object (location_t loc) const;
   json::object *maybe_make_region_object_for_context (location_t loc) const;
@@ -146,16 +149,17 @@ private:
   json::object *make_reporting_descriptor_object_for_cwe_id (int cwe_id) const;
   json::object *
   make_reporting_descriptor_reference_object_for_cwe_id (int cwe_id);
-  json::object *make_artifact_object (const char *filename);
-  json::object *maybe_make_artifact_content_object (const char *filename) const;
-  json::object *maybe_make_artifact_content_object (const char *filename,
-						    int start_line,
+  json::object *make_artifact_object (filename_or_buffer fb);
+  json::object *
+  maybe_make_artifact_content_object (filename_or_buffer fb) const;
+  json::object *maybe_make_artifact_content_object (expanded_location xloc,
 						    int end_line) const;
   json::object *make_fix_object (const rich_location &rich_loc);
   json::object *make_artifact_change_object (const rich_location &richloc);
   json::object *make_replacement_object (const fixit_hint &hint) const;
   json::object *make_artifact_content_object (const char *text) const;
   int get_sarif_column (expanded_location exploc) const;
+  static filename_or_buffer xloc_to_fb (expanded_location xloc);
 
   diagnostic_context *m_context;
 
@@ -166,7 +170,11 @@ private:
      diagnostic group.  */
   sarif_result *m_cur_group_result;
 
-  hash_set <const char *> m_filenames;
+  /* If the second member is >0, then this is a buffer of generated content,
+     with that length, not a filename.  */
+  hash_set <pair_hash <nofree_ptr_hash <const char>,
+		       int_hash <unsigned int, -1U> >
+	    > m_filenames;
   bool m_seen_any_relative_paths;
   hash_set <free_string_hash> m_rule_id_set;
   json::array *m_rules_arr;
@@ -588,6 +596,15 @@ sarif_builder::make_location_object (const diagnostic_event &event)
   return location_obj;
 }
 
+/* Populate a filename_or_buffer pair from an expanded location.  */
+sarif_builder::filename_or_buffer
+sarif_builder::xloc_to_fb (expanded_location xloc)
+{
+  if (xloc.generated_data_len)
+    return filename_or_buffer (xloc.generated_data, xloc.generated_data_len);
+  return filename_or_buffer (xloc.file, 0);
+}
+
 /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC,
    or return NULL;
    Add any filename to the m_artifacts.  */
@@ -603,7 +620,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
   /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
   json::object *artifact_loc_obj = make_artifact_location_object (loc);
   phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
-  m_filenames.add (LOCATION_FILE (loc));
+  m_filenames.add (xloc_to_fb (expand_location (loc)));
 
   /* "region" property (SARIF v2.1.0 section 3.29.4).  */
   if (json::object *region_obj = maybe_make_region_object (loc))
@@ -627,7 +644,7 @@ sarif_builder::maybe_make_physical_location_object (location_t loc)
 json::object *
 sarif_builder::make_artifact_location_object (location_t loc)
 {
-  return make_artifact_location_object (LOCATION_FILE (loc));
+  return make_artifact_location_object (xloc_to_fb (expand_location (loc)));
 }
 
 /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0 section 3.4.4)
@@ -639,10 +656,12 @@ sarif_builder::make_artifact_location_object (location_t loc)
    or return NULL.  */
 
 json::object *
-sarif_builder::make_artifact_location_object (const char *filename)
+sarif_builder::make_artifact_location_object (filename_or_buffer fb)
 {
   json::object *artifact_loc_obj = new json::object ();
 
+  const auto filename = (fb.second ? special_fname_generated () : fb.first);
+
   /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
   artifact_loc_obj->set ("uri", new json::string (filename));
 
@@ -795,9 +814,7 @@ sarif_builder::maybe_make_region_object_for_context (location_t loc) const
 
   /* "snippet" property (SARIF v2.1.0 section 3.30.13).  */
   if (json::object *artifact_content_obj
-	 = maybe_make_artifact_content_object (exploc_start.file,
-					       exploc_start.line,
-					       exploc_finish.line))
+	= maybe_make_artifact_content_object (exploc_start, exploc_finish.line))
     region_obj->set ("snippet", artifact_content_obj);
 
   return region_obj;
@@ -1248,24 +1265,24 @@ sarif_builder::maybe_make_cwe_taxonomy_object () const
 /* Make an artifact object (SARIF v2.1.0 section 3.24).  */
 
 json::object *
-sarif_builder::make_artifact_object (const char *filename)
+sarif_builder::make_artifact_object (filename_or_buffer fb)
 {
   json::object *artifact_obj = new json::object ();
 
   /* "location" property (SARIF v2.1.0 section 3.24.2).  */
-  json::object *artifact_loc_obj = make_artifact_location_object (filename);
+  json::object *artifact_loc_obj = make_artifact_location_object (fb);
   artifact_obj->set ("location", artifact_loc_obj);
 
   /* "contents" property (SARIF v2.1.0 section 3.24.8).  */
   if (json::object *artifact_content_obj
-	= maybe_make_artifact_content_object (filename))
+	= maybe_make_artifact_content_object (fb))
     artifact_obj->set ("contents", artifact_content_obj);
 
   /* "sourceLanguage" property (SARIF v2.1.0 section 3.24.10).  */
   if (m_context->m_client_data_hooks)
     if (const char *source_lang
 	= m_context->m_client_data_hooks->maybe_get_sarif_source_language
-	    (filename))
+	    (fb.first))
       artifact_obj->set ("sourceLanguage", new json::string (source_lang));
 
   return artifact_obj;
@@ -1331,34 +1348,40 @@ maybe_read_file (const char *filename)
    full contents of FILENAME.  */
 
 json::object *
-sarif_builder::maybe_make_artifact_content_object (const char *filename) const
+sarif_builder::maybe_make_artifact_content_object (filename_or_buffer fb) const
 {
-  char *text_utf8 = maybe_read_file (filename);
-  if (!text_utf8)
-    return NULL;
-
-  json::object *artifact_content_obj = new json::object ();
-  artifact_content_obj->set ("text", new json::string (text_utf8));
-  free (text_utf8);
-
+  json::object *artifact_content_obj = nullptr;
+  if (fb.second)
+    {
+      artifact_content_obj = new json::object ();
+      artifact_content_obj->set ("text", new json::string (fb.first,
+							   fb.second));
+    }
+  else if (char *text_utf8 = maybe_read_file (fb.first))
+    {
+      artifact_content_obj = new json::object ();
+      artifact_content_obj->set ("text", new json::string (text_utf8));
+      free (text_utf8);
+    }
   return artifact_content_obj;
 }
 
 /* Attempt to read the given range of lines from FILENAME; return
-   a freshly-allocated 0-terminated buffer containing them, or NULL.  */
+   a freshly-allocated buffer containing them, or NULL.
+   The buffer is null-terminated, but could also contain embedded null
+   bytes, so the char_span's length() accessor should be used.  */
 
-static char *
-get_source_lines (const char *filename,
-		  int start_line,
+static char_span
+get_source_lines (expanded_location xloc,
 		  int end_line)
 {
   auto_vec<char> result;
 
-  for (int line = start_line; line <= end_line; line++)
+  for (int line = xloc.line; line <= end_line; line++)
     {
-      char_span line_content = location_get_source_line (filename, line);
+      char_span line_content = location_get_source_line (xloc, line);
       if (!line_content.get_buffer ())
-	return NULL;
+	return char_span (nullptr, 0);
       result.reserve (line_content.length () + 1);
       for (size_t i = 0; i < line_content.length (); i++)
 	result.quick_push (line_content[i]);
@@ -1366,26 +1389,25 @@ get_source_lines (const char *filename,
     }
   result.safe_push ('\0');
 
-  return xstrdup (result.address ());
+  return char_span (xstrdup (result.address ()), result.length() - 1);
 }
 
 /* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the given
-   run of lines within FILENAME (including the endpoints).  */
+   run of lines starting at XLOC (including the endpoints).  */
 
 json::object *
-sarif_builder::maybe_make_artifact_content_object (const char *filename,
-						   int start_line,
+sarif_builder::maybe_make_artifact_content_object (expanded_location xloc,
 						   int end_line) const
 {
-  char *text_utf8 = get_source_lines (filename, start_line, end_line);
+  const char_span text_utf8 = get_source_lines (xloc, end_line);
 
   if (!text_utf8)
     return NULL;
 
   json::object *artifact_content_obj = new json::object ();
-  artifact_content_obj->set ("text", new json::string (text_utf8));
-  free (text_utf8);
-
+  artifact_content_obj->set ("text", new json::string (text_utf8.get_buffer (),
+						       text_utf8.length ()));
+  free (const_cast<char *> (text_utf8.get_buffer ()));
   return artifact_content_obj;
 }
 
diff --git a/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c
new file mode 100644
index 00000000000..2ca6a069d3f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-5.c
@@ -0,0 +1,31 @@
+/* The goal is to test SARIF output of generated data, such as a _Pragma string.
+   But SARIF output as of yet does not output macro definitions, so such
+   generated data buffers never end up in the typical SARIF output.  One way we
+   can achieve it is to use -fdump-internal-locations, which outputs top-level
+   diagnostic notes inside macro definitions, that SARIF will end up processing.
+   It also outputs a lot of other stuff to stderr (not to the SARIF file) that
+   is not relevant to this test, so we use a blanket dg-regexp to filter all of
+   that away.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file -fdump-internal-locations" } */
+/* { dg-allow-blank-lines-in-output "" } */
+
+_Pragma("GCC diagnostic push")
+
+/* { dg-regexp {(.|[\n\r])*} } */
+
+/* Because of the way -fdump-internal-locations works, these regexes themselves
+   will end up in the sarif output also.  But due to the escaping, they don't
+   match themselves, so they still test what we need.  */
+
+/* Four of this pair are output for the tokens inside the
+   _Pragma string (3 plus a PRAGMA_EOL).  */
+
+/* { dg-final { scan-sarif-file "\"artifactLocation\": \{\"uri\": \"<generated>\"," } } */
+/* { dg-final { scan-sarif-file "\"snippet\": \{\"text\": \"GCC diagnostic push\\\\n\"" } } */
+
+/* One of this pair is output for the overall internal location.  */
+
+/* { dg-final { scan-sarif-file "\{\"location\": \{\"uri\": \"<generated>\"," } } */
+/* { dg-final { scan-sarif-file "\"contents\": \{\"text\": \"GCC diagnostic push\\\\n\\\\0" } } */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string
  2022-11-04 21:05     ` Lewis Hyatt
@ 2022-11-05  1:54       ` David Malcolm
  2022-11-05  1:55       ` [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context David Malcolm
  1 sibling, 0 replies; 18+ messages in thread
From: David Malcolm @ 2022-11-05  1:54 UTC (permalink / raw)
  To: Lewis Hyatt; +Cc: gcc-patches

On Fri, 2022-11-04 at 17:05 -0400, Lewis Hyatt wrote:
> [PATCH 5b/6] diagnostics: Remove null-termination requirement for
> json::string
> 
> json::string currently handles null-terminated data and so can't work
> with
> data that may contain embedded null bytes or that is not null-
> terminated.
> Supporting such data will make json::string more robust in some
> contexts, such
> as SARIF output, which uses it to output user source code that may
> contain
> embedded null bytes.
> 
> gcc/ChangeLog:
> 
> 	* json.h (class string): Add M_LEN member to store the
> length of
> 	the data.  Add constructor taking an explicit length.
> 	* json.cc (string::string):  Implement the new constructor.
> 	(string::print): Support printing strings that are not null-
> terminated.
> 	Escape embedded null bytes on output.
> 	(test_writing_strings): Test the new null-byte-related
> features of
> 	json::string.
> 

[...snip...]

> diff --git a/gcc/json.h b/gcc/json.h
> index f272981259b..f7afd843dc5 100644
> --- a/gcc/json.h
> +++ b/gcc/json.h
> @@ -156,16 +156,19 @@ class integer_number : public value
>  class string : public value
>  {
>   public:
> -  string (const char *utf8);
> +  explicit string (const char *utf8);
> +  string (const char *utf8, size_t len);
>    ~string () { free (m_utf8); }
>  
>    enum kind get_kind () const final override { return JSON_STRING; }
>    void print (pretty_printer *pp) const final override;
>  
>    const char *get_string () const { return m_utf8; }

I worried that json::string::get_string previously returned a NUL-
terminated string, but now there's no guarantee of termination, and
that this might break something.  But I checked, and it seems that this
accessor doesn't get used anywhere in our source tree.

> +  size_t get_length () const { return m_len; }

Does anything actually use this?

Perhaps it might make sense to delete the get_string accessor, and if
we ever need one, replace it with an accessor that returns a char_span?
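
As a rough illustration of that idea (not code from the patch), the sketch
below pairs the pointer with its length in a small span type, so callers
cannot accidentally rely on NUL termination, and escapes bytes while printing.
The names byte_span and escape_json are hypothetical; byte_span is only a
stand-in for GCC's char_span. One assumption worth flagging: the sketch writes
control characters, including NUL, as \uXXXX, because RFC 8259 defines no \0
escape, whereas the patch above emits \0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for GCC's char_span: a pointer plus an explicit
// length, with no promise of NUL termination.
struct byte_span
{
  const char *ptr;
  std::size_t len;
};

// Return SPAN as a quoted JSON string literal.  Iteration is bounded by
// the length, never by a terminator, so embedded NUL bytes are preserved.
std::string
escape_json (byte_span span)
{
  assert (span.ptr != nullptr || span.len == 0);
  std::string out = "\"";
  for (std::size_t i = 0; i != span.len; ++i)
    {
      unsigned char ch = static_cast<unsigned char> (span.ptr[i]);
      switch (ch)
	{
	case '"':  out += "\\\""; break;
	case '\\': out += "\\\\"; break;
	case '\n': out += "\\n";  break;
	case '\t': out += "\\t";  break;
	default:
	  if (ch < 0x20)
	    {
	      // Remaining control characters, NUL included, use the
	      // generic \uXXXX form.
	      char buf[8];
	      std::snprintf (buf, sizeof buf, "\\u%04x",
			     static_cast<unsigned int> (ch));
	      out += buf;
	    }
	  else
	    out += static_cast<char> (ch);
	}
    }
  out += '"';
  return out;
}
```

With the same test data as the new selftest, escape_json ({data, 3}) yields
"abc" in quotes, and passing the full seven bytes keeps the bytes after the
NUL rather than truncating at it.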

>  
>   private:
>    char *m_utf8;
> +  size_t m_len;
>  };
>  

Thanks for adding the unit test.

The 5b patch is OK for trunk.

Dave


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context
  2022-11-04 21:05     ` Lewis Hyatt
  2022-11-05  1:54       ` [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string David Malcolm
@ 2022-11-05  1:55       ` David Malcolm
  1 sibling, 0 replies; 18+ messages in thread
From: David Malcolm @ 2022-11-05  1:55 UTC (permalink / raw)
  To: Lewis Hyatt; +Cc: gcc-patches

On Fri, 2022-11-04 at 17:05 -0400, Lewis Hyatt wrote:
> [PATCH 5a/6] diagnostics: Handle generated data locations in
> edit_context
> 
> Class edit_context handles outputting fixit hints in diff form that
> could be
> manually or automatically applied by the user. This will not make
> sense for
> generated data locations, such as the contents of a _Pragma string,
> because
> the text to be modified does not appear in the user's input files. We
> do not
> currently ever generate fixit hints in such a context, but for
> future-proofing
> purposes, ignore such locations in edit context now.
> 
> gcc/ChangeLog:
> 
> 	* edit-context.cc (edit_context::apply_fixit): Ignore
> locations in
> 	generated data.
> 
> diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
> index 6879ddd41b4..aa95bc0834f 100644
> --- a/gcc/edit-context.cc
> +++ b/gcc/edit-context.cc
> @@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint
> *hint)
>      return false;
>    if (start.column == 0)
>      return false;
> +  if (start.generated_data)
> +    return false;
>    if (next_loc.column == 0)
>      return false;
> +  if (next_loc.generated_data)
> +    return false;
>  
>    edited_file &file = get_or_insert_file (start.file);
>    if (!m_valid)

This patch is OK for trunk once the prerequisite patch is also
approved.

Thanks
Dave


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
  2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
@ 2022-11-05 16:23   ` David Malcolm
  2022-11-05 17:28     ` Lewis Hyatt
  2022-11-17 21:21     ` Lewis Hyatt
  0 siblings, 2 replies; 18+ messages in thread
From: David Malcolm @ 2022-11-05 16:23 UTC (permalink / raw)
  To: Lewis Hyatt, gcc-patches

On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> Add a new linemap reason LC_GEN which enables encoding the location
> of data
> that was generated during compilation and does not appear in any
> source file.
> There could be many use cases, such as, for instance, referring to
> the content
> of builtin macros (not yet implemented, but an easy lift after this
> one.) The
> first intended application is to create a place to store the input to
> a
> _Pragma directive, so that proper locations can be assigned to those
> tokens. This will be done in a subsequent commit.
> 
> The actual change needed to the line-maps API in libcpp is very
> minimal and
> requires no space overhead in the line map data structures (on 64-bit
> systems
> that is; one newly added data member to class line_map_ordinary sits
> inside
> former padding bytes.) An LC_GEN map is just an ordinary map like any
> other,
> but the TO_FILE member that normally points to the file name points
> instead to
> the actual data.  This works automatically with PCH as well, for the
> same
> reason that the file name makes its way into a PCH.
> 
> Outside libcpp, there are many small changes but most of them are to
> selftests, which are necessarily more sensitive to implementation
> details. From the perspective of the user (the "user", here, being a
> frontend
> using line maps or else the diagnostics infrastructure), the chief
> visible
> change is that the function location_get_source_line() should be
> passed an
> expanded_location object instead of a separate filename and line
> number.  This
> is not a big change because in most cases, this information came
> anyway from a
> call to expand_location and the needed expanded_location object is
> readily
> available. The new overload of location_get_source_line() uses the
> extra
> information in the expanded_location object to obtain the data from
> the
> in-memory buffer when it originated from an LC_GEN map.
> 
> Until the subsequent patch that starts using LC_GEN maps, none are
> yet
> generated within GCC, hence nothing is added to the testsuite here;
> but all
> relevant selftests have been extended to cover generated data maps in
> addition to normal files.

Thanks for this patch.


[...snip...]
> 
> diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> index 5890c18bdc3..2935d7fb236 100644
> --- a/gcc/c-family/c-common.cc
> +++ b/gcc/c-family/c-common.cc
> @@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
>        const line_map_ordinary *ord_map
>         = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
>  
> +      if (ord_map->reason == LC_GEN)
> +       continue;
> +
>        if (const line_map_ordinary *from
>           = linemap_included_from_linemap (line_table, ord_map))
>         /* We cannot use pointer equality, because with preprocessed
>            input all filename strings are unique.  */
> -       if (0 == strcmp (from->to_file, file))
> +       if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
>           {
>             last_include_ord_map = from;
>             last_ord_map_after_include = NULL;

[...snip...]

I'm not a fan of having the "to_file" field change meaning based on
whether reason is LC_GEN.

How involved would it be to split line_map_ordinary into two
subclasses, so that we'd have this hierarchy (with indentation showing
inheritance):

line_map
  line_map_ordinary
    line_map_ordinary_file
    line_map_ordinary_generated
  line_map_macro

Alternatively, how about renaming "to_file" to be "data" (or "m_data"),
to emphasize that it might not be a filename, and that we have to check
everywhere we access that field.

Please can all those checks for LC_GEN go into an inline function so we
can write e.g.
  map->generated_p ()
or somesuch.
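
To make the request concrete, here is a minimal sketch of such a predicate and
a guarded accessor.  Everything in it is hypothetical: ordinary_map_sketch is
a cut-down stand-in for line_map_ordinary, the SK_* enumerators stand in for
lc_reason, and the field is already renamed to "data" along the lines of the
alternative above.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for lc_reason; only the generated case matters here.
enum reason_sketch { SK_ENTER, SK_LEAVE, SK_RENAME, SK_GEN };

// Cut-down stand-in for line_map_ordinary.
struct ordinary_map_sketch
{
  reason_sketch reason;
  const char *data;	 // File name, or the generated bytes.
  unsigned int data_len; // Nonzero only for generated data.

  // The single place that knows how generated maps are identified.
  bool generated_p () const { return reason == SK_GEN; }

  // Accessor that refuses to hand generated bytes out as a file name.
  const char *file_name () const
  {
    assert (!generated_p ());
    return data;
  }
};

// Shape of the check in try_to_locate_new_include_insertion_point,
// rewritten on top of the predicate.
bool
included_from_file_p (const ordinary_map_sketch &from, const char *file)
{
  return !from.generated_p () && 0 == std::strcmp (from.file_name (), file);
}
```

The point of the accessor is that a caller who forgets the check trips an
assertion instead of passing a buffer that may lack a terminator to strcmp.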

If I'm reading things right, patch 6 adds the sole usage of this in
destringize_and_run.  Would we ever want to discriminate between
different kinds of generated buffers?

[...snip...]

> @@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
>                  N_("of module"),
>                  N_("In module imported at"),   /* 6 */
>                  N_("imported at"),
> +                N_("In buffer generated from"),   /* 8 */
>                 };

We use the wording "destringized" in:

so maybe this should be "In buffer destringized from" ???  (I'm not
sure) 

[...snip...]

> diff --git a/gcc/input.cc b/gcc/input.cc
> index 483cb6e940d..3cf5480551d 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc

[...snip...]

> @@ -58,7 +64,7 @@ public:
>    ~file_cache_slot ();

My initial thought reading the input.cc part of this patch was that I
want it to be very clear when a file_cache_slot is for a real file vs
when we're replaying generated data.  I'd hoped that this could have
been expressed via inheritance, but we preallocate all the cache slots
once in an array in file_cache's ctor and the slots get reused over
time.  So instead of that, can we please have some kind of:

   bool file_slot_p () const;
   bool generated_slot_p () const;

or somesuch, so that we can have clear assertions and conditionals
about the current state of a slot (I think the discriminating condition
is that generated_data_len > 0, right?)
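
A minimal sketch of what I have in mind, assuming the discriminating condition
really is a nonzero length.  slot_sketch and its members are hypothetical and
leave out everything else a file_cache_slot holds; the slot is reusable, so it
can switch between the two states over time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, heavily simplified cache slot.  One pointer does double
// duty, so every read goes through an accessor that asserts the state.
class slot_sketch
{
public:
  slot_sketch () : m_path_or_data (nullptr), m_generated_len (0) {}

  // Reuse the slot for a real file.
  void set_file (const char *path)
  {
    assert (path);
    m_path_or_data = path;
    m_generated_len = 0;
  }

  // Reuse the slot for a generated buffer of LEN bytes.
  void set_generated (const char *data, std::size_t len)
  {
    assert (data && len > 0);
    m_path_or_data = data;
    m_generated_len = len;
  }

  bool empty_p () const { return m_path_or_data == nullptr; }
  bool generated_slot_p () const { return m_generated_len > 0; }
  bool file_slot_p () const { return !empty_p () && !generated_slot_p (); }

  const char *get_file_path () const
  {
    assert (file_slot_p ());
    return m_path_or_data;
  }

  const char *get_generated_data () const
  {
    assert (generated_slot_p ());
    return m_path_or_data;
  }

  std::size_t get_generated_length () const
  {
    assert (generated_slot_p ());
    return m_generated_len;
  }

private:
  const char *m_path_or_data;
  std::size_t m_generated_len;
};
```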

If I'm reading things right, it looks like file_cache_slot::m_file_path
does double duty after this patch, and is either a filename, or a
pointer to the generated data.  If so, please can the patch rename it,
and have all usage guarded appropriately.  Can it be a union? (or does
the ctor prevent that?)

[...snip...]
 
> @@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
>     num_file_slots files are cached.  */
>  
>  file_cache_slot*
> -file_cache::add_file (const char *file_path)
> +file_cache::add_file (const char *file_path, unsigned int generated_data_len)

Can we split this into two functions: one for files, and one for
generated data?  (add_file vs add_generated_data?)

>  {
>  
> -  FILE *fp = fopen (file_path, "r");
> -  if (fp == NULL)
> -    return NULL;
> +  FILE *fp;
> +  if (generated_data_len)
> +    fp = NULL;
> +  else
> +    {
> +      fp = fopen (file_path, "r");
> +      if (fp == NULL)
> +       return NULL;
> +    }
>  
>    unsigned highest_use_count = 0;
>    file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
> -  if (!r->create (in_context, file_path, fp, highest_use_count))
> +  if (!r->create (in_context, file_path, fp, highest_use_count,
> +                 generated_data_len))
>      return NULL;
>    return r;
>  }

[...snip...]

> @@ -535,11 +571,12 @@ file_cache::~file_cache ()
>     it.  */
>  
>  file_cache_slot*
> -file_cache::lookup_or_add_file (const char *file_path)
> +file_cache::lookup_or_add_file (const char *file_path,
> +                               unsigned int generated_data_len)

Likewise, could this be split into:
  lookup_or_add_file
and
  lookup_or_add_generated
or somesuch?

>  {
>    file_cache_slot *r = lookup_file (file_path);

The patch doesn't seem to touch file_cache::lookup_file.  Is the
current implementation of that ideal?  It looks like we're going to be
doing strcmp of generated buffers, when presumably for those we could
simply be doing pointer comparisons.

Maybe rename it to lookup_slot?
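
For what it's worth, a sketch of the split, with files matched by strcmp and
generated buffers matched by pointer identity.  cache_sketch and entry_sketch
are hypothetical and ignore eviction and use counts entirely; the sketch also
assumes a generated buffer stays at a stable address for as long as it is
cached.

```cpp
#include <cassert>
#include <cstring>
#include <deque>

// Hypothetical cache entry: a file path, or the start of a generated buffer.
struct entry_sketch
{
  const char *key;
  unsigned int gen_len; // 0 for a file, else the buffer length.
};

class cache_sketch
{
public:
  // Files: compare names by content.
  const entry_sketch *lookup_file (const char *path) const
  {
    assert (path);
    for (const entry_sketch &e : m_entries)
      if (e.gen_len == 0 && 0 == std::strcmp (e.key, path))
	return &e;
    return nullptr;
  }

  // Generated data: compare by address only.  The bytes may contain NULs
  // or lack a terminator, so strcmp would be wrong as well as slow.
  const entry_sketch *lookup_generated (const char *data) const
  {
    for (const entry_sketch &e : m_entries)
      if (e.gen_len > 0 && e.key == data)
	return &e;
    return nullptr;
  }

  const entry_sketch *lookup_or_add_file (const char *path)
  {
    if (const entry_sketch *e = lookup_file (path))
      return e;
    m_entries.push_back ({path, 0});
    return &m_entries.back ();
  }

  const entry_sketch *lookup_or_add_generated (const char *data,
					       unsigned int len)
  {
    assert (data && len > 0);
    if (const entry_sketch *e = lookup_generated (data))
      return e;
    m_entries.push_back ({data, len});
    return &m_entries.back ();
  }

private:
  // A deque keeps element addresses stable across push_back.
  std::deque<entry_sketch> m_entries;
};
```

Two buffers with identical contents at different addresses get separate
entries here, which seems right given that each is a distinct location.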

>    if (r == NULL)
> -    r = add_file (file_path);
> +    r = add_file (file_path, generated_data_len);
>    return r;
>  }
>  
> @@ -547,7 +584,8 @@ file_cache::lookup_or_add_file (const char *file_path)
>     diagnostic.  */
>  
>  file_cache_slot::file_cache_slot ()
> -: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
> +: m_use_count (0), m_file_path (NULL), m_fp (NULL),
> +  m_data (0), m_data_active (0),
>    m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
>    m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
>  {

[...snip...]



> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 50207cacc12..eb281809cbd 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -75,6 +75,8 @@ enum lc_reason
>    LC_RENAME_VERBATIM,  /* Likewise, but "" != stdin.  */
>    LC_ENTER_MACRO,      /* Begin macro expansion.  */
>    LC_MODULE,           /* A (C++) Module.  */
> +  LC_GEN,              /* Internally generated source.  */
> +
>    /* FIXME: add support for stringize and paste.  */
>    LC_HWM /* High Water Mark.  */
>  };
> @@ -437,7 +439,11 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
>  
>    /* Pointer alignment boundary on both 32 and 64-bit systems.  */
>  
> -  const char *to_file;
> +  /* This GTY markup is in case this is an LC_GEN map, in which case
> +     to_file actually points to the generated data, which we do not
> +     want to require to be free of null bytes.  */
> +  const char * GTY((string_length ("%h.to_file_len"))) to_file;
> +  unsigned int to_file_len;
>    linenum_type to_line;

What's the intended interaction between this, the garbage-collector,
and PCH?  Is to_file to be allocated in the GC-managed heap, or can it
be outside of it?  Looking at patch 6 I see that this seems to be
allocated (in destringize_and_run) by _cpp_unaligned_alloc.  I don't
remember off the top of my head if that's valid.

>  
>    /* Location from whence this line map was included.  For regular
> @@ -1101,13 +1107,15 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
>     at least as long as the lifetime of SET.  An empty
>     TO_FILE means standard input.  If reason is LC_LEAVE, and
>     TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
> -   natural values considering the file we are returning to.
> +   natural values considering the file we are returning to.  If reason
> +   is LC_GEN, then TO_FILE is not a file name, but rather the actual
> +   content, and TO_FILE_LEN>0 is the length of it.
>  
>     A call to this function can relocate the previous set of
>     maps, so any stored line_map pointers should not be used.  */
>  extern const line_map *linemap_add
>    (class line_maps *, enum lc_reason, unsigned int sysp,
> -   const char *to_file, linenum_type to_line);
> +   const char *to_file, linenum_type to_line, unsigned int to_file_len = 0);
>  
>  /* Create a macro map.  A macro map encodes source locations of tokens
>     that are part of a macro replacement-list, at a macro expansion
> @@ -1304,7 +1312,8 @@ linemap_location_before_p (class line_maps *set,
>  
>  typedef struct
>  {
> -  /* The name of the source file involved.  */
> +  /* The name of the source file involved, or NULL if
> +     generated_data is non-NULL.  */
>    const char *file;
>  
>    /* The line-location in the source file.  */
> @@ -1316,6 +1325,10 @@ typedef struct
>  
>    /* In a system header?. */
>    bool sysp;
> +
> +  /* If generated data, the data and its length.  */
> +  unsigned int generated_data_len;
> +  const char *generated_data;
>  } expanded_location;

Probably worth noting that generated_data can contain NUL bytes, and
isn't necessarily NUL-terminated.


Thanks again for the patch; hope this is constructive
Dave



* Re: [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
  2022-11-05 16:23   ` David Malcolm
@ 2022-11-05 17:28     ` Lewis Hyatt
  2022-11-17 21:21     ` Lewis Hyatt
  1 sibling, 0 replies; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-05 17:28 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

Thanks for the comments! I have some replies below.

On Sat, Nov 5, 2022 at 12:23 PM David Malcolm <dmalcolm@redhat.com> wrote:
> [...snip...]
> >
> > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > index 5890c18bdc3..2935d7fb236 100644
> > --- a/gcc/c-family/c-common.cc
> > +++ b/gcc/c-family/c-common.cc
> > @@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
> >        const line_map_ordinary *ord_map
> >         = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
> >
> > +      if (ord_map->reason == LC_GEN)
> > +       continue;
> > +
> >        if (const line_map_ordinary *from
> >           = linemap_included_from_linemap (line_table, ord_map))
> >         /* We cannot use pointer equality, because with preprocessed
> >            input all filename strings are unique.  */
> > -       if (0 == strcmp (from->to_file, file))
> > +       if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
> >           {
> >             last_include_ord_map = from;
> >             last_ord_map_after_include = NULL;
>
> [...snip...]
>
> I'm not a fan of having the "to_file" field change meaning based on
> whether reason is LC_GEN.
>
> How involved would it be to split line_map_ordinary into two
> subclasses, so that we'd have this hierarchy (with indentation showing
> inheritance):
>
> line_map
>   line_map_ordinary
>     line_map_ordinary_file
>     line_map_ordinary_generated
>   line_map_macro
>
> Alternatively, how about renaming "to_file" to be "data" (or "m_data"),
> to emphasize that it might not be a filename, and that we have to check
> everywhere we access that field.
>

Yeah, there were definitely a lot of ways to go about this. I settled
on the approach of minimizing the changes to libcpp for a couple
reasons. One is that I didn't want to add any extra overhead to
handling of non-_Pragma lexing, which is of course most of the time. I
think it's nice that lex.cc was not touched at all for this change,
for example. The reason I re-used the to_file field was that this
class seems to be very concerned about minimizing space overhead (cf.
all the comments about pointer alignment boundaries, etc.). I feel like
the reason for that attention was that the addition of macro location
tracking added a lot of overhead when it was implemented and the
authors wanted to minimize that. Nowadays, perhaps the RAM usage is
not as much of a concern. We do create a lot of line_map instances,
though. The other reason is that the line-maps API is already pretty
error-prone to use. A given location_t could be an ordinary location,
or a virtual location, or an ad-hoc location. Going through the
_Pragma location-related bugs that have been fixed over the years, it
seems like most of them stemmed from failing to check one or the other
of these cases when needed. So I was worried that adding yet another
type of location would make things worse.

But I see your point certainly. I feel like adding a new subclass will
require touching many more call sites, so not sure how it will look. I
guess I would be concerned about adding too many new conditional
branches. There are already very many, since almost every use of
line-maps API has to check for ad-hoc location first, etc. At some
point, if there are too many branches, it makes more sense to use
virtual functions instead, which would also perform better. I guess the
fundamental issue is that it's really a C-like API that has had C++
features added on to it over time; redesigning the API from
scratch would probably yield something cleaner. Given I wasn't proposing that
for now, I thought making the minimal possible change here would be
the way to go.

What do you think about making to_file a union and adjusting the
handful of places that would care? That could be a good improvement
that's in the right direction.
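
To make the union idea concrete, here is a minimal sketch. It is not taken
from the patch: sketch_kind, sketch_union_map and the two helper functions
are hypothetical stand-ins for lc_reason and line_map_ordinary, and the
real class has more members. The point is only that two pointer members
sharing a union add no space, while giving each meaning its own name.

```cpp
#include <cassert>

/* Hypothetical stand-in for the relevant lc_reason values.  */
enum sketch_kind { SKETCH_KIND_FILE, SKETCH_KIND_GEN };

/* Hypothetical, heavily simplified stand-in for line_map_ordinary.  */
struct sketch_union_map
{
  sketch_kind kind;
  unsigned int data_len;          /* Only meaningful for SKETCH_KIND_GEN.  */
  union
  {
    const char *file_name;        /* Active when kind == SKETCH_KIND_FILE.  */
    const char *generated_data;   /* Active when kind == SKETCH_KIND_GEN.  */
  } u;
};

/* Both union members are pointers, so the union is one pointer wide.  */
static_assert (sizeof (sketch_union_map::u) == sizeof (const char *),
	       "union adds no space overhead");

/* Build a map that names a file.  */
inline sketch_union_map
sketch_make_file_map (const char *name)
{
  sketch_union_map m;
  m.kind = SKETCH_KIND_FILE;
  m.data_len = 0;
  m.u.file_name = name;
  return m;
}

/* Build a map that refers to LEN bytes of generated data.  */
inline sketch_union_map
sketch_make_generated_map (const char *data, unsigned int len)
{
  sketch_union_map m;
  m.kind = SKETCH_KIND_GEN;
  m.data_len = len;
  m.u.generated_data = data;
  return m;
}
```

Call sites would still have to check the kind before picking a member, so
this only helps readability; it does not remove any branches.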

> Please can all those checks for LC_GEN go into an inline function so we
> can write e.g.
>   map->generated_p ()
> or somesuch.
>

Sure. I guess for consistency it has to look something like
LINEMAP_ORDINARY_GENERATED_P (map).

> If I'm reading things right, patch 6 adds the sole usage of this in
> destringize_and_run.  Would we ever want to discriminate between
> different kinds of generated buffers?
>

One other possible use case I had in mind was for builtin macros, e.g.
right now for something like

const char* line = __LINE__;

the diagnostic points just to the __LINE__ token. With an LC_GEN map
it could show the user that __LINE__ has expanded to an integer rather
than a string. Something like that. But anyway that was just an aside,
the way I was envisioning it, just one type of LC_GEN map is needed,
although I can see it might be nice to know further what it was made
for.

I could imagine eventually the static analyzer finding a use of it
also. For instance, you had a recent patch that asks libcpp to lex a
buffer containing a macro token, to get the expanded value. If a
diagnostic could be generated during that process for some reason,
then an LC_GEN map could be used to get a reasonable location for it.

> [...snip...]
>
> > @@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
> >                  N_("of module"),
> >                  N_("In module imported at"),   /* 6 */
> >                  N_("imported at"),
> > +                N_("In buffer generated from"),   /* 8 */
> >                 };
>
> We use the wording "destringized" in:
>
> so maybe this should be "In buffer destringized from" ???  (I'm not
> sure)
>

I was definitely not sure what the best wording would be here
either. BTW clang has a similar concept; I think they call it "scratch
space". I didn't think to use "destringized" here, because it could be a
different type of generated data. We could also check specifically
what the buffer was made for and output a custom string like you
suggest; I guess that's what your previous question was getting at
too.

> [...snip...]
>
> > diff --git a/gcc/input.cc b/gcc/input.cc
> > index 483cb6e940d..3cf5480551d 100644
> > --- a/gcc/input.cc
> > +++ b/gcc/input.cc
>
> [...snip...]
>
> > @@ -58,7 +64,7 @@ public:
> >    ~file_cache_slot ();
>
> My initial thought reading the input.cc part of this patch was that I
> want it to be very clear when a file_cache_slot is for a real file vs
> when we're replaying generated data.  I'd hoped that this could have
> been expressed via inheritance, but we preallocate all the cache slots
> once in an array in file_cache's ctor and the slots get reused over
> time.  So instead of that, can we please have some kind of:
>
>    bool file_slot_p () const;
>    bool generated_slot_p () const;
>
> or somesuch, so that we can have clear assertions and conditionals
> about the current state of a slot (I think the discriminating condition
> is that generated_data_len > 0, right?)
>
> If I'm reading things right, it looks like file_cache_slot::m_file_path
> does double duty after this patch, and is either a filename, or a
> pointer to the generated data.  If so, please can the patch rename it,
> and have all usage guarded appropriately.  Can it be a union? (or does
> the ctor prevent that?)
>

Yes, I did reuse it this way, I will make it better.

> [...snip...]
>
> > @@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
> >     num_file_slots files are cached.  */
> >
> >  file_cache_slot*
> > -file_cache::add_file (const char *file_path)
> > +file_cache::add_file (const char *file_path, unsigned int generated_data_len)
>
> Can we split this into two functions: one for files, and one for
> generated data?  (add_file vs add_generated_data?)
>

> >  {
> >
> > -  FILE *fp = fopen (file_path, "r");
> > -  if (fp == NULL)
> > -    return NULL;
> > +  FILE *fp;
> > +  if (generated_data_len)
> > +    fp = NULL;
> > +  else
> > +    {
> > +      fp = fopen (file_path, "r");
> > +      if (fp == NULL)
> > +       return NULL;
> > +    }
> >
> >    unsigned highest_use_count = 0;
> >    file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
> > -  if (!r->create (in_context, file_path, fp, highest_use_count))
> > +  if (!r->create (in_context, file_path, fp, highest_use_count,
> > +                 generated_data_len))
> >      return NULL;
> >    return r;
> >  }
>
> [...snip...]
>
> > @@ -535,11 +571,12 @@ file_cache::~file_cache ()
> >     it.  */
> >
> >  file_cache_slot*
> > -file_cache::lookup_or_add_file (const char *file_path)
> > +file_cache::lookup_or_add_file (const char *file_path,
> > +                               unsigned int generated_data_len)
>
> Likewise, could this be split into:
>   lookup_or_add_file
> and
>   lookup_or_add_generated
> or somesuch?
>
> >  {
> >    file_cache_slot *r = lookup_file (file_path);
>
> The patch doesn't seem to touch file_cache::lookup_file.  Is the
> current implementation of that ideal (it looks like we're going to be
> doing strcmp of generated buffers, when presumably for those we could
> simply be doing pointer comparisons).
>
> Maybe rename it to lookup_slot?
>

Yeah, so I guess there is also the potential for problems here, if a
generated data buffer were to have the same content as a filename.
Unlike the case of the line-maps API, there is not really as much
concern for optimizing the overhead here, since it only comes into
play when a diagnostic will be issued. So perhaps, I should take a
different approach along the lines of your initial suggestion, and
separate generated data into its own class entirely... I think that
will address most or all of your concerns here.
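
The collision hazard can be illustrated with a small sketch. Nothing here
is from the patch: sketch_slot_key and sketch_same_source_p are
hypothetical names, and the real file_cache_slot tracks much more state.
The idea is simply that a file is matched by name, a generated buffer is
matched by pointer identity and length, and the two kinds never match each
other even when the bytes happen to be equal.

```cpp
#include <cassert>
#include <cstring>

/* Hypothetical cache key; not the real file_cache_slot.  */
struct sketch_slot_key
{
  bool generated;      /* True for an in-memory buffer.  */
  const char *ptr;     /* File name, or start of the buffer.  */
  unsigned int len;    /* Buffer length; unused for files.  */
};

/* Return true if A and B refer to the same source.  Files compare by
   name; generated buffers compare by identity, so no strcmp is done on
   data that may contain NUL bytes.  */
inline bool
sketch_same_source_p (const sketch_slot_key &a, const sketch_slot_key &b)
{
  if (a.generated != b.generated)
    return false;
  if (a.generated)
    return a.ptr == b.ptr && a.len == b.len;
  return std::strcmp (a.ptr, b.ptr) == 0;
}
```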

> >    if (r == NULL)
> > -    r = add_file (file_path);
> > +    r = add_file (file_path, generated_data_len);
> >    return r;
> >  }
> >
> > @@ -547,7 +584,8 @@ file_cache::lookup_or_add_file (const char *file_path)
> >     diagnostic.  */
> >
> >  file_cache_slot::file_cache_slot ()
> > -: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
> > +: m_use_count (0), m_file_path (NULL), m_fp (NULL),
> > +  m_data (0), m_data_active (0),
> >    m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
> >    m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
> >  {
>
> [...snip...]
>
>
>
> > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> > index 50207cacc12..eb281809cbd 100644
> > --- a/libcpp/include/line-map.h
> > +++ b/libcpp/include/line-map.h
> > @@ -75,6 +75,8 @@ enum lc_reason
> >    LC_RENAME_VERBATIM,  /* Likewise, but "" != stdin.  */
> >    LC_ENTER_MACRO,      /* Begin macro expansion.  */
> >    LC_MODULE,           /* A (C++) Module.  */
> > +  LC_GEN,              /* Internally generated source.  */
> > +
> >    /* FIXME: add support for stringize and paste.  */
> >    LC_HWM /* High Water Mark.  */
> >  };
> > @@ -437,7 +439,11 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
> >
> >    /* Pointer alignment boundary on both 32 and 64-bit systems.  */
> >
> > -  const char *to_file;
> > +  /* This GTY markup is in case this is an LC_GEN map, in which case
> > +     to_file actually points to the generated data, which we do not
> > +     want to require to be free of null bytes.  */
> > +  const char * GTY((string_length ("%h.to_file_len"))) to_file;
> > +  unsigned int to_file_len;
> >    linenum_type to_line;
>
> What's the intended interaction between this, the garbage-collector,
> and PCH?  Is to_file to be allocated in the GC-managed heap, or can it
> be outside of it?  Looking at patch 6 I see that this seems to be
> allocated (in destringize_and_run) by _cpp_unaligned_alloc.  I don't
> remember off the top of my head if that's valid.
>

As a small preparation for this patch, I recently did r13-3380, that
added support for GTY((string_length)). There was some discussion
here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603885.html .
The short answer is, char* is a special case for GGC, unlike other
pointer types. It will silently ignore anything that wasn't actually
GGC-allocated during the marking phase. This makes it convenient to
use char* pointers in classes like this, where the GTY markup is only
there for PCH purposes (since the line map instance otherwise lives
for the whole process, other than some isolated selftests). The PCH
mechanism still arranges to get them into the PCH even when they were
not allocated by GGC. (The existing to_file that's a filename, for
instance, is allocated by libcpp and not by GGC already.) So the only
constraint on to_file is that it should outlive the usage of the
line-map.

> >
> >    /* Location from whence this line map was included.  For regular
> > @@ -1101,13 +1107,15 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
> >     at least as long as the lifetime of SET.  An empty
> >     TO_FILE means standard input.  If reason is LC_LEAVE, and
> >     TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
> > -   natural values considering the file we are returning to.
> > +   natural values considering the file we are returning to.  If reason
> > +   is LC_GEN, then TO_FILE is not a file name, but rather the actual
> > +   content, and TO_FILE_LEN>0 is the length of it.
> >
> >     A call to this function can relocate the previous set of
> >     maps, so any stored line_map pointers should not be used.  */
> >  extern const line_map *linemap_add
> >    (class line_maps *, enum lc_reason, unsigned int sysp,
> > -   const char *to_file, linenum_type to_line);
> > +   const char *to_file, linenum_type to_line, unsigned int to_file_len = 0);
> >
> >  /* Create a macro map.  A macro map encodes source locations of tokens
> >     that are part of a macro replacement-list, at a macro expansion
> > @@ -1304,7 +1312,8 @@ linemap_location_before_p (class line_maps *set,
> >
> >  typedef struct
> >  {
> > -  /* The name of the source file involved.  */
> > +  /* The name of the source file involved, or NULL if
> > +     generated_data is non-NULL.  */
> >    const char *file;
> >
> >    /* The line-location in the source file.  */
> > @@ -1316,6 +1325,10 @@ typedef struct
> >
> >    /* In a system header?. */
> >    bool sysp;
> > +
> > +  /* If generated data, the data and its length.  */
> > +  unsigned int generated_data_len;
> > +  const char *generated_data;
> >  } expanded_location;
>
> Probably worth noting that generated_data can contain NUL bytes, and
> isn't necessarily NUL-terminated.
>

Sure.
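
For illustration, this is the kind of length-based handling that such a
note would imply for consumers. The helper below is hypothetical
(sketch_count_lines is not a function in GCC); it only shows that code
walking generated data has to use the stored length and memchr rather
than strlen or strchr.

```cpp
#include <cassert>
#include <cstring>

/* Hypothetical helper: return the number of lines in DATA[0..LEN),
   counting a final line that lacks a trailing newline.  DATA may
   contain NUL bytes and need not be NUL-terminated.  */
inline unsigned int
sketch_count_lines (const char *data, unsigned int len)
{
  unsigned int lines = 0;
  const char *p = data;
  const char *end = data + len;
  while (p < end)
    {
      const void *found
	= std::memchr (p, '\n', static_cast<std::size_t> (end - p));
      ++lines;
      if (!found)
	break;
      p = static_cast<const char *> (found) + 1;
    }
  return lines;
}
```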

>
> Thanks again for the patch; hope this is constructive

Of course, thanks for going through it. I will work on improving the
way input.cc handles this and send a new one. For the line-maps part,
I'll try something like the union for to_file, unless you would prefer
one of the more comprehensive routes there.

-Lewis


* Re: [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
  2022-11-05 16:23   ` David Malcolm
  2022-11-05 17:28     ` Lewis Hyatt
@ 2022-11-17 21:21     ` Lewis Hyatt
  2023-01-05 22:34       ` Lewis Hyatt
  1 sibling, 1 reply; 18+ messages in thread
From: Lewis Hyatt @ 2022-11-17 21:21 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 10718 bytes --]

On Sat, Nov 05, 2022 at 12:23:28PM -0400, David Malcolm wrote:
> On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> [...snip...]
> > 
> > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > index 5890c18bdc3..2935d7fb236 100644
> > --- a/gcc/c-family/c-common.cc
> > +++ b/gcc/c-family/c-common.cc
> > @@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
> >        const line_map_ordinary *ord_map
> >         = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
> >  
> > +      if (ord_map->reason == LC_GEN)
> > +       continue;
> > +
> >        if (const line_map_ordinary *from
> >           = linemap_included_from_linemap (line_table, ord_map))
> >         /* We cannot use pointer equality, because with preprocessed
> >            input all filename strings are unique.  */
> > -       if (0 == strcmp (from->to_file, file))
> > +       if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
> >           {
> >             last_include_ord_map = from;
> >             last_ord_map_after_include = NULL;
> 
> [...snip...]
> 
> I'm not a fan of having the "to_file" field change meaning based on
> whether reason is LC_GEN.
> 
> How involved would it be to split line_map_ordinary into two
> subclasses, so that we'd have this hierarchy (with indentation showing
> inheritance):
> 
> line_map
>   line_map_ordinary
>     line_map_ordinary_file
>     line_map_ordinary_generated
>   line_map_macro
> 
> Alternatively, how about renaming "to_file" to be "data" (or "m_data"),
> to emphasize that it might not be a filename, and that we have to check
> everywhere we access that field.
> 
> Please can all those checks for LC_GEN go into an inline function so we
> can write e.g.
>   map->generated_p ()
> or somesuch.
> 
> If I'm reading things right, patch 6 adds the sole usage of this in
> destringize_and_run.  Would we ever want to discriminate between
> different kinds of generated buffers?
> 
> [...snip...]
> 
> > @@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
> >                  N_("of module"),
> >                  N_("In module imported at"),   /* 6 */
> >                  N_("imported at"),
> > +                N_("In buffer generated from"),   /* 8 */
> >                 };
> 
> We use the wording "destringized" in:
> 
> so maybe this should be "In buffer destringized from" ???  (I'm not
> sure) 
> 
> [...snip...]
> 
> > diff --git a/gcc/input.cc b/gcc/input.cc
> > index 483cb6e940d..3cf5480551d 100644
> > --- a/gcc/input.cc
> > +++ b/gcc/input.cc
> 
> [...snip...]
> 
> > @@ -58,7 +64,7 @@ public:
> >    ~file_cache_slot ();
> 
> My initial thought reading the input.cc part of this patch was that I
> want it to be very clear when a file_cache_slot is for a real file vs
> when we're replaying generated data.  I'd hoped that this could have
> been expressed via inheritance, but we preallocate all the cache slots
> once in an array in file_cache's ctor and the slots get reused over
> time.  So instead of that, can we please have some kind of:
> 
>    bool file_slot_p () const;
>    bool generated_slot_p () const;
> 
> or somesuch, so that we can have clear assertions and conditionals
> about the current state of a slot (I think the discriminating condition
> is that generated_data_len > 0, right?)
> 
> If I'm reading things right, it looks like file_cache_slot::m_file_path
> does double duty after this patch, and is either a filename, or a
> pointer to the generated data.  If so, please can the patch rename it,
> and have all usage guarded appropriately.  Can it be a union? (or does
> the ctor prevent that?)
> 
> [...snip...]
>  
> > @@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
> >     num_file_slots files are cached.  */
> >  
> >  file_cache_slot*
> > -file_cache::add_file (const char *file_path)
> > +file_cache::add_file (const char *file_path, unsigned int generated_data_len)
> 
> Can we split this into two functions: one for files, and one for
> generated data?  (add_file vs add_generated_data?)
> 
> >  {
> >  
> > -  FILE *fp = fopen (file_path, "r");
> > -  if (fp == NULL)
> > -    return NULL;
> > +  FILE *fp;
> > +  if (generated_data_len)
> > +    fp = NULL;
> > +  else
> > +    {
> > +      fp = fopen (file_path, "r");
> > +      if (fp == NULL)
> > +       return NULL;
> > +    }
> >  
> >    unsigned highest_use_count = 0;
> >    file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
> > -  if (!r->create (in_context, file_path, fp, highest_use_count))
> > +  if (!r->create (in_context, file_path, fp, highest_use_count,
> > +                 generated_data_len))
> >      return NULL;
> >    return r;
> >  }
> 
> [...snip...]
> 
> > @@ -535,11 +571,12 @@ file_cache::~file_cache ()
> >     it.  */
> >  
> >  file_cache_slot*
> > -file_cache::lookup_or_add_file (const char *file_path)
> > +file_cache::lookup_or_add_file (const char *file_path,
> > +                               unsigned int generated_data_len)
> 
> Likewise, could this be split into:
>   lookup_or_add_file
> and
>   lookup_or_add_generated
> or somesuch?
> 
> >  {
> >    file_cache_slot *r = lookup_file (file_path);
> 
> The patch doesn't seem to touch file_cache::lookup_file.  Is the
> current implementation of that ideal (it looks like we're going to be
> doing strcmp of generated buffers, when presumably for those we could
> simply be doing pointer comparisons).
> 
> Maybe rename it to lookup_slot?
> 
> >    if (r == NULL)
> > -    r = add_file (file_path);
> > +    r = add_file (file_path, generated_data_len);
> >    return r;
> >  }
> >  
> > @@ -547,7 +584,8 @@ file_cache::lookup_or_add_file (const char *file_path)
> >     diagnostic.  */
> >  
> >  file_cache_slot::file_cache_slot ()
> > -: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
> > +: m_use_count (0), m_file_path (NULL), m_fp (NULL),
> > +  m_data (0), m_data_active (0),
> >    m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
> >    m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
> >  {
> 
> [...snip...]
> 
> 
> 
> > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> > index 50207cacc12..eb281809cbd 100644
> > --- a/libcpp/include/line-map.h
> > +++ b/libcpp/include/line-map.h
> > @@ -75,6 +75,8 @@ enum lc_reason
> >    LC_RENAME_VERBATIM,  /* Likewise, but "" != stdin.  */
> >    LC_ENTER_MACRO,      /* Begin macro expansion.  */
> >    LC_MODULE,           /* A (C++) Module.  */
> > +  LC_GEN,              /* Internally generated source.  */
> > +
> >    /* FIXME: add support for stringize and paste.  */
> >    LC_HWM /* High Water Mark.  */
> >  };
> > @@ -437,7 +439,11 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
> >  
> >    /* Pointer alignment boundary on both 32 and 64-bit systems.  */
> >  
> > -  const char *to_file;
> > +  /* This GTY markup is in case this is an LC_GEN map, in which case
> > +     to_file actually points to the generated data, which we do not
> > +     want to require to be free of null bytes.  */
> > +  const char * GTY((string_length ("%h.to_file_len"))) to_file;
> > +  unsigned int to_file_len;
> >    linenum_type to_line;
> 
> What's the intended interaction between this, the garbage-collector,
> and PCH?  Is to_file to be allocated in the GC-managed heap, or can it
> be outside of it?  Looking at patch 6 I see that this seems to be
> allocated (in destringize_and_run) by _cpp_unaligned_alloc.  I don't
> remember off the top of my head if that's valid.
> 
> >  
> >    /* Location from whence this line map was included.  For regular
> > @@ -1101,13 +1107,15 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
> >     at least as long as the lifetime of SET.  An empty
> >     TO_FILE means standard input.  If reason is LC_LEAVE, and
> >     TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
> > -   natural values considering the file we are returning to.
> > +   natural values considering the file we are returning to.  If reason
> > +   is LC_GEN, then TO_FILE is not a file name, but rather the actual
> > +   content, and TO_FILE_LEN>0 is the length of it.
> >  
> >     A call to this function can relocate the previous set of
> >     maps, so any stored line_map pointers should not be used.  */
> >  extern const line_map *linemap_add
> >    (class line_maps *, enum lc_reason, unsigned int sysp,
> > -   const char *to_file, linenum_type to_line);
> > +   const char *to_file, linenum_type to_line, unsigned int to_file_len = 0);
> >  
> >  /* Create a macro map.  A macro map encodes source locations of tokens
> >     that are part of a macro replacement-list, at a macro expansion
> > @@ -1304,7 +1312,8 @@ linemap_location_before_p (class line_maps *set,
> >  
> >  typedef struct
> >  {
> > -  /* The name of the source file involved.  */
> > +  /* The name of the source file involved, or NULL if
> > +     generated_data is non-NULL.  */
> >    const char *file;
> >  
> >    /* The line-location in the source file.  */
> > @@ -1316,6 +1325,10 @@ typedef struct
> >  
> >    /* In a system header?. */
> >    bool sysp;
> > +
> > +  /* If generated data, the data and its length.  */
> > +  unsigned int generated_data_len;
> > +  const char *generated_data;
> >  } expanded_location;
> 
> Probably worth noting that generated_data can contain NUL bytes, and
> isn't necessarily NUL-terminated.
> 
> 
> Thanks again for the patch; hope this is constructive
> Dave
> 

Hi Dave-

Thanks again for taking a look at this one, sorry it's so long. I redid this
patch 4/6 taking into account all of your suggestions. It's attached here.
Now, on the linemap side of things, I renamed the member variable from TO_FILE
to DATA, and created inline accessor functions to get at it.  The inline
accessors will assert that the linemap is of the correct type.  I checked all
the call sites and adjusted as needed.  On the input.cc side of things, I
switched it to use inheritance.  The logic for finding and caching lines
resides in a base class, while the two derived classes handle retrieving the
data from the necessary source (a file, or an in-memory buffer).  I think it's
much nicer now; please let me know what you think. Thanks!
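
The shape of the input.cc split can be sketched roughly as follows. This
is a simplified illustration and not the attached code: the class names
(sketch_data_source, sketch_buffer_source, sketch_file_source) and the
get_bytes hook are hypothetical, and the real classes cache line offsets
and read files incrementally rather than all at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

/* Hypothetical base class: owns the line-finding logic and asks the
   derived class for the raw bytes.  */
class sketch_data_source
{
public:
  virtual ~sketch_data_source () {}

  /* Store line LINE_NUM (1-based), without its newline, in *OUT.
     Return false if there is no such line.  */
  bool get_line (unsigned int line_num, std::string *out)
  {
    const char *data;
    std::size_t len;
    if (line_num == 0 || !get_bytes (&data, &len))
      return false;
    std::size_t start = 0;
    for (unsigned int cur = 1; start < len; ++cur)
      {
	std::size_t stop = start;
	while (stop < len && data[stop] != '\n')
	  ++stop;
	if (cur == line_num)
	  {
	    out->assign (data + start, stop - start);
	    return true;
	  }
	start = stop + 1;
      }
    return false;
  }

protected:
  /* Provide the bytes to search; return false if unavailable.  */
  virtual bool get_bytes (const char **data, std::size_t *len) = 0;
};

/* Derived class for an in-memory buffer, which may contain NULs.  */
class sketch_buffer_source : public sketch_data_source
{
public:
  sketch_buffer_source (const char *data, std::size_t len)
    : m_data (data), m_len (len) {}
protected:
  bool get_bytes (const char **data, std::size_t *len) override
  { *data = m_data; *len = m_len; return true; }
private:
  const char *m_data;
  std::size_t m_len;
};

/* Derived class for a real file, read in full on first use.  */
class sketch_file_source : public sketch_data_source
{
public:
  explicit sketch_file_source (const char *path)
    : m_path (path), m_loaded (false), m_ok (false) {}
protected:
  bool get_bytes (const char **data, std::size_t *len) override
  {
    if (!m_loaded)
      {
	m_loaded = true;
	if (std::FILE *fp = std::fopen (m_path.c_str (), "rb"))
	  {
	    char buf[4096];
	    std::size_t n;
	    while ((n = std::fread (buf, 1, sizeof buf, fp)) > 0)
	      m_contents.append (buf, n);
	    std::fclose (fp);
	    m_ok = true;
	  }
      }
    *data = m_contents.data ();
    *len = m_contents.size ();
    return m_ok;
  }
private:
  std::string m_path, m_contents;
  bool m_loaded, m_ok;
};
```

With this arrangement the line-splitting code exists once, and callers
never need to know whether the bytes came from disk or from memory.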

BTW, the remaining patches downstream of this one do not need to be modified,
except that the new testcase for 5c/6 (the SARIF output patch) needs one line
changed since the output of -fdump-internal-locations now distinguishes LC_GEN
maps as well. I can resend that and/or any of the others once the dust
settles with this one if that's helpful.

-Lewis

[-- Attachment #2: LC_GEN_v2.txt --]
[-- Type: text/plain, Size: 116275 bytes --]

[PATCH] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers

Add a new linemap reason LC_GEN which enables encoding the location of data
that was generated during compilation and does not appear in any source file.
There could be many use cases, such as, for instance, referring to the content
of builtin macros (not yet implemented, but an easy lift after this one.) The
first intended application is to create a place to store the input to a
_Pragma directive, so that proper locations can be assigned to those
tokens. This will be done in a subsequent commit.

The actual change needed to the line-maps API in libcpp is not too large and
requires no space overhead in the line map data structures (on 64-bit systems
that is; one newly added data member to class line_map_ordinary sits inside
former padding bytes.) An LC_GEN map is just an ordinary map like any other,
but the TO_FILE member that normally points to the file name points instead to
the actual data.  This works automatically with PCH as well, for the same
reason that the file name makes its way into a PCH.  In order to avoid
confusion, the member has been renamed from TO_FILE to DATA, and associated
accessors adjusted.
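
As a rough illustration of the accessor pattern described above (a
simplified sketch, not the code in this patch: the sketch_ names are
hypothetical stand-ins for the ORDINARY_MAP_* accessors listed in the
ChangeLog below, and the struct omits every other member):

```cpp
#include <cassert>

/* Hypothetical stand-in for the relevant lc_reason values.  */
enum sketch_reason { SKETCH_ENTER, SKETCH_GEN };

/* Hypothetical, simplified stand-in for line_map_ordinary.  */
struct sketch_ordinary_map
{
  sketch_reason reason;
  const char *data;       /* File name, or generated bytes for SKETCH_GEN.  */
  unsigned int data_len;  /* Byte count for SKETCH_GEN, else 0.  */
};

/* True if MAP describes generated data rather than a file.  */
inline bool
sketch_generated_data_p (const sketch_ordinary_map *map)
{
  return map->reason == SKETCH_GEN;
}

/* The file name; asserts that MAP really does encode a file.  */
inline const char *
sketch_file_name (const sketch_ordinary_map *map)
{
  assert (!sketch_generated_data_p (map));
  return map->data;
}

/* The generated bytes; asserts that MAP really is a generated map.  */
inline const char *
sketch_generated_data (const sketch_ordinary_map *map)
{
  assert (sketch_generated_data_p (map));
  return map->data;
}

/* The length of the generated bytes, with the same assertion.  */
inline unsigned int
sketch_generated_data_len (const sketch_ordinary_map *map)
{
  assert (sketch_generated_data_p (map));
  return map->data_len;
}
```

The member itself stays a single pointer, so the space cost is unchanged;
only the accessors encode which interpretation is valid.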

Outside libcpp, there are many small changes but most of them are to
selftests, which are necessarily more sensitive to implementation
details. From the perspective of the user (the "user", here, being a frontend
using line maps or else the diagnostics infrastructure), the chief visible
change is that the function location_get_source_line() should be passed an
expanded_location object instead of a separate filename and line number.  This
is not a big change because in most cases, this information came anyway from a
call to expand_location and the needed expanded_location object is readily
available. The new overload of location_get_source_line() uses the extra
information in the expanded_location object to obtain the data from the
in-memory buffer when it originated from an LC_GEN map.

Until the subsequent patch that starts using LC_GEN maps, none are yet
generated within GCC, hence nothing is added to the testsuite here; but all
relevant selftests have been extended to cover generated data maps in addition
to normal files.

libcpp/ChangeLog:

	* include/line-map.h (enum lc_reason): Add LC_GEN.
	(struct line_map_ordinary): Add new members to support LC_GEN concept.
	(ORDINARY_MAP_FILE_NAME): Assert that map really does encode a file
	and not generated data.
	(ORDINARY_MAP_GENERATED_DATA_P): New function.
	(ORDINARY_MAP_GENERATED_DATA): New function.
	(ORDINARY_MAP_GENERATED_DATA_LEN): New function.
	(ORDINARY_MAP_FILE_NAME_OR_DATA): New function.
	(ORDINARY_MAPS_SAME_FILE_P): Declare new function.
	(ORDINARY_MAP_CONTAINING_FILE_NAME): Declare new function.
	(LINEMAP_FILE): This was always a synonym for ORDINARY_MAP_FILE_NAME;
	make this explicit.
	(linemap_get_file_highest_location): Adjust prototype.
	(linemap_add): Adjust prototype.
	(class expanded_location): Add new members to store generated content.
	* line-map.cc (ORDINARY_MAP_CONTAINING_FILE_NAME): New function.
	(ORDINARY_MAPS_SAME_FILE_P): New function.
	(linemap_add): Add new argument DATA_LEN. Support generated data in
	LC_GEN maps.
	(linemap_check_files_exited): Adapt to API changes supporting LC_GEN.
	(linemap_line_start): Likewise.
	(linemap_position_for_loc_and_offset): Likewise.
	(linemap_get_expansion_filename): Likewise.
	(linemap_expand_location): Likewise.
	(linemap_dump): Likewise.
	(linemap_dump_location): Likewise.
	(linemap_get_file_highest_location): Likewise.
	* directives.cc (_cpp_do_file_change): Likewise.

gcc/ChangeLog:

	* diagnostic-show-locus.cc (make_range): Initialize new fields in
	expanded_location.
	(compatible_locations_p): Use new ORDINARY_MAPS_SAME_FILE_P ()
	function.
	(layout::calculate_x_offset_display): Use the new expanded_location
	overload of location_get_source_line(), so as to support LC_GEN maps.
	(layout::print_line): Likewise.
	(source_line::source_line): Likewise.
	(line_corrections::add_hint): Likewise.
	(class line_corrections): Store the location as an exploc rather than
	individual filename, so as to support LC_GEN maps.
	(layout::print_trailing_fixits): Use the new exploc constructor for
	class line_corrections.
	(test_layout_x_offset_display_utf8): Test LC_GEN maps as well as normal.
	(test_layout_x_offset_display_tab): Likewise.
	(test_diagnostic_show_locus_one_liner): Likewise.
	(test_diagnostic_show_locus_one_liner_utf8): Likewise.
	(test_add_location_if_nearby): Likewise.
	(test_diagnostic_show_locus_fixit_lines): Likewise.
	(test_fixit_consolidation): Likewise.
	(test_overlapped_fixit_printing): Likewise.
	(test_overlapped_fixit_printing_utf8): Likewise.
	(test_overlapped_fixit_printing_2): Likewise.
	(test_fixit_insert_containing_newline): Likewise.
	(test_fixit_insert_containing_newline_2): Likewise.
	(test_fixit_replace_containing_newline): Likewise.
	(test_fixit_deletion_affecting_newline): Likewise.
	(test_tab_expansion): Likewise.
	(test_escaping_bytes_1): Likewise.
	(test_escaping_bytes_2): Likewise.
	(test_line_numbers_multiline_range): Likewise.
	(diagnostic_show_locus_cc_tests): Likewise.
	* diagnostic.cc (diagnostic_report_current_module): Support LC_GEN
	maps when outputting include trace.
	(assert_location_text): Zero-initialize the expanded_location so as to
	cover all fields, including the newly added ones.
	* gcc-rich-location.cc (blank_line_before_p): Use the new
	expanded_location overload of location_get_source_line().
	* input.cc (special_fname_generated): New function.
	(class file_cache_slot): Factor out most of the implementation into a
	new base class...
	(class cache_data_source): ... here.
	(cache_data_source::cache_data_source): New member function.
	(cache_data_source::~cache_data_source): New member function.
	(cache_data_source::reset): New member function.
	(class data_cache_slot): New derived class of cache_data_source which
	handles generated data.
	(data_cache_slot::create): New function.
	(expand_location_1): Handle LC_GEN locations.
	(total_lines_num): Likewise.
	(file_cache::lookup_data): New member function.
	(diagnostics_file_cache_forcibly_evict_data): New function.
	(file_cache::forcibly_evict_data): New member function.
	(file_cache::add_data): New member function.
	(file_cache::lookup_or_add_data): New member function.
	(file_cache::evicted_cache_tab_entry): Adapt to handle generated data
	locations.
	(file_cache::file_cache): Likewise.
	(file_cache::~file_cache): Likewise.
	(file_cache_slot::evict): Rename to...
	(file_cache_slot::reset): ...the new interface here.
	(file_cache_slot::create): Likewise.
	(file_cache_slot::file_cache_slot): Likewise.
	(file_cache_slot::~file_cache_slot): Likewise.
	(file_cache_slot::needs_read_p): Likewise.
	(file_cache_slot::needs_grow_p): Likewise.
	(file_cache_slot::maybe_grow): Likewise.
	(file_cache_slot::read_data): Likewise.
	(file_cache_slot::maybe_read_data): Rename to...
	(file_cache_slot::get_more_data): ...the new interface here.
	(find_end_of_line): Add missing const.
	(file_cache_slot::get_next_line): Refactored to...
	(cache_data_source::get_next_line): ...here.
	(file_cache_slot::goto_next_line): Refactored to...
	(cache_data_source::goto_next_line): ...here.
	(file_cache_slot::read_line_num): Refactored to...
	(cache_data_source::read_line_num): ...here.
	(location_get_source_line): Change to take an expanded_location
	argument instead of a filename.  Support generated data.  Add another
	overload taking a filename that delegates to this one.
	(location_compute_display_column): Use new overload of
	location_get_source_line and handle generated data locations.
	(dump_location_info): Likewise.
	(get_substring_ranges_for_loc): Likewise.
	(temp_source_file::do_linemap_add): New member function.
	(line_table_test::line_table_test): Initialize the new member.
	(test_accessing_ordinary_linemaps): Test generated data as well as
	normal files.
	(test_make_location_nonpure_range_endpoints): Likewise.
	(test_line_offset_overflow): Likewise.
	(for_each_line_table_case): Add new argument requesting to test
	generated data.
	(input_cc_tests): Enable testing generated data in the selftests.
	* input.h (special_fname_generated): Declare new function.
	(location_get_source_line): Add new overload taking an
	expanded_location.
	(class data_cache_slot): Forward declare.
	(class file_cache): Add a cache of generated data buffers as well as
	ordinary file buffers.
	(diagnostics_file_cache_forcibly_evict_data): Declare new function.
	* selftest.cc (named_temp_file::named_temp_file): Support nullptr
	argument to disable creating any file.
	(named_temp_file::~named_temp_file): Likewise.
	(temp_source_file::temp_source_file): Add a new constructor argument
	to enable creating generated data instead of a file.
	(temp_source_file::~temp_source_file): Handle freeing generated data
	buffer.
	* selftest.h (struct line_map_ordinary): Forward declare.
	(class named_temp_file): Add missing explicit on constructor.
	(class temp_source_file): Add new members to store generated content.
	(class line_table_test): Add new m_generated_data member.
	(for_each_line_table_case): Update prototype for new argument.

gcc/c-family/ChangeLog:

	* c-common.cc (try_to_locate_new_include_insertion_point): Add
	awareness of LC_GEN maps.
	* c-format.cc (get_corrected_substring): Use the new expanded_location
	overload of location_get_source_line(), so as to support LC_GEN maps.
	* c-indentation.cc (get_visual_column): Likewise.
	(get_first_nws_vis_column): Likewise.
	(detect_intervening_unindent): Likewise.
	(should_warn_for_misleading_indentation): Likewise.
	(assert_get_visual_column_succeeds): Zero-initialize the exploc to
	cover all fields including those newly added.
	(assert_get_visual_column_fails): Likewise.

gcc/cp/ChangeLog:

	* module.cc (module_state::write_ordinary_maps): Ignore LC_GEN maps to
	be safe.
	(module_state::read_ordinary_maps): Likewise.

gcc/go/ChangeLog:

	* go-linemap.cc (Gcc_linemap::to_string): Adapt to linemaps API change.

gcc/testsuite/ChangeLog:

	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Use the new
	overload of location_get_source_line.

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 6f1f21bc4c1..e372d28c546 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -9185,11 +9185,15 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
       const line_map_ordinary *ord_map
 	= LINEMAPS_ORDINARY_MAP_AT (line_table, i);
 
+      if (ORDINARY_MAP_GENERATED_DATA_P (ord_map))
+	continue;
+
       if (const line_map_ordinary *from
 	  = linemap_included_from_linemap (line_table, ord_map))
 	/* We cannot use pointer equality, because with preprocessed
 	   input all filename strings are unique.  */
-	if (0 == strcmp (from->to_file, file))
+	if (!ORDINARY_MAP_GENERATED_DATA_P (from)
+	    && 0 == strcmp (ORDINARY_MAP_FILE_NAME (from), file))
 	  {
 	    last_include_ord_map = from;
 	    last_ord_map_after_include = NULL;
@@ -9197,7 +9201,8 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
 
       /* Likewise, use strcmp, and reject any line-zero introductory
 	 map.  */
-      if (ord_map->to_line && 0 == strcmp (ord_map->to_file, file))
+      if (ord_map->to_line
+	  && 0 == strcmp (ORDINARY_MAP_FILE_NAME (ord_map), file))
 	{
 	  if (!first_ord_map_in_file)
 	    first_ord_map_in_file = ord_map;
diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index 01adea4ff41..87ad3e74b72 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -4538,7 +4538,7 @@ get_corrected_substring (const substring_loc &fmt_loc,
   if (caret.column > finish.column)
     return NULL;
 
-  char_span line = location_get_source_line (start.file, start.line);
+  char_span line = location_get_source_line (start);
   if (!line)
     return NULL;
 
diff --git a/gcc/c-family/c-indentation.cc b/gcc/c-family/c-indentation.cc
index 85a3ae1b303..42738dd4d13 100644
--- a/gcc/c-family/c-indentation.cc
+++ b/gcc/c-family/c-indentation.cc
@@ -50,7 +50,7 @@ get_visual_column (expanded_location exploc,
 		   unsigned int *first_nws,
 		   unsigned int tab_width)
 {
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   if ((size_t)exploc.column > line.length ())
@@ -87,13 +87,13 @@ get_visual_column (expanded_location exploc,
    Otherwise, return false, leaving *FIRST_NWS untouched.  */
 
 static bool
-get_first_nws_vis_column (const char *file, int line_num,
+get_first_nws_vis_column (expanded_location exploc,
 			  unsigned int *first_nws,
 			  unsigned int tab_width)
 {
   gcc_assert (first_nws);
 
-  char_span line = location_get_source_line (file, line_num);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   unsigned int vis_column = 0;
@@ -158,19 +158,18 @@ get_first_nws_vis_column (const char *file, int line_num,
    Return true if such an unindent/outdent is detected.  */
 
 static bool
-detect_intervening_unindent (const char *file,
-			     int body_line,
+detect_intervening_unindent (expanded_location exploc,
 			     int next_stmt_line,
 			     unsigned int vis_column,
 			     unsigned int tab_width)
 {
-  gcc_assert (file);
-  gcc_assert (next_stmt_line > body_line);
+  gcc_assert (exploc.file);
+  gcc_assert (next_stmt_line > exploc.line);
 
-  for (int line = body_line + 1; line < next_stmt_line; line++)
+  while (++exploc.line < next_stmt_line)
     {
       unsigned int line_vis_column;
-      if (get_first_nws_vis_column (file, line, &line_vis_column, tab_width))
+      if (get_first_nws_vis_column (exploc, &line_vis_column, tab_width))
 	if (line_vis_column < vis_column)
 	  return true;
     }
@@ -528,8 +527,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo,
 
 	  /* Don't warn if there is an unindent between the two statements. */
 	  int vis_column = MIN (next_stmt_vis_column, body_vis_column);
-	  if (detect_intervening_unindent (body_exploc.file, body_exploc.line,
-					   next_stmt_exploc.line,
+	  if (detect_intervening_unindent (body_exploc, next_stmt_exploc.line,
 					   vis_column, tab_width))
 	    return false;
 
@@ -691,12 +689,10 @@ assert_get_visual_column_succeeds (const location &loc,
 				   unsigned int expected_visual_column,
 				   unsigned int expected_first_nws)
 {
-  expanded_location exploc;
+  expanded_location exploc = {};
   exploc.file = file;
   exploc.line = line;
   exploc.column = column;
-  exploc.data = NULL;
-  exploc.sysp = false;
   unsigned int actual_visual_column;
   unsigned int actual_first_nws;
   bool result = get_visual_column (exploc,
@@ -729,12 +725,10 @@ assert_get_visual_column_fails (const location &loc,
 				const char *file, int line, int column,
 				const unsigned int tab_width)
 {
-  expanded_location exploc;
+  expanded_location exploc = {};
   exploc.file = file;
   exploc.line = line;
   exploc.column = column;
-  exploc.data = NULL;
-  exploc.sysp = false;
   unsigned int actual_visual_column;
   unsigned int actual_first_nws;
   bool result = get_visual_column (exploc,
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 0e9af318ba4..8defcf85a64 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -16212,6 +16212,8 @@ module_state::write_ordinary_maps (elf_out *to, range_t &info,
        iter != end; ++iter)
     if (iter->src != current)
       {
+	if (ORDINARY_MAP_GENERATED_DATA_P (iter->src))
+	  continue;
 	current = iter->src;
 	const char *fname = ORDINARY_MAP_FILE_NAME (iter->src);
 
@@ -16229,7 +16231,7 @@ module_state::write_ordinary_maps (elf_out *to, range_t &info,
 		   preprocessed input we could have multiple instances
 		   of the same name, and we'd rather not percolate
 		   that.  */
-		const_cast<line_map_ordinary *> (iter->src)->to_file = name;
+		const_cast<line_map_ordinary *> (iter->src)->data = name;
 		fname = NULL;
 		break;
 	      }
@@ -16257,6 +16259,8 @@ module_state::write_ordinary_maps (elf_out *to, range_t &info,
   for (auto iter = ord_loc_remap->begin (), end = ord_loc_remap->end ();
        iter != end; ++iter)
     {
+      if (ORDINARY_MAP_GENERATED_DATA_P (iter->src))
+	continue;
       dump (dumper::LOCATION)
 	&& dump ("Span:%u ordinary [%u+%u,+%u)->[%u,+%u)",
 		 iter - ord_loc_remap->begin (),
@@ -16418,7 +16422,8 @@ module_state::read_ordinary_maps (unsigned num_ord_locs, unsigned range_bits)
 	  map->m_range_bits = sec.u ();
 	  map->m_column_and_range_bits = sec.u () + map->m_range_bits;
 	  unsigned fnum = sec.u ();
-	  map->to_file = (fnum < filenames.length () ? filenames[fnum] : "");
+	  map->data = (fnum < filenames.length () ? filenames[fnum] : "");
+	  map->data_len = 1 + strlen (map->data);
 	  map->to_line = sec.u ();
 	  base = map;
 	}
diff --git a/gcc/diagnostic-show-locus.cc b/gcc/diagnostic-show-locus.cc
index 9d430b5189c..c5a6e888314 100644
--- a/gcc/diagnostic-show-locus.cc
+++ b/gcc/diagnostic-show-locus.cc
@@ -709,9 +709,9 @@ static layout_range
 make_range (int start_line, int start_col, int end_line, int end_col)
 {
   const expanded_location start_exploc
-    = {"", start_line, start_col, NULL, false};
+    = {"", start_line, start_col, NULL, false, 0, NULL};
   const expanded_location finish_exploc
-    = {"", end_line, end_col, NULL, false};
+    = {"", end_line, end_col, NULL, false, 0, NULL};
   return layout_range (exploc_with_display_col (start_exploc, def_policy (),
 						LOCATION_ASPECT_START),
 		       exploc_with_display_col (finish_exploc, def_policy (),
@@ -998,7 +998,7 @@ compatible_locations_p (location_t loc_a, location_t loc_b)
 	 are in the same file.  */
       const line_map_ordinary *ord_map_a = linemap_check_ordinary (map_a);
       const line_map_ordinary *ord_map_b = linemap_check_ordinary (map_b);
-      return ord_map_a->to_file == ord_map_b->to_file;
+      return ORDINARY_MAPS_SAME_FILE_P (ord_map_a, ord_map_b);
     }
 }
 
@@ -1614,8 +1614,7 @@ layout::calculate_x_offset_display ()
       return;
     }
 
-  const char_span line = location_get_source_line (m_exploc.file,
-						   m_exploc.line);
+  const char_span line = location_get_source_line (m_exploc);
   if (!line)
     {
       /* Nothing to do, we couldn't find the source line.  */
@@ -2398,17 +2397,18 @@ class line_corrections
 {
 public:
   line_corrections (const char_display_policy &policy,
-		    const char *filename,
-		    linenum_type row)
-  : m_policy (policy), m_filename (filename), m_row (row)
-  {}
+		    expanded_location exploc, linenum_type row = 0)
+  : m_policy (policy), m_exploc (exploc)
+  {
+    if (row)
+      m_exploc.line = row;
+  }
   ~line_corrections ();
 
   void add_hint (const fixit_hint *hint);
 
   const char_display_policy &m_policy;
-  const char *m_filename;
-  linenum_type m_row;
+  expanded_location m_exploc;
   auto_vec <correction *> m_corrections;
 };
 
@@ -2428,7 +2428,7 @@ line_corrections::~line_corrections ()
 class source_line
 {
 public:
-  source_line (const char *filename, int line);
+  explicit source_line (expanded_location xloc);
 
   char_span as_span () { return char_span (chars, width); }
 
@@ -2438,9 +2438,9 @@ public:
 
 /* source_line's ctor.  */
 
-source_line::source_line (const char *filename, int line)
+source_line::source_line (expanded_location exploc)
 {
-  char_span span = location_get_source_line (filename, line);
+  char_span span = location_get_source_line (exploc);
   chars = span.get_buffer ();
   width = span.length ();
 }
@@ -2482,7 +2482,7 @@ line_corrections::add_hint (const fixit_hint *hint)
 				affected_bytes.start - 1);
 
 	  /* Try to read the source.  */
-	  source_line line (m_filename, m_row);
+	  source_line line (m_exploc);
 	  if (line.chars && between.finish < line.width)
 	    {
 	      /* Consolidate into the last correction:
@@ -2538,7 +2538,7 @@ layout::print_trailing_fixits (linenum_type row)
 {
   /* Build a list of correction instances for the line,
      potentially consolidating hints (for the sake of readability).  */
-  line_corrections corrections (m_policy, m_exploc.file, row);
+  line_corrections corrections (m_policy, m_exploc, row);
   for (unsigned int i = 0; i < m_fixit_hints.length (); i++)
     {
       const fixit_hint *hint = m_fixit_hints[i];
@@ -2776,7 +2776,7 @@ layout::show_ruler (int max_column) const
 void
 layout::print_line (linenum_type row)
 {
-  char_span line = location_get_source_line (m_exploc.file, row);
+  char_span line = location_get_source_line (m_exploc, row);
   if (!line)
     return;
 
@@ -2985,10 +2985,10 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
      no multibyte characters earlier on the line.  */
   const int emoji_col = 102;
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, 1 + line_bytes,
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, line_bytes);
 
@@ -2996,17 +2996,23 @@ test_layout_x_offset_display_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (line_bytes, LOCATION_COLUMN (line_end));
 
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  const expanded_location xloc = expand_location (line_end);
+  char_span lspan = location_get_source_line (xloc);
   ASSERT_EQ (line_display_cols,
 	     cpp_display_width (lspan.get_buffer (), lspan.length (),
 				def_policy ()));
   ASSERT_EQ (line_display_cols,
-	     location_compute_display_column (expand_location (line_end),
-					      def_policy ()));
+	     location_compute_display_column (xloc, def_policy ()));
   ASSERT_EQ (0, memcmp (lspan.get_buffer () + (emoji_col - 1),
 			"\xf0\x9f\x98\x82\xf0\x9f\x98\x82", 8));
 
@@ -3138,10 +3144,10 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
      a space would have taken up.  */
   ASSERT_EQ (7, extra_width[10]);
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, line_bytes + 1,
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, line_bytes);
 
@@ -3150,7 +3156,8 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
     return;
 
   /* Check that cpp_display_width handles the tabs as expected.  */
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  const expanded_location xloc = expand_location (line_end);
+  char_span lspan = location_get_source_line (xloc);
   ASSERT_EQ ('\t', *(lspan.get_buffer () + (tab_col - 1)));
   for (int tabstop = 1; tabstop != num_tabstops; ++tabstop)
     {
@@ -3159,8 +3166,7 @@ test_layout_x_offset_display_tab (const line_table_case &case_)
 		 cpp_display_width (lspan.get_buffer (), lspan.length (),
 				    policy));
       ASSERT_EQ (line_bytes + extra_width[tabstop],
-		 location_compute_display_column (expand_location (line_end),
-						  policy));
+		 location_compute_display_column (xloc, policy));
     }
 
   /* Check that the tab is expanded to the expected number of spaces.  */
@@ -3784,10 +3790,10 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
      ....................0000000001111111.
      ....................1234567890123456.  */
   const char *content = "foo = bar.field;\n";
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, 16);
 
@@ -3795,7 +3801,14 @@ test_diagnostic_show_locus_one_liner (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (16, LOCATION_COLUMN (line_end));
 
@@ -4366,10 +4379,10 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
     /* 0000000000000000000001111111111111111111222222222222222222222233333
        1111222233334444567890122223333456789999000011112222345678999900001
        Byte columns.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   location_t line_end = linemap_position_for_column (line_table, 31);
 
@@ -4377,11 +4390,18 @@ test_diagnostic_show_locus_one_liner_utf8 (const line_table_case &case_)
   if (line_end > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
 
-  ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+  if (ltt.m_generated_data)
+    {
+      ASSERT_EQ (nullptr, tmp.get_filename ());
+      ASSERT_STREQ (special_fname_generated (), LOCATION_FILE (line_end));
+    }
+  else
+    ASSERT_STREQ (tmp.get_filename (), LOCATION_FILE (line_end));
+
   ASSERT_EQ (1, LOCATION_LINE (line_end));
   ASSERT_EQ (31, LOCATION_COLUMN (line_end));
 
-  char_span lspan = location_get_source_line (tmp.get_filename (), 1);
+  char_span lspan = location_get_source_line (expand_location (line_end));
   ASSERT_EQ (25, cpp_display_width (lspan.get_buffer (), lspan.length (),
 				    def_policy ()));
   ASSERT_EQ (25, location_compute_display_column (expand_location (line_end),
@@ -4418,12 +4438,10 @@ test_add_location_if_nearby (const line_table_case &case_)
        "  double x;\n"                              /* line 4.  */
        "  double y;\n"                              /* line 5.  */
        ";\n");                                      /* line 6.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
 
   linemap_line_start (line_table, 1, 100);
 
@@ -4482,12 +4500,10 @@ test_diagnostic_show_locus_fixit_lines (const line_table_case &case_)
        "\n"                                      /* line 4.  */
        "\n"                                      /* line 5.  */
        "                        : 0.0};\n");     /* line 6.  */
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
 
   linemap_line_start (line_table, 1, 100);
 
@@ -4578,8 +4594,10 @@ static void
 test_fixit_consolidation (const line_table_case &case_)
 {
   line_table_test ltt (case_);
-
-  linemap_add (line_table, LC_ENTER, false, "test.c", 1);
+  if (ltt.m_generated_data)
+    linemap_add (line_table, LC_GEN, false, "some content", 1, 13);
+  else
+    linemap_add (line_table, LC_ENTER, false, "test.c", 1);
 
   const location_t c10 = linemap_position_for_column (line_table, 10);
   const location_t c15 = linemap_position_for_column (line_table, 15);
@@ -4725,13 +4743,11 @@ test_overlapped_fixit_printing (const line_table_case &case_)
      ...123456789012345678901234567890123456789.  */
   const char *content
     = ("  foo *f = (foo *)ptr->field;\n");
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
   line_table_test ltt (case_);
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
 
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
-
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -4752,6 +4768,8 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     = linemap_position_for_line_and_column (line_table, ord_map, 1, 28);
   const location_t expr = make_location (expr_start, expr_start, expr_finish);
 
+  const expanded_location xloc = expand_location (expr);
+
   /* Various examples of fix-it hints that aren't themselves consolidated,
      but for which the *printing* may need consolidation.  */
 
@@ -4795,7 +4813,7 @@ test_overlapped_fixit_printing (const line_table_case &case_)
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (policy, tmp.get_filename (), 1);
+    line_corrections lc (policy, xloc);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -4936,13 +4954,10 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
        12344445555666677778901234566667777888899990123456789012333344445
        Byte columns.  */
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
-
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -4963,6 +4978,8 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     = linemap_position_for_line_and_column (line_table, ord_map, 1, 34);
   const location_t expr = make_location (expr_start, expr_start, expr_finish);
 
+  const expanded_location xloc = expand_location (expr);
+
   /* Various examples of fix-it hints that aren't themselves consolidated,
      but for which the *printing* may need consolidation.  */
 
@@ -5011,7 +5028,7 @@ test_overlapped_fixit_printing_utf8 (const line_table_case &case_)
     /* Add each hint in turn to a line_corrections instance,
        and verify that they are consolidated into one correction instance
        as expected.  */
-    line_corrections lc (policy, tmp.get_filename (), 1);
+    line_corrections lc (policy, xloc);
 
     /* The first replace hint by itself.  */
     lc.add_hint (hint_0);
@@ -5169,13 +5186,11 @@ test_overlapped_fixit_printing_2 (const line_table_case &case_)
      ...123456789012345678901234567890123456789.  */
   const char *content
     = ("int a5[][0][0] = { 1, 2 };\n");
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
+
   line_table_test ltt (case_);
-
-  const line_map_ordinary *ord_map
-    = linemap_check_ordinary (linemap_add (line_table, LC_ENTER, false,
-					   tmp.get_filename (), 0));
-
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   const location_t final_line_end
@@ -5260,10 +5275,10 @@ test_fixit_insert_containing_newline (const line_table_case &case_)
 			     "      x = a;\n"  /* line 2. */
 			     "    case 'b':\n" /* line 3. */
 			     "      x = b;\n");/* line 4. */
-
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 3);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), false);
+  tmp.do_linemap_add (3);
 
   location_t case_start = linemap_position_for_column (line_table, 5);
   location_t case_finish = linemap_position_for_column (line_table, 13);
@@ -5331,12 +5346,11 @@ test_fixit_insert_containing_newline_2 (const line_table_case &case_)
 			     "{\n"              /* line 2. */
 			     " putchar (ch);\n" /* line 3. */
 			     "}\n");            /* line 4. */
-
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
 
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* The primary range is the "putchar" token.  */
@@ -5395,9 +5409,10 @@ test_fixit_replace_containing_newline (const line_table_case &case_)
     .........................1234567890123.  */
   const char *old_content = "foo = bar ();\n";
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   /* Replace the " = " with "\n  = ", as if we were reformatting an
      overly long line.  */
@@ -5435,10 +5450,10 @@ test_fixit_deletion_affecting_newline (const line_table_case &case_)
   const char *old_content = ("foo = bar (\n"
 			     "      );\n");
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", old_content,
+			strlen (old_content), ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* Attempt to delete the " (\n...)".  */
@@ -5487,9 +5502,10 @@ test_tab_expansion (const line_table_case &case_)
   const int last_byte_col = 25;
   ASSERT_EQ (35, cpp_display_width (content, last_byte_col, policy));
 
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content);
   line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   /* Don't attempt to run the tests if column data might be unavailable.  */
   location_t line_end = linemap_position_for_column (line_table, last_byte_col);
@@ -5536,15 +5552,14 @@ test_escaping_bytes_1 (const line_table_case &case_)
 {
   const char content[] = "before\0\1\2\3\v\x80\xff""after\n";
   const size_t sz = sizeof (content);
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz,
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   location_t finish
-    = linemap_position_for_line_and_column (line_table, ord_map, 1,
-					    strlen (content));
+    = linemap_position_for_line_and_column (line_table, ord_map, 1, sz);
 
   if (finish > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
@@ -5592,15 +5607,14 @@ test_escaping_bytes_2 (const line_table_case &case_)
 {
   const char content[]  = "\0after\n";
   const size_t sz = sizeof (content);
-  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz);
   line_table_test ltt (case_);
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  temp_source_file tmp (SELFTEST_LOCATION, ".c", content, sz,
+			ltt.m_generated_data);
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   location_t finish
-    = linemap_position_for_line_and_column (line_table, ord_map, 1,
-					    strlen (content));
+    = linemap_position_for_line_and_column (line_table, ord_map, 1, sz);
 
   if (finish > LINE_MAP_MAX_LOCATION_WITH_COLS)
     return;
@@ -5652,8 +5666,7 @@ test_line_numbers_multiline_range ()
   temp_source_file tmp (SELFTEST_LOCATION, ".txt", pp_formatted_text (&pp));
   line_table_test ltt;
 
-  const line_map_ordinary *ord_map = linemap_check_ordinary
-    (linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 0));
+  const line_map_ordinary *ord_map = tmp.do_linemap_add (0);
   linemap_line_start (line_table, 1, 100);
 
   /* Create a multi-line location, starting at the "line" of line 9, with
@@ -5694,28 +5707,28 @@ diagnostic_show_locus_cc_tests ()
 
   test_display_widths ();
 
-  for_each_line_table_case (test_layout_x_offset_display_utf8);
-  for_each_line_table_case (test_layout_x_offset_display_tab);
+  for_each_line_table_case (test_layout_x_offset_display_utf8, true);
+  for_each_line_table_case (test_layout_x_offset_display_tab, true);
 
   test_get_line_bytes_without_trailing_whitespace ();
 
   test_diagnostic_show_locus_unknown_location ();
 
-  for_each_line_table_case (test_diagnostic_show_locus_one_liner);
-  for_each_line_table_case (test_diagnostic_show_locus_one_liner_utf8);
-  for_each_line_table_case (test_add_location_if_nearby);
-  for_each_line_table_case (test_diagnostic_show_locus_fixit_lines);
-  for_each_line_table_case (test_fixit_consolidation);
-  for_each_line_table_case (test_overlapped_fixit_printing);
-  for_each_line_table_case (test_overlapped_fixit_printing_utf8);
-  for_each_line_table_case (test_overlapped_fixit_printing_2);
-  for_each_line_table_case (test_fixit_insert_containing_newline);
-  for_each_line_table_case (test_fixit_insert_containing_newline_2);
-  for_each_line_table_case (test_fixit_replace_containing_newline);
-  for_each_line_table_case (test_fixit_deletion_affecting_newline);
-  for_each_line_table_case (test_tab_expansion);
-  for_each_line_table_case (test_escaping_bytes_1);
-  for_each_line_table_case (test_escaping_bytes_2);
+  for_each_line_table_case (test_diagnostic_show_locus_one_liner, true);
+  for_each_line_table_case (test_diagnostic_show_locus_one_liner_utf8, true);
+  for_each_line_table_case (test_add_location_if_nearby, true);
+  for_each_line_table_case (test_diagnostic_show_locus_fixit_lines, true);
+  for_each_line_table_case (test_fixit_consolidation, true);
+  for_each_line_table_case (test_overlapped_fixit_printing, true);
+  for_each_line_table_case (test_overlapped_fixit_printing_utf8, true);
+  for_each_line_table_case (test_overlapped_fixit_printing_2, true);
+  for_each_line_table_case (test_fixit_insert_containing_newline, true);
+  for_each_line_table_case (test_fixit_insert_containing_newline_2, true);
+  for_each_line_table_case (test_fixit_replace_containing_newline, true);
+  for_each_line_table_case (test_fixit_deletion_affecting_newline, true);
+  for_each_line_table_case (test_tab_expansion, true);
+  for_each_line_table_case (test_escaping_bytes_1, true);
+  for_each_line_table_case (test_escaping_bytes_2, true);
 
   test_line_numbers_multiline_range ();
 }
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 7c7ee6da746..52b2f2df75f 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -771,13 +771,15 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
       if (!includes_seen (context, map))
 	{
 	  bool first = true, need_inc = true, was_module = MAP_MODULE_P (map);
+	  const bool was_gen = ORDINARY_MAP_GENERATED_DATA_P (map);
 	  expanded_location s = {};
 	  do
 	    {
 	      where = linemap_included_from (map);
 	      map = linemap_included_from_linemap (line_table, map);
 	      bool is_module = MAP_MODULE_P (map);
-	      s.file = LINEMAP_FILE (map);
+	      s.file = (ORDINARY_MAP_GENERATED_DATA_P (map)
+			? special_fname_generated () : LINEMAP_FILE (map));
 	      s.line = SOURCE_LINE (map, where);
 	      int col = -1;
 	      if (first && context->show_column)
@@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
 		 N_("of module"),
 		 N_("In module imported at"),	/* 6 */
 		 N_("imported at"),
+		 N_("In buffer generated from"),   /* 8 */
 		};
 
-	      unsigned index = (was_module ? 6 : is_module ? 4
-				: need_inc ? 2 : 0) + !first;
+	      const unsigned index
+		= was_gen ? 8
+		: ((was_module ? 6 : is_module ? 4 : need_inc ? 2 : 0)
+		   + !first);
 
 	      pp_verbatim (context->printer, "%s%s %r%s%s%R",
 			   first ? "" : was_module ? ", " : ",\n",
@@ -2573,12 +2578,10 @@ assert_location_text (const char *expected_loc_text,
   dc.column_unit = column_unit;
   dc.column_origin = origin;
 
-  expanded_location xloc;
+  expanded_location xloc = {};
   xloc.file = filename;
   xloc.line = line;
   xloc.column = column;
-  xloc.data = NULL;
-  xloc.sysp = false;
 
   char *actual_loc_text = diagnostic_get_location_text (&dc, xloc);
   ASSERT_STREQ (expected_loc_text, actual_loc_text);
diff --git a/gcc/gcc-rich-location.cc b/gcc/gcc-rich-location.cc
index 0fa4239bd29..4836624f03c 100644
--- a/gcc/gcc-rich-location.cc
+++ b/gcc/gcc-rich-location.cc
@@ -78,7 +78,7 @@ static bool
 blank_line_before_p (location_t loc)
 {
   expanded_location exploc = expand_location (loc);
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   if (!line)
     return false;
   if (line.length () < (size_t)exploc.column)
diff --git a/gcc/go/go-linemap.cc b/gcc/go/go-linemap.cc
index 1d72e79647d..02d4ce04181 100644
--- a/gcc/go/go-linemap.cc
+++ b/gcc/go/go-linemap.cc
@@ -84,7 +84,8 @@ Gcc_linemap::to_string(Location location)
   resolved_location =
       linemap_resolve_location (line_table, location.gcc_location(),
                                 LRK_SPELLING_LOCATION, &lmo);
-  if (lmo == NULL || resolved_location < RESERVED_LOCATION_COUNT)
+  if (lmo == NULL || resolved_location < RESERVED_LOCATION_COUNT
+      || ORDINARY_MAP_GENERATED_DATA_P (lmo))
     return "";
   const char *path = LINEMAP_FILE (lmo);
   if (!path)
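
The input.cc changes that follow split the old file_cache_slot into an
abstract cache_data_source plus two implementations.  The main idea, as I
read the comments in that diff, is that the base class scans a window of
bytes, asks the derived class for more only when it runs out, and remembers
byte offsets rather than pointers because the buffer may move.  Below is a
minimal standalone sketch of that idea.  All names in it (line_source,
memory_source, chunked_source, read_all_lines) are hypothetical and greatly
simplified; it handles only '\n' line endings and is not the patch's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

class line_source
{
public:
  virtual ~line_source () {}

  // Fetch the next line, without its '\n'.  Returns false at end of data.
  bool next_line (std::string *out)
  {
    for (;;)
      {
        // Recompute from the offset each time: get_more_data () may have
        // moved the buffer, so a saved pointer could be stale.
        const char *start = m_begin + m_line_start;
        const size_t avail = m_end - start;
        const char *nl
          = avail ? static_cast<const char *> (std::memchr (start, '\n', avail))
                  : nullptr;
        if (nl)
          {
            out->assign (start, nl);
            m_line_start = (nl + 1) - m_begin;
            return true;
          }
        if (!get_more_data ())
          {
            if (!avail)
              return false;
            // Final line with no trailing newline.
            out->assign (start, m_end);
            m_line_start = m_end - m_begin;
            return true;
          }
      }
  }

protected:
  line_source () : m_begin (nullptr), m_end (nullptr), m_line_start (0) {}
  // Returns true if more data was obtained; may change m_begin and m_end.
  virtual bool get_more_data () = 0;
  const char *m_begin;
  const char *m_end;

private:
  size_t m_line_start;  // Byte offset from m_begin, never a pointer.
};

// Data already in memory: nothing more to fetch, buffer is not owned.
class memory_source final : public line_source
{
public:
  memory_source (const char *data, size_t len)
  { m_begin = data; m_end = data + len; }
protected:
  bool get_more_data () override { return false; }
};

// Stand-in for reading a file: hands over CHUNK bytes per call (CHUNK must be
// nonzero) into a growing buffer that may be reallocated.
class chunked_source final : public line_source
{
public:
  chunked_source (std::string input, size_t chunk)
    : m_input (std::move (input)), m_chunk (chunk), m_pos (0) {}
protected:
  bool get_more_data () override
  {
    if (m_pos == m_input.size ())
      return false;
    const size_t n = std::min (m_chunk, m_input.size () - m_pos);
    m_buf.insert (m_buf.end (), m_input.begin () + m_pos,
                  m_input.begin () + m_pos + n);
    m_pos += n;
    m_begin = m_buf.data ();
    m_end = m_begin + m_buf.size ();
    return true;
  }
private:
  std::string m_input;
  size_t m_chunk, m_pos;
  std::vector<char> m_buf;
};

// Drain a source into a vector of lines.
std::vector<std::string> read_all_lines (line_source &src)
{
  std::vector<std::string> lines;
  std::string line;
  while (src.next_line (&line))
    lines.push_back (line);
  return lines;
}
```

Both sources produce the same lines for the same bytes; only the chunked one
ever makes the virtual call succeed, which is the property the real design
relies on to avoid reading more of a file than needed.
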
diff --git a/gcc/input.cc b/gcc/input.cc
index 18777a813b0..afeca61297a 100644
--- a/gcc/input.cc
+++ b/gcc/input.cc
@@ -35,6 +35,12 @@ special_fname_builtin ()
   return _("<built-in>");
 }
 
+const char *
+special_fname_generated ()
+{
+  return _("<generated>");
+}
+
 /* Input charset configuration.  */
 static const char *default_charset_callback (const char *)
 {
@@ -49,33 +55,87 @@ file_cache::initialize_input_context (diagnostic_input_charset_callback ccb,
   in_context.should_skip_bom = should_skip_bom;
 }
 
-/* This is a cache used by get_next_line to store the content of a
-   file to be searched for file lines.  */
-class file_cache_slot
+/* This is an abstract interface for a class that provides data which we want to
+   look up by line number.  Concrete implementations will follow, which handle
+   the cases of reading the data from the input source files, or of reading it
+   from in-memory generated data buffers.  The design has reading from files
+   in mind; in particular, it is desirable to read only as much of a file
+   from disk as necessary.  It works like a simplified std::istream, i.e.
+   virtual function calls are only needed when we need to retrieve more data
+   from the underlying source.  */
+
+class cache_data_source
 {
+
 public:
-  file_cache_slot ();
-  ~file_cache_slot ();
-
-  bool read_line_num (size_t line_num,
-		      char ** line, ssize_t *line_len);
-
-  /* Accessors.  */
-  const char *get_file_path () const { return m_file_path; }
+  bool read_line_num (size_t line_num, const char **line, ssize_t *line_len);
   unsigned get_use_count () const { return m_use_count; }
+  void inc_use_count () { m_use_count++; }
+  bool get_next_line (const char **line, ssize_t *line_len);
+  bool goto_next_line ();
   bool missing_trailing_newline_p () const
   {
     return m_missing_trailing_newline;
   }
+  bool unused () const { return !m_data_begin; }
+  virtual void reset ();
 
-  void inc_use_count () { m_use_count++; }
+protected:
+  cache_data_source ();
+  virtual ~cache_data_source ();
 
-  bool create (const file_cache::input_context &in_context,
-	       const char *file_path, FILE *fp, unsigned highest_use_count);
-  void evict ();
+  /* These pointers delimit the data that we are processing.  They are
+     maintained by the derived classes; we only ask for more by calling
+     get_more_data ().  That function should return TRUE if more data was
+     obtained.  Calling get_more_data () may invalidate these pointers
+     (e.g. by reallocating the data into a larger buffer).  */
+  const char *m_data_begin;
+  const char *m_data_end;
+  virtual bool get_more_data () = 0;
 
- private:
-  /* These are information used to store a line boundary.  */
+  /* This is to be called by the derived classes when this object is
+     being activated.  */
+  void on_create (unsigned int use_count, size_t total_lines)
+  {
+    m_use_count = use_count;
+    m_total_lines = total_lines;
+  }
+
+private:
+  /* Non-copyable.  */
+  cache_data_source (const cache_data_source &) = delete;
+  cache_data_source& operator= (const cache_data_source &) = delete;
+
+  /* The number of times this data has been accessed.  This is used to designate
+     which entry to evict from the cache array when needed.  */
+  unsigned m_use_count;
+
+  /* Could this file be missing a trailing newline on its final line?
+     Initially true (to cope with empty files), set to true/false
+     as each line is read.  */
+  bool m_missing_trailing_newline;
+
+  /* This is the total number of lines in the current data.  At the
+     moment, we try to get this information from the line map
+     subsystem.  Note that this is just a hint.  When using the C++
+     front-end, this hint is correct because the input file is then
+     completely tokenized before parsing starts; so the line map knows
+     the number of lines before compilation really starts.  For e.g,
+     the C front-end, it can happen that we start emitting diagnostics
+     before the line map has seen the end of the file.  */
+  size_t m_total_lines;
+
+  /* The number of the previous line read.  This starts at 1.  Zero
+     means we've read no line so far.  */
+  size_t m_line_num;
+
+  /* The index of the beginning of the current line.  */
+  size_t m_line_start_idx;
+
+  /* This is the information used to store a line boundary.  Here and below,
+     we always store byte offsets, not pointers, since the underlying buffer may
+     be reallocated by the derived implementation unbeknownst to us after
+     calling get_more_data ().  */
   class line_info
   {
   public:
@@ -83,13 +143,12 @@ public:
     size_t line_num;
 
     /* The position (byte count) of the beginning of the line,
-       relative to the file data pointer.  This starts at zero.  */
+       relative to M_DATA_BEGIN.  This starts at zero.  */
     size_t start_pos;
 
-    /* The position (byte count) of the last byte of the line.  This
-       normally points to the '\n' character, or to one byte after the
-       last byte of the file, if the file doesn't contain a '\n'
-       character.  */
+    /* The position (byte count) of the last byte of the line.  This normally
+       points to the '\n' character, or to M_DATA_END, if the data doesn't end
+       with a '\n' character.  */
     size_t end_pos;
 
     line_info (size_t l, size_t s, size_t e)
@@ -97,91 +156,76 @@ public:
     {}
 
     line_info ()
-      :line_num (0), start_pos (0), end_pos (0)
+      : line_num (0), start_pos (0), end_pos (0)
     {}
   };
 
-  bool needs_read_p () const;
-  bool needs_grow_p () const;
-  void maybe_grow ();
-  bool read_data ();
-  bool maybe_read_data ();
-  bool get_next_line (char **line, ssize_t *line_len);
-  bool read_next_line (char ** line, ssize_t *line_len);
-  bool goto_next_line ();
-
-  static const size_t buffer_size = 4 * 1024;
-  static const size_t line_record_size = 100;
-
-  /* The number of time this file has been accessed.  This is used
-     to designate which file cache to evict from the cache
-     array.  */
-  unsigned m_use_count;
-
-  /* The file_path is the key for identifying a particular file in
-     the cache.
-     For libcpp-using code, the underlying buffer for this field is
-     owned by the corresponding _cpp_file within the cpp_reader.  */
-  const char *m_file_path;
-
-  FILE *m_fp;
-
-  /* This points to the content of the file that we've read so
-     far.  */
-  char *m_data;
-
-  /* The allocated buffer to be freed may start a little earlier than DATA,
-     e.g. if a UTF8 BOM was skipped at the beginning.  */
-  int m_alloc_offset;
-
-  /*  The size of the DATA array above.*/
-  size_t m_size;
-
-  /* The number of bytes read from the underlying file so far.  This
-     must be less (or equal) than SIZE above.  */
-  size_t m_nb_read;
-
-  /* The index of the beginning of the current line.  */
-  size_t m_line_start_idx;
-
-  /* The number of the previous line read.  This starts at 1.  Zero
-     means we've read no line so far.  */
-  size_t m_line_num;
-
-  /* This is the total number of lines of the current file.  At the
-     moment, we try to get this information from the line map
-     subsystem.  Note that this is just a hint.  When using the C++
-     front-end, this hint is correct because the input file is then
-     completely tokenized before parsing starts; so the line map knows
-     the number of lines before compilation really starts.  For e.g,
-     the C front-end, it can happen that we start emitting diagnostics
-     before the line map has seen the end of the file.  */
-  size_t m_total_lines;
-
-  /* Could this file be missing a trailing newline on its final line?
-     Initially true (to cope with empty files), set to true/false
-     as each line is read.  */
-  bool m_missing_trailing_newline;
-
   /* This is a record of the beginning and end of the lines we've seen
      while reading the file.  This is useful to avoid walking the data
      from the beginning when we are asked to read a line that is
-     before LINE_START_IDX above.  Note that the maximum size of this
+     before M_LINE_START_IDX.  Note that the maximum size of this
      record is line_record_size, so that the memory consumption
      doesn't explode.  We thus scale total_lines down to
      line_record_size.  */
   vec<line_info, va_heap> m_line_record;
+  static const size_t line_record_size = 100;
+};
 
-  void offset_buffer (int offset)
+/* This is the implementation of cache_data_source for ordinary
+   source files.  */
+class file_cache_slot final : public cache_data_source
+{
+
+public:
+  file_cache_slot ();
+  ~file_cache_slot ();
+
+  const char *get_file_path () const { return m_file_path; }
+  bool create (const file_cache::input_context &in_context,
+	       const char *file_path, FILE *fp, unsigned highest_use_count);
+  void reset () override;
+
+protected:
+  bool get_more_data () override;
+
+private:
+  /* The file_path is the key for identifying a particular file in the cache.
+     For libcpp-using code, the underlying buffer for this field is owned by the
+     corresponding _cpp_file within the cpp_reader.  */
+  const char *m_file_path;
+
+  FILE *m_fp;
+
+  /* The base class M_DATA_BEGIN and M_DATA_END delimit the bytes that are ready
+     to process.  These two pointers here track a growable memory buffer, owned
+     by this object, where we store data as we read it from the file; we arrange
+     for the base class pointers to point to the right place within this
+     buffer.  */
+  char *m_buf_begin;
+  char *m_buf_end;
+  void maybe_grow ();
+};
+
+/* This is the implementation of cache_data_source for generated
+   data that is already in memory.  */
+class data_cache_slot final : public cache_data_source
+{
+public:
+  void create (const char *data, unsigned int data_len,
+	       unsigned int highest_use_count);
+  bool represents_data (const char *data, unsigned int) const
   {
-    gcc_assert (offset < 0 ? m_alloc_offset + offset >= 0
-		: (size_t) offset <= m_size);
-    gcc_assert (m_data);
-    m_alloc_offset += offset;
-    m_data += offset;
-    m_size -= offset;
+    /* We can just use pointer equality here since the generated data lives in
+       memory in one persistent place.  It isn't anticipated there would be
+       several generated data buffers with the same content, so we don't mind
+       that in such a case we will store it twice.  */
+    return m_data_begin == data;
   }
 
+protected:
+  /* In contrast to file_cache_slot, we do not own a buffer.  The buffer
+     passed to create() needs to outlive this object.  */
+  bool get_more_data () override { return false; }
 };
 
 /* Current position in real source file.  */
@@ -282,6 +326,8 @@ expand_location_1 (location_t loc,
   xloc.data = block;
   if (loc <= BUILTINS_LOCATION)
     xloc.file = loc == UNKNOWN_LOCATION ? NULL : special_fname_builtin ();
+  else if (xloc.generated_data_len)
+    xloc.file = special_fname_generated ();
 
   return xloc;
 }
@@ -316,11 +362,12 @@ diagnostic_file_cache_fini (void)
    equals the actual number of lines of the file.  */
 
 static size_t
-total_lines_num (const char *file_path)
+total_lines_num (const char *fname_or_data, bool is_data)
 {
   size_t r = 0;
   location_t l = 0;
-  if (linemap_get_file_highest_location (line_table, file_path, &l))
+  if (linemap_get_file_highest_location (line_table, fname_or_data,
+					 is_data, &l))
     {
       gcc_assert (l >= RESERVED_LOCATION_COUNT);
       expanded_location xloc = expand_location (l);
@@ -356,6 +403,21 @@ file_cache::lookup_file (const char *file_path)
   return r;
 }
 
+data_cache_slot *
+file_cache::lookup_data (const char *data, unsigned int data_len)
+{
+  for (unsigned int i = 0; i != num_file_slots; ++i)
+    {
+      const auto slot = m_data_slots + i;
+      if (slot->represents_data (data, data_len))
+	{
+	  slot->inc_use_count ();
+	  return slot;
+	}
+    }
+  return nullptr;
+}
+
 /* Purge any mention of FILENAME from the cache of files used for
    printing source code.  For use in selftests when working
    with tempfiles.  */
@@ -371,6 +433,15 @@ diagnostics_file_cache_forcibly_evict_file (const char *file_path)
   global_dc->m_file_cache->forcibly_evict_file (file_path);
 }
 
+void
+diagnostics_file_cache_forcibly_evict_data (const char *data,
+					    unsigned int data_len)
+{
+  if (!global_dc->m_file_cache)
+    return;
+  global_dc->m_file_cache->forcibly_evict_data (data, data_len);
+}
+
 void
 file_cache::forcibly_evict_file (const char *file_path)
 {
@@ -381,55 +452,39 @@ file_cache::forcibly_evict_file (const char *file_path)
     /* Not found.  */
     return;
 
-  r->evict ();
+  r->reset ();
 }
 
 void
-file_cache_slot::evict ()
+file_cache::forcibly_evict_data (const char *data, unsigned int data_len)
 {
-  m_file_path = NULL;
-  if (m_fp)
-    fclose (m_fp);
-  m_fp = NULL;
-  m_nb_read = 0;
-  m_line_start_idx = 0;
-  m_line_num = 0;
-  m_line_record.truncate (0);
-  m_use_count = 0;
-  m_total_lines = 0;
-  m_missing_trailing_newline = true;
+  if (auto r = lookup_data (data, data_len))
+    r->reset ();
 }
 
-/* Return the file cache that has been less used, recently, or the
+/* Return the cache slot that has been used the least, or the
    first empty one.  If HIGHEST_USE_COUNT is non-null,
    *HIGHEST_USE_COUNT is set to the highest use count of the entries
    in the cache table.  */
 
-file_cache_slot*
-file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
+template <class Slot>
+Slot *
+file_cache::evicted_cache_tab_entry (Slot *slots,
+				     unsigned int *highest_use_count)
 {
-  diagnostic_file_cache_init ();
-
-  file_cache_slot *to_evict = &m_file_slots[0];
+  auto to_evict = &slots[0];
   unsigned huc = to_evict->get_use_count ();
   for (unsigned i = 1; i < num_file_slots; ++i)
     {
-      file_cache_slot *c = &m_file_slots[i];
-      bool c_is_empty = (c->get_file_path () == NULL);
-
+      auto c = &slots[i];
       if (c->get_use_count () < to_evict->get_use_count ()
-	  || (to_evict->get_file_path () && c_is_empty))
+	  || (!to_evict->unused () && c->unused ()))
 	/* We evict C because it's either an entry with a lower use
 	   count or one that is empty.  */
 	to_evict = c;
 
       if (huc < c->get_use_count ())
 	huc = c->get_use_count ();
-
-      if (c_is_empty)
-	/* We've reached the end of the cache; subsequent elements are
-	   all empty.  */
-	break;
     }
 
   if (highest_use_count)
@@ -453,12 +508,23 @@ file_cache::add_file (const char *file_path)
     return NULL;
 
   unsigned highest_use_count = 0;
-  file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
+  file_cache_slot *r = evicted_cache_tab_entry (m_file_slots,
+						&highest_use_count);
   if (!r->create (in_context, file_path, fp, highest_use_count))
     return NULL;
   return r;
 }
 
+data_cache_slot *
+file_cache::add_data (const char *data, unsigned int data_len)
+{
+  unsigned int highest_use_count = 0;
+  data_cache_slot *r = evicted_cache_tab_entry (m_data_slots,
+						&highest_use_count);
+  r->create (data, data_len, highest_use_count);
+  return r;
+}
+
 /* Populate this slot for use on FILE_PATH and FP, dropping any
    existing cached content within it.  */
 
@@ -467,22 +533,12 @@ file_cache_slot::create (const file_cache::input_context &in_context,
 			 const char *file_path, FILE *fp,
 			 unsigned highest_use_count)
 {
+  reset ();
+  on_create (highest_use_count + 1, total_lines_num (file_path, false));
+  m_data_begin = m_buf_begin;
+  m_data_end = m_buf_begin;
   m_file_path = file_path;
-  if (m_fp)
-    fclose (m_fp);
   m_fp = fp;
-  if (m_alloc_offset)
-    offset_buffer (-m_alloc_offset);
-  m_nb_read = 0;
-  m_line_start_idx = 0;
-  m_line_num = 0;
-  m_line_record.truncate (0);
-  /* Ensure that this cache entry doesn't get evicted next time
-     add_file_to_cache_tab is called.  */
-  m_use_count = ++highest_use_count;
-  m_total_lines = total_lines_num (file_path);
-  m_missing_trailing_newline = true;
-
 
   /* Check the input configuration to determine if we need to do any
      transformations, such as charset conversion or BOM skipping.  */
@@ -495,29 +551,37 @@ file_cache_slot::create (const file_cache::input_context &in_context,
 	= cpp_get_converted_source (file_path, input_charset);
       if (!cs.data)
 	return false;
-      if (m_data)
-	XDELETEVEC (m_data);
-      m_data = cs.data;
-      m_nb_read = m_size = cs.len;
-      m_alloc_offset = cs.data - cs.to_free;
+      XDELETEVEC (m_buf_begin);
+      m_buf_begin = cs.to_free;
+      m_buf_end = cs.data + cs.len;
+      m_data_begin = cs.data;
+      m_data_end = m_buf_end;
     }
-  else if (in_context.should_skip_bom)
+  else if (in_context.should_skip_bom && get_more_data ())
     {
-      if (read_data ())
-	{
-	  const int offset = cpp_check_utf8_bom (m_data, m_nb_read);
-	  offset_buffer (offset);
-	  m_nb_read -= offset;
-	}
+      const int offset = cpp_check_utf8_bom (m_data_begin,
+					     m_data_end - m_data_begin);
+      m_data_begin += offset;
     }
 
   return true;
 }
 
+void
+data_cache_slot::create (const char *data, unsigned int data_len,
+			 unsigned int highest_use_count)
+{
+  reset ();
+  on_create (highest_use_count + 1, total_lines_num (data, true));
+  m_data_begin = data;
+  m_data_end = data + data_len;
+}
+
 /* file_cache's ctor.  */
 
 file_cache::file_cache ()
-: m_file_slots (new file_cache_slot[num_file_slots])
+  : m_file_slots (new file_cache_slot[num_file_slots]),
+    m_data_slots (new data_cache_slot[num_file_slots])
 {
   initialize_input_context (nullptr, false);
 }
@@ -526,6 +590,7 @@ file_cache::file_cache ()
 
 file_cache::~file_cache ()
 {
+  delete[] m_data_slots;
   delete[] m_file_slots;
 }
 
@@ -543,55 +608,69 @@ file_cache::lookup_or_add_file (const char *file_path)
   return r;
 }
 
-/* Default constructor for a cache of file used by caret
-   diagnostic.  */
+data_cache_slot *
+file_cache::lookup_or_add_data (const char *data, unsigned int data_len)
+{
+  data_cache_slot *r = lookup_data (data, data_len);
+  if (!r)
+    r = add_data (data, data_len);
+  return r;
+}
 
-file_cache_slot::file_cache_slot ()
-: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
-  m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
-  m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
+cache_data_source::cache_data_source ()
+: m_data_begin (nullptr), m_data_end (nullptr),
+  m_use_count (0),
+  m_missing_trailing_newline (true),
+  m_total_lines (0),
+  m_line_num (0),
+  m_line_start_idx (0)
 {
   m_line_record.create (0);
 }
 
-/* Destructor for a cache of file used by caret diagnostic.  */
-
-file_cache_slot::~file_cache_slot ()
+cache_data_source::~cache_data_source ()
 {
-  if (m_fp)
-    {
-      fclose (m_fp);
-      m_fp = NULL;
-    }
-  if (m_data)
-    {
-      offset_buffer (-m_alloc_offset);
-      XDELETEVEC (m_data);
-      m_data = 0;
-    }
   m_line_record.release ();
 }
 
-/* Returns TRUE iff the cache would need to be filled with data coming
-   from the file.  That is, either the cache is empty or full or the
-   current line is empty.  Note that if the cache is full, it would
-   need to be extended and filled again.  */
-
-bool
-file_cache_slot::needs_read_p () const
+void
+cache_data_source::reset ()
 {
-  return m_fp && (m_nb_read == 0
-	  || m_nb_read == m_size
-	  || (m_line_start_idx >= m_nb_read - 1));
+  m_data_begin = nullptr;
+  m_data_end = nullptr;
+  m_use_count = 0;
+  m_missing_trailing_newline = true;
+  m_total_lines = 0;
+  m_line_num = 0;
+  m_line_start_idx = 0;
+  m_line_record.truncate (0);
 }
 
-/*  Return TRUE iff the cache is full and thus needs to be
-    extended.  */
+file_cache_slot::file_cache_slot ()
+: m_file_path (nullptr), m_fp (nullptr),
+  m_buf_begin (nullptr), m_buf_end (nullptr)
+{}
 
-bool
-file_cache_slot::needs_grow_p () const
+file_cache_slot::~file_cache_slot ()
 {
-  return m_nb_read == m_size;
+  if (m_fp)
+    fclose (m_fp);
+  XDELETEVEC (m_buf_begin);
+}
+
+void
+file_cache_slot::reset ()
+{
+  cache_data_source::reset ();
+  m_file_path = NULL;
+  if (m_fp)
+    {
+      fclose (m_fp);
+      m_fp = NULL;
+    }
+
+  /* Do not free the buffer here; we intend to reuse it the next time this
+     slot is activated.  */
 }
 
 /* Grow the cache if it needs to be extended.  */
@@ -599,22 +678,23 @@ file_cache_slot::needs_grow_p () const
 void
 file_cache_slot::maybe_grow ()
 {
-  if (!needs_grow_p ())
-    return;
-
-  if (!m_data)
+  if (!m_buf_begin)
     {
-      gcc_assert (m_size == 0 && m_alloc_offset == 0);
-      m_size = buffer_size;
-      m_data = XNEWVEC (char, m_size);
+      const size_t buffer_size = 4 * 1024;
+      m_buf_begin = XNEWVEC (char, buffer_size);
+      m_buf_end = m_buf_begin + buffer_size;
+      m_data_begin = m_buf_begin;
+      m_data_end = m_data_begin;
     }
-  else
+  else if (m_data_end == m_buf_end)
     {
-      const int offset = m_alloc_offset;
-      offset_buffer (-offset);
-      m_size *= 2;
-      m_data = XRESIZEVEC (char, m_data, m_size);
-      offset_buffer (offset);
+      const auto new_size = 2 * (m_buf_end - m_buf_begin);
+      const auto data_offset = m_data_begin - m_buf_begin;
+      const auto data_size = m_data_end - m_data_begin;
+      m_buf_begin = XRESIZEVEC (char, m_buf_begin, new_size);
+      m_buf_end = m_buf_begin + new_size;
+      m_data_begin = m_buf_begin + data_offset;
+      m_data_end = m_data_begin + data_size;
     }
 }
 
@@ -622,45 +702,28 @@ file_cache_slot::maybe_grow ()
     Returns TRUE iff new data could be read.  */
 
 bool
-file_cache_slot::read_data ()
+file_cache_slot::get_more_data ()
 {
-  if (feof (m_fp) || ferror (m_fp))
+  if (!m_fp || feof (m_fp) || ferror (m_fp))
     return false;
-
   maybe_grow ();
-
-  char * from = m_data + m_nb_read;
-  size_t to_read = m_size - m_nb_read;
-  size_t nb_read = fread (from, 1, to_read, m_fp);
-
-  if (ferror (m_fp))
-    return false;
-
-  m_nb_read += nb_read;
-  return !!nb_read;
-}
-
-/* Read new data iff the cache needs to be filled with more data
-   coming from the file FP.  Return TRUE iff the cache was filled with
-   mode data.  */
-
-bool
-file_cache_slot::maybe_read_data ()
-{
-  if (!needs_read_p ())
+  char *const dest = m_buf_begin + (m_data_end - m_buf_begin);
+  const auto nb_read = fread (dest, 1, m_buf_end - dest, m_fp);
+  if (ferror (m_fp) || !nb_read)
     return false;
-  return read_data ();
+  m_data_end += nb_read;
+  return true;
 }
 
-/* Helper function for file_cache_slot::get_next_line (), to find the end of
+/* Helper function for cache_data_source::get_next_line (), to find the end of
    the next line.  Returns with the memchr convention, i.e. nullptr if a line
    terminator was not found.  We need to determine line endings in the same
    manner that libcpp does: any of \n, \r\n, or \r is a line ending.  */
 
-static char *
-find_end_of_line (char *s, size_t len)
+static const char *
+find_end_of_line (const char *s, const char *end)
 {
-  for (const auto end = s + len; s != end; ++s)
+  for (; s != end; ++s)
     {
       if (*s == '\n')
 	return s;
@@ -683,41 +746,38 @@ find_end_of_line (char *s, size_t len)
   return nullptr;
 }
 
-/* Read a new line from file FP, using C as a cache for the data
-   coming from the file.  Upon successful completion, *LINE is set to
-   the beginning of the line found.  *LINE points directly in the
-   line cache and is only valid until the next call of get_next_line.
-   *LINE_LEN is set to the length of the line.  Note that the line
-   does not contain any terminal delimiter.  This function returns
-   true if some data was read or process from the cache, false
-   otherwise.  Note that subsequent calls to get_next_line might
-   make the content of *LINE invalid.  */
+/* Read a new line from the data source.  Upon successful completion, *LINE is
+   set to the beginning of the line found.  *LINE points directly in the line
+   cache and is only valid until the next call of get_next_line.  *LINE_LEN is
+   set to the length of the line.  Note that the line does not contain any
+   terminal delimiter.  This function returns true if some data was read or
+   processed from the cache, false otherwise.  Note that subsequent calls to
+   get_next_line might make the content of *LINE invalid.  */
 
 bool
-file_cache_slot::get_next_line (char **line, ssize_t *line_len)
+cache_data_source::get_next_line (const char **line, ssize_t *line_len)
 {
-  /* Fill the cache with data to process.  */
-  maybe_read_data ();
+  const char *line_start = m_data_begin + m_line_start_idx;
 
-  size_t remaining_size = m_nb_read - m_line_start_idx;
-  if (remaining_size == 0)
-    /* There is no more data to process.  */
-    return false;
+  /* Check if we are all done reading the file.  */
+  if (line_start == m_data_end)
+    {
+      if (!get_more_data ())
+	return false;
+      line_start = m_data_begin + m_line_start_idx;
+    }
 
-  char *line_start = m_data + m_line_start_idx;
-
-  char *next_line_start = NULL;
-  size_t len = 0;
-  char *line_end = find_end_of_line (line_start, remaining_size);
+  /* Find the end of the current line.  */
+  const char *next_line_start = NULL;
+  const char *line_end = find_end_of_line (line_start, m_data_end);
   if (line_end == NULL)
     {
       /* We haven't found an end-of-line delimiter in the cache.
 	 Fill the cache with more data from the file and look again.  */
-      while (maybe_read_data ())
+      while (get_more_data ())
 	{
-	  line_start = m_data + m_line_start_idx;
-	  remaining_size = m_nb_read - m_line_start_idx;
-	  line_end = find_end_of_line (line_start, remaining_size);
+	  line_start = m_data_begin + m_line_start_idx;
+	  line_end = find_end_of_line (line_start, m_data_end);
 	  if (line_end != NULL)
 	    {
 	      next_line_start = line_end + 1;
@@ -734,8 +794,8 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 
 	     If the file ends in a \r, we didn't identify it as a line
 	     terminator above, so do that now instead.  */
-	  line_end = m_data + m_nb_read;
-	  if (m_nb_read && line_end[-1] == '\r')
+	  line_end = m_data_end;
+	  if (line_end != m_data_begin && line_end[-1] == '\r')
 	    {
 	      --line_end;
 	      m_missing_trailing_newline = false;
@@ -752,18 +812,11 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
       m_missing_trailing_newline = false;
     }
 
-  if (m_fp && ferror (m_fp))
-    return false;
-
   /* At this point, we've found the end of the of line.  It either points to
      the line terminator or to one byte after the last byte of the file.  */
-  gcc_assert (line_end != NULL);
-
-  len = line_end - line_start;
-
-  if (m_line_start_idx < m_nb_read)
-    *line = line_start;
-
+  const auto len = line_end - line_start;
+  *line = line_start;
+  *line_len = len;
   ++m_line_num;
 
   /* Before we update our line record, make sure the hint about the
@@ -785,7 +838,7 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 	m_line_record.safe_push
 	  (file_cache_slot::line_info (m_line_num,
 				       m_line_start_idx,
-				       line_end - m_data));
+				       line_end - m_data_begin));
       else if (m_total_lines > line_record_size)
 	{
 	  /* ... otherwise, we just scale total_lines down to
@@ -796,23 +849,14 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
 	    m_line_record.safe_push
 	      (file_cache_slot::line_info (m_line_num,
 					   m_line_start_idx,
-					   line_end - m_data));
+					   line_end - m_data_begin));
 	}
     }
 
   /* Update m_line_start_idx so that it points to the next line to be
      read.  */
-  if (next_line_start)
-    m_line_start_idx = next_line_start - m_data;
-  else
-    /* We didn't find any terminal '\n'.  Let's consider that the end
-       of line is the end of the data in the cache.  The next
-       invocation of get_next_line will either read more data from the
-       underlying file or return false early because we've reached the
-       end of the file.  */
-    m_line_start_idx = m_nb_read;
-
-  *line_len = len;
+  m_line_start_idx
+    = (next_line_start ? next_line_start : m_data_end) - m_data_begin;
 
   return true;
 }
@@ -824,15 +868,15 @@ file_cache_slot::get_next_line (char **line, ssize_t *line_len)
    completion.  */
 
 bool
-file_cache_slot::goto_next_line ()
+cache_data_source::goto_next_line ()
 {
-  char *l;
+  const char *l;
   ssize_t len;
 
   return get_next_line (&l, &len);
 }
 
-/* Read an arbitrary line number LINE_NUM from the file cached in C.
+/* Read an arbitrary line number LINE_NUM from the data cache.
    If the line was read successfully, *LINE points to the beginning
    of the line in the file cache and *LINE_LEN is the length of the
    line.  *LINE is not nul-terminated, but may contain zero bytes.
@@ -840,8 +884,8 @@ file_cache_slot::goto_next_line ()
    This function returns bool if a line was read.  */
 
 bool
-file_cache_slot::read_line_num (size_t line_num,
-		       char ** line, ssize_t *line_len)
+cache_data_source::read_line_num (size_t line_num,
+				  const char ** line, ssize_t *line_len)
 {
   gcc_assert (line_num > 0);
 
@@ -849,7 +893,7 @@ file_cache_slot::read_line_num (size_t line_num,
     {
       /* We've been asked to read lines that are before m_line_num.
 	 So lets use our line record (if it's not empty) to try to
-	 avoid re-reading the file from the beginning again.  */
+	 avoid re-scanning the data from the beginning again.  */
 
       if (m_line_record.is_empty ())
 	{
@@ -858,7 +902,7 @@ file_cache_slot::read_line_num (size_t line_num,
 	}
       else
 	{
-	  file_cache_slot::line_info *i = NULL;
+	  line_info *i = NULL;
 	  if (m_total_lines <= line_record_size)
 	    {
 	      /* In languages where the input file is not totally
@@ -894,7 +938,7 @@ file_cache_slot::read_line_num (size_t line_num,
 	  if (i && i->line_num == line_num)
 	    {
 	      /* We have the start/end of the line.  */
-	      *line = m_data + i->start_pos;
+	      *line = m_data_begin + i->start_pos;
 	      *line_len = i->end_pos - i->start_pos;
 	      return true;
 	    }
@@ -931,30 +975,56 @@ file_cache_slot::read_line_num (size_t line_num,
    If the function fails, a NULL char_span is returned.  */
 
 char_span
-location_get_source_line (const char *file_path, int line)
+location_get_source_line (expanded_location xloc, int line)
 {
-  char *buffer = NULL;
-  ssize_t len;
-
+  const char_span fail (nullptr, 0);
   if (line == 0)
-    return char_span (NULL, 0);
-
-  if (file_path == NULL)
-    return char_span (NULL, 0);
+    return fail;
 
   diagnostic_file_cache_init ();
 
-  file_cache_slot *c = global_dc->m_file_cache->lookup_or_add_file (file_path);
-  if (c == NULL)
-    return char_span (NULL, 0);
+  cache_data_source *c;
+  if (xloc.generated_data_len)
+    {
+      if (!xloc.generated_data)
+	return fail;
+      c = global_dc->m_file_cache->lookup_or_add_data (xloc.generated_data,
+						       xloc.generated_data_len);
+    }
+  else
+    {
+      if (!xloc.file)
+	return fail;
+      c = global_dc->m_file_cache->lookup_or_add_file (xloc.file);
+    }
 
+  if (!c)
+    return fail;
+
+  const char *buffer = NULL;
+  ssize_t len;
   bool read = c->read_line_num (line, &buffer, &len);
   if (!read)
-    return char_span (NULL, 0);
+    return fail;
 
   return char_span (buffer, len);
 }
 
+char_span
+location_get_source_line (expanded_location xloc)
+{
+  return location_get_source_line (xloc, xloc.line);
+}
+
+char_span
+location_get_source_line (const char *file_path, int line)
+{
+  expanded_location xloc = {};
+  xloc.file = file_path;
+  xloc.line = line;
+  return location_get_source_line (xloc);
+}
+
 /* Return a NUL-terminated copy of the source text between two locations, or
    NULL if the arguments are invalid.  The caller is responsible for freeing
    the return value.  */
@@ -971,8 +1041,18 @@ get_source_text_between (location_t start, location_t end)
      start, give up and return nothing.  */
   if (!expstart.file || !expend.file)
     return NULL;
-  if (strcmp (expstart.file, expend.file) != 0)
+  if (expstart.generated_data_len != expend.generated_data_len)
     return NULL;
+  if (expstart.generated_data_len)
+    {
+      if (expstart.generated_data != expend.generated_data)
+	return NULL;
+    }
+  else
+    {
+      if (strcmp (expstart.file, expend.file) != 0)
+	return NULL;
+    }
   if (expstart.line > expend.line)
     return NULL;
   if (expstart.line == expend.line
@@ -1202,9 +1282,10 @@ int
 location_compute_display_column (expanded_location exploc,
 				 const cpp_char_column_policy &policy)
 {
-  if (!(exploc.file && *exploc.file && exploc.line && exploc.column))
+  if (!(exploc.file && (exploc.generated_data_len || *exploc.file)
+	&& exploc.line && exploc.column))
     return exploc.column;
-  char_span line = location_get_source_line (exploc.file, exploc.line);
+  char_span line = location_get_source_line (exploc);
   /* If line is NULL, this function returns exploc.column which is the
      desired fallback.  */
   return cpp_byte_column_to_display_column (line.get_buffer (), line.length (),
@@ -1364,7 +1445,19 @@ dump_location_info (FILE *stream)
       fprintf (stream, "ORDINARY MAP: %i\n", idx);
       dump_location_range (stream,
 			   MAP_START_LOCATION (map), end_location);
-      fprintf (stream, "  file: %s\n", ORDINARY_MAP_FILE_NAME (map));
+
+      if (ORDINARY_MAP_GENERATED_DATA_P (map))
+	{
+	  fprintf (stream, "  file: %s%s\n",
+		   ORDINARY_MAP_CONTAINING_FILE_NAME (line_table, map),
+		   special_fname_generated ());
+	  fprintf (stream, "  data: %.*s\n",
+		   (int) ORDINARY_MAP_GENERATED_DATA_LEN (map),
+		   ORDINARY_MAP_GENERATED_DATA (map));
+	}
+      else
+	fprintf (stream, "  file: %s\n", LINEMAP_FILE (map));
+
       fprintf (stream, "  starting at line: %i\n",
 	       ORDINARY_MAP_STARTING_LINE_NUMBER (map));
       fprintf (stream, "  column and range bits: %i\n",
@@ -1390,6 +1483,9 @@ dump_location_info (FILE *stream)
       case LC_ENTER_MACRO:
 	reason = "LC_RENAME_MACRO";
 	break;
+      case LC_GEN:
+	reason = "LC_GEN";
+	break;
       default:
 	reason = "Unknown";
       }
@@ -1419,13 +1515,14 @@ dump_location_info (FILE *stream)
 	    {
 	      /* Beginning of a new source line: draw the line.  */
 
-	      char_span line_text = location_get_source_line (exploc.file,
-							      exploc.line);
+	      char_span line_text = location_get_source_line (exploc);
 	      if (!line_text)
 		break;
 	      fprintf (stream,
-		       "%s:%3i|loc:%5i|%.*s\n",
-		       exploc.file, exploc.line,
+		       "%s%s:%3i|loc:%5i|%.*s\n",
+		       exploc.file,
+		       exploc.generated_data ? special_fname_generated () : "",
+		       exploc.line,
 		       loc,
 		       (int)line_text.length (), line_text.get_buffer ());
 
@@ -1740,14 +1837,17 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
       expanded_location finish
 	= expand_location_to_spelling_point (src_range.m_finish,
 					     LOCATION_ASPECT_FINISH);
-      if (start.file != finish.file)
+      if (start.generated_data_len != finish.generated_data_len
+	  || (start.generated_data_len
+	      ? start.generated_data != finish.generated_data
+	      : start.file != finish.file))
 	return "range endpoints are in different files";
       if (start.line != finish.line)
 	return "range endpoints are on different lines";
       if (start.column > finish.column)
 	return "range endpoints are reversed";
 
-      char_span line = location_get_source_line (start.file, start.line);
+      char_span line = location_get_source_line (start);
       if (!line)
 	return "unable to read source line";
 
@@ -1787,11 +1887,13 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
       /* Bulletproofing.  We ought to only have different ordinary maps
 	 for start vs finish due to line-length jumps.  */
       if (start_ord_map != final_ord_map
-	  && start_ord_map->to_file != final_ord_map->to_file)
+	  && !ORDINARY_MAPS_SAME_FILE_P (start_ord_map, final_ord_map))
 	return "start and finish are spelled in different ordinary maps";
       /* The file from linemap_resolve_location ought to match that from
 	 expand_location_to_spelling_point.  */
-      if (start_ord_map->to_file != start.file)
+      if (ORDINARY_MAP_GENERATED_DATA_P (start_ord_map)
+	  ? ORDINARY_MAP_GENERATED_DATA (start_ord_map) != start.generated_data
+	  : ORDINARY_MAP_FILE_NAME (start_ord_map) != start.file)
 	return "mismatching file after resolving linemap";
 
       location_t start_loc
@@ -1963,6 +2065,20 @@ get_num_source_ranges_for_substring (cpp_reader *pfile,
 
 /* Selftests of location handling.  */
 
+/* Wrapper around linemap_add to handle transparently adding either a tmp file,
+   or in-memory generated content.  */
+const line_map_ordinary *
+temp_source_file::do_linemap_add (int line)
+{
+  const line_map *map;
+  if (content_buf)
+    map = linemap_add (line_table, LC_GEN, false, content_buf,
+		       line, content_len);
+  else
+    map = linemap_add (line_table, LC_ENTER, false, get_filename (), line);
+  return linemap_check_ordinary (map);
+}
+
 /* Verify that compare() on linenum_type handles comparisons over the full
    range of the type.  */
 
@@ -2041,13 +2157,16 @@ assert_loceq (const char *exp_filename, int exp_linenum, int exp_colnum,
 class line_table_case
 {
 public:
-  line_table_case (int default_range_bits, int base_location)
+  line_table_case (int default_range_bits, int base_location,
+		   bool generated_data)
   : m_default_range_bits (default_range_bits),
-    m_base_location (base_location)
+    m_base_location (base_location),
+    m_generated_data (generated_data)
   {}
 
   int m_default_range_bits;
   int m_base_location;
+  bool m_generated_data;
 };
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2064,6 +2183,7 @@ line_table_test::line_table_test ()
   gcc_assert (saved_line_table->round_alloc_size);
   line_table->round_alloc_size = saved_line_table->round_alloc_size;
   line_table->default_range_bits = 0;
+  m_generated_data = false;
 }
 
 /* Constructor.  Store the old value of line_table, and create a new
@@ -2085,6 +2205,7 @@ line_table_test::line_table_test (const line_table_case &case_)
       line_table->highest_location = case_.m_base_location;
       line_table->highest_line = case_.m_base_location;
     }
+  m_generated_data = case_.m_generated_data;
 }
 
 /* Destructor.  Restore the old value of line_table.  */
@@ -2104,7 +2225,10 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   line_table_test ltt (case_);
 
   /* Build a simple linemap describing some locations. */
-  linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
+  if (ltt.m_generated_data)
+    linemap_add (line_table, LC_GEN, false, "some data", 0, 10);
+  else
+    linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
 
   linemap_line_start (line_table, 1, 100);
   location_t loc_a = linemap_position_for_column (line_table, 1);
@@ -2154,21 +2278,23 @@ test_accessing_ordinary_linemaps (const line_table_case &case_)
   linemap_add (line_table, LC_LEAVE, false, NULL, 0);
 
   /* Verify that we can recover the location info.  */
-  assert_loceq ("foo.c", 1, 1, loc_a);
-  assert_loceq ("foo.c", 1, 23, loc_b);
-  assert_loceq ("foo.c", 2, 1, loc_c);
-  assert_loceq ("foo.c", 2, 17, loc_d);
-  assert_loceq ("foo.c", 3, 700, loc_e);
-  assert_loceq ("foo.c", 4, 100, loc_back_to_short);
+  const auto fname
+    = (ltt.m_generated_data ? special_fname_generated () : "foo.c");
+  assert_loceq (fname, 1, 1, loc_a);
+  assert_loceq (fname, 1, 23, loc_b);
+  assert_loceq (fname, 2, 1, loc_c);
+  assert_loceq (fname, 2, 17, loc_d);
+  assert_loceq (fname, 3, 700, loc_e);
+  assert_loceq (fname, 4, 100, loc_back_to_short);
 
   /* In the very wide line, the initial location should be fully tracked.  */
-  assert_loceq ("foo.c", 5, 2000, loc_start_of_very_long_line);
+  assert_loceq (fname, 5, 2000, loc_start_of_very_long_line);
   /* ...but once we exceed LINE_MAP_MAX_COLUMN_NUMBER column-tracking should
      be disabled.  */
-  assert_loceq ("foo.c", 5, 0, loc_too_wide);
-  assert_loceq ("foo.c", 5, 0, loc_too_wide_2);
+  assert_loceq (fname, 5, 0, loc_too_wide);
+  assert_loceq (fname, 5, 0, loc_too_wide_2);
   /*...and column-tracking should be re-enabled for subsequent lines.  */
-  assert_loceq ("foo.c", 6, 10, loc_sane_again);
+  assert_loceq (fname, 6, 10, loc_sane_again);
 
   assert_loceq ("bar.c", 1, 150, loc_f);
 
@@ -2215,10 +2341,11 @@ test_make_location_nonpure_range_endpoints (const line_table_case &case_)
      with C++ frontend.
      ....................0000000001111111111222.
      ....................1234567890123456789012.  */
+  line_table_test ltt (case_);
   const char *content = "     r += !aaa == bbb;\n";
-  temp_source_file tmp (SELFTEST_LOCATION, ".C", content);
-  line_table_test ltt (case_);
-  linemap_add (line_table, LC_ENTER, false, tmp.get_filename (), 1);
+  temp_source_file tmp (SELFTEST_LOCATION, ".C", content, strlen (content),
+			ltt.m_generated_data);
+  tmp.do_linemap_add (1);
 
   const location_t c11 = linemap_position_for_column (line_table, 11);
   const location_t c12 = linemap_position_for_column (line_table, 12);
@@ -3875,7 +4002,8 @@ static const location_t boundary_locations[] = {
 /* Run TESTCASE multiple times, once for each case in our test matrix.  */
 
 void
-for_each_line_table_case (void (*testcase) (const line_table_case &))
+for_each_line_table_case (void (*testcase) (const line_table_case &),
+			  bool test_generated_data)
 {
   /* As noted above in the description of struct line_table_case,
      we want to explore a test matrix of interesting line_table
@@ -3894,16 +4022,19 @@ for_each_line_table_case (void (*testcase) (const line_table_case &))
       const int num_boundary_locations = ARRAY_SIZE (boundary_locations);
       for (int loc_idx = 0; loc_idx < num_boundary_locations; loc_idx++)
 	{
-	  line_table_case c (default_range_bits, boundary_locations[loc_idx]);
-
-	  testcase (c);
-
-	  num_cases_tested++;
+	  /* ...and try both normal files, and internally generated data.  */
+	  for (int gen = 0; gen != 1 + test_generated_data; ++gen)
+	    {
+	      line_table_case c (default_range_bits,
+				 boundary_locations[loc_idx], gen);
+	      testcase (c);
+	      num_cases_tested++;
+	    }
 	}
     }
 
   /* Verify that we fully covered the test matrix.  */
-  ASSERT_EQ (num_cases_tested, 2 * 12);
+  ASSERT_EQ (num_cases_tested, 2 * 12 * (1 + test_generated_data));
 }
 
 /* Verify that when presented with a consecutive pair of locations with
@@ -3914,7 +4045,7 @@ for_each_line_table_case (void (*testcase) (const line_table_case &))
 static void
 test_line_offset_overflow ()
 {
-  line_table_test ltt (line_table_case (5, 0));
+  line_table_test ltt (line_table_case (5, 0, false));
 
   linemap_add (line_table, LC_ENTER, false, "foo.c", 0);
   linemap_line_start (line_table, 1, 100);
@@ -4057,9 +4188,9 @@ input_cc_tests ()
   test_should_have_column_data_p ();
   test_unknown_location ();
   test_builtins ();
-  for_each_line_table_case (test_make_location_nonpure_range_endpoints);
+  for_each_line_table_case (test_make_location_nonpure_range_endpoints, true);
 
-  for_each_line_table_case (test_accessing_ordinary_linemaps);
+  for_each_line_table_case (test_accessing_ordinary_linemaps, true);
   for_each_line_table_case (test_lexer);
   for_each_line_table_case (test_lexer_string_locations_simple);
   for_each_line_table_case (test_lexer_string_locations_ebcdic);
diff --git a/gcc/input.h b/gcc/input.h
index 2173a39a773..b81c5a9e71f 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -34,6 +34,7 @@ extern GTY(()) class line_maps *saved_line_table;
 
 /* Returns the translated string referring to the special location.  */
 const char *special_fname_builtin ();
+const char *special_fname_generated ();
 
 /* line-map.cc reserves RESERVED_LOCATION_COUNT to the user.  Ensure
    both UNKNOWN_LOCATION and BUILTINS_LOCATION fit into that.  */
@@ -114,13 +115,20 @@ class char_span
 };
 
 extern char_span location_get_source_line (const char *file_path, int line);
+
+/* The version taking an exploc handles generated source too, and should be used
+   whenever possible.  */
+extern char_span location_get_source_line (expanded_location exploc);
+extern char_span location_get_source_line (expanded_location exploc, int line);
+
 extern char *get_source_text_between (location_t, location_t);
 
 extern bool location_missing_trailing_newline (const char *file_path);
 
-/* Forward decl of slot within file_cache, so that the definition doesn't
+/* Forward decl of slots within file_cache, so that the definition doesn't
    need to be in this header.  */
 class file_cache_slot;
+class data_cache_slot;
 
 /* A cache of source files for use when emitting diagnostics
    (and in a few places in the C/C++ frontends).
@@ -138,7 +146,9 @@ class file_cache
   ~file_cache ();
 
   file_cache_slot *lookup_or_add_file (const char *file_path);
+  data_cache_slot *lookup_or_add_data (const char *data, unsigned int data_len);
   void forcibly_evict_file (const char *file_path);
+  void forcibly_evict_data (const char *data, unsigned int data_len);
 
   /* See comments in diagnostic.h about the input conversion context.  */
   struct input_context
@@ -150,13 +160,17 @@ class file_cache
 				 bool should_skip_bom);
 
  private:
-  file_cache_slot *evicted_cache_tab_entry (unsigned *highest_use_count);
+  template <class Slot>
+  Slot *evicted_cache_tab_entry (Slot *slots, unsigned int *highest_use_count);
+
   file_cache_slot *add_file (const char *file_path);
+  data_cache_slot *add_data (const char *data, unsigned int data_len);
   file_cache_slot *lookup_file (const char *file_path);
+  data_cache_slot *lookup_data (const char *data, unsigned int data_len);
 
- private:
   static const size_t num_file_slots = 16;
   file_cache_slot *m_file_slots;
+  data_cache_slot *m_data_slots;
   input_context in_context;
 };
 
@@ -253,6 +267,8 @@ void dump_location_info (FILE *stream);
 void diagnostics_file_cache_fini (void);
 
 void diagnostics_file_cache_forcibly_evict_file (const char *file_path);
+void diagnostics_file_cache_forcibly_evict_data (const char *data,
+						 unsigned int data_len);
 
 class GTY(()) string_concat
 {
diff --git a/gcc/selftest.cc b/gcc/selftest.cc
index 89abfba5e80..072933c1031 100644
--- a/gcc/selftest.cc
+++ b/gcc/selftest.cc
@@ -163,14 +163,21 @@ assert_str_startswith (const location &loc,
 
 named_temp_file::named_temp_file (const char *suffix)
 {
-  m_filename = make_temp_file (suffix);
-  ASSERT_NE (m_filename, NULL);
+  if (suffix)
+    {
+      m_filename = make_temp_file (suffix);
+      ASSERT_NE (m_filename, NULL);
+    }
+  else
+    m_filename = nullptr;
 }
 
 /* Destructor.  Delete the tempfile.  */
 
 named_temp_file::~named_temp_file ()
 {
+  if (!m_filename)
+    return;
   unlink (m_filename);
   diagnostics_file_cache_forcibly_evict_file (m_filename);
   free (m_filename);
@@ -183,7 +190,9 @@ named_temp_file::~named_temp_file ()
 temp_source_file::temp_source_file (const location &loc,
 				    const char *suffix,
 				    const char *content)
-: named_temp_file (suffix)
+: named_temp_file (suffix),
+  content_buf (nullptr),
+  content_len (0)
 {
   FILE *out = fopen (get_filename (), "w");
   if (!out)
@@ -192,19 +201,41 @@ temp_source_file::temp_source_file (const location &loc,
   fclose (out);
 }
 
-/* As above, but with a size, to allow for NUL bytes in CONTENT.  */
+/* As above, but with a size, to allow for NUL bytes in CONTENT.  When
+   IS_GENERATED is true, the data is kept in memory instead, for testing LC_GEN
+   maps.  */
 
 temp_source_file::temp_source_file (const location &loc,
 				    const char *suffix,
 				    const char *content,
-				    size_t sz)
-: named_temp_file (suffix)
+				    size_t sz,
+				    bool is_generated)
+: named_temp_file (is_generated ? nullptr : suffix),
+  content_buf (is_generated ? XNEWVEC (char, sz) : nullptr),
+  content_len (is_generated ? sz : 0)
 {
-  FILE *out = fopen (get_filename (), "w");
-  if (!out)
-    fail_formatted (loc, "unable to open tempfile: %s", get_filename ());
-  fwrite (content, sz, 1, out);
-  fclose (out);
+  if (is_generated)
+    {
+      gcc_assert (sz); /* Empty generated content is not supported.  */
+      memcpy (content_buf, content, sz);
+    }
+  else
+    {
+      FILE *out = fopen (get_filename (), "w");
+      if (!out)
+	fail_formatted (loc, "unable to open tempfile: %s", get_filename ());
+      fwrite (content, sz, 1, out);
+      fclose (out);
+    }
+}
+
+temp_source_file::~temp_source_file ()
+{
+  if (content_buf)
+    {
+      diagnostics_file_cache_forcibly_evict_data (content_buf, content_len);
+      XDELETEVEC (content_buf);
+    }
 }
 
 /* Avoid introducing locale-specific differences in the results
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 7568a6d24d4..ab1c9025349 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #if CHECKING_P
 
+struct line_map_ordinary;
+
 namespace selftest {
 
 /* A struct describing the source-location of a selftest, to make it
@@ -96,10 +98,9 @@ extern void assert_str_startswith (const location &loc,
 class named_temp_file
 {
  public:
-  named_temp_file (const char *suffix);
+  explicit named_temp_file (const char *suffix);
   ~named_temp_file ();
   const char *get_filename () const { return m_filename; }
-
  private:
   char *m_filename;
 };
@@ -113,7 +114,13 @@ class temp_source_file : public named_temp_file
   temp_source_file (const location &loc, const char *suffix,
 		    const char *content);
   temp_source_file (const location &loc, const char *suffix,
-		    const char *content, size_t sz);
+		    const char *content, size_t sz,
+		    bool is_generated = false);
+  ~temp_source_file ();
+
+  char *const content_buf;
+  const size_t content_len;
+  const line_map_ordinary *do_linemap_add (int line); /* In input.cc */
 };
 
 /* RAII-style class for avoiding introducing locale-specific differences
@@ -171,6 +178,10 @@ class line_table_test
 
   /* Destructor.  Restore the saved line_table.  */
   ~line_table_test ();
+
+  /* When this is enabled in the line_table_case, test storing all the data
+     in memory rather than a file.  */
+  bool m_generated_data;
 };
 
 /* Helper function for selftests that need a function decl.  */
@@ -183,7 +194,8 @@ extern tree make_fndecl (tree return_type,
 /* Run TESTCASE multiple times, once for each case in our test matrix.  */
 
 extern void
-for_each_line_table_case (void (*testcase) (const line_table_case &));
+for_each_line_table_case (void (*testcase) (const line_table_case &),
+			  bool test_generated_data = false);
 
 /* Read the contents of PATH into memory, returning a 0-terminated buffer
    that must be freed by the caller.
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index baa6b629b83..29e653625f8 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -430,7 +430,7 @@ test_show_locus (function *fun)
      to upper case.  Give all of the ranges labels (sharing one label).  */
   if (0 == strcmp (fnname, "test_many_nested_locations"))
     {
-      const char *file = LOCATION_FILE (fnstart);
+      const expanded_location xloc = expand_location (fnstart);
       const int start_line = fnstart_line + 2;
       const int finish_line = start_line + 7;
       location_t loc = get_loc (start_line - 1, 2);
@@ -438,7 +438,7 @@ test_show_locus (function *fun)
       rich_location richloc (line_table, loc);
       for (int line = start_line; line <= finish_line; line++)
 	{
-	  char_span content = location_get_source_line (file, line);
+	  char_span content = location_get_source_line (xloc, line);
 	  gcc_assert (content);
 	  /* Split line up into words.  */
 	  for (int idx = 0; idx < content.length (); idx++)
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 9dc4363c65a..a29f4abc3cc 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -1162,7 +1162,7 @@ _cpp_do_file_change (cpp_reader *pfile, enum lc_reason reason,
 		     const char *to_file, linenum_type to_line,
 		     unsigned int sysp)
 {
-  linemap_assert (reason != LC_ENTER_MACRO);
+  linemap_assert (reason != LC_ENTER_MACRO && reason != LC_GEN);
 
   const line_map_ordinary *ord_map = NULL;
   if (!to_line && reason == LC_RENAME_VERBATIM)
@@ -1173,6 +1173,7 @@ _cpp_do_file_change (cpp_reader *pfile, enum lc_reason reason,
          preprocessed source.  */
       line_map_ordinary *last = LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table);
       if (!ORDINARY_MAP_STARTING_LINE_NUMBER (last)
+	  && !ORDINARY_MAP_GENERATED_DATA_P (last)
 	  && 0 == filename_cmp (to_file, ORDINARY_MAP_FILE_NAME (last))
 	  && SOURCE_LINE (last, pfile->line_table->highest_line) == 2)
 	{
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 50207cacc12..7bed55548c7 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -75,6 +75,8 @@ enum lc_reason
   LC_RENAME_VERBATIM,	/* Likewise, but "" != stdin.  */
   LC_ENTER_MACRO,	/* Begin macro expansion.  */
   LC_MODULE,		/* A (C++) Module.  */
+  LC_GEN,		/* Internally generated source.  */
+
   /* FIXME: add support for stringize and paste.  */
   LC_HWM /* High Water Mark.  */
 };
@@ -437,7 +439,13 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
 
   /* Pointer alignment boundary on both 32 and 64-bit systems.  */
 
-  const char *to_file;
+  /* For an LC_GEN map, DATA points to the actual content.  Otherwise it is
+     a file name.  In the former case, the data could contain embedded nulls
+     and it need not be null terminated, so we use the GTY markup appropriate
+     for that case.  */
+  const char * GTY((string_length ("%h.data_len"))) data;
+  unsigned int data_len;
+
   linenum_type to_line;
 
   /* Location from whence this line map was included.  For regular
@@ -662,6 +670,12 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
+/* TRUE if this line map contains generated data.  */
+inline bool ORDINARY_MAP_GENERATED_DATA_P (const line_map_ordinary *ord_map)
+{
+  return ord_map->reason == LC_GEN;
+}
+
 /* TRUE if this line map is for a module (not a source file).  */
 
 inline bool
@@ -671,14 +685,42 @@ MAP_MODULE_P (const line_map *map)
 	  && linemap_check_ordinary (map)->reason == LC_MODULE);
 }
 
-/* Get the filename of ordinary map MAP.  */
+/* Get the filename of ordinary map MAP, which must not be an LC_GEN map.  */
 
 inline const char *
 ORDINARY_MAP_FILE_NAME (const line_map_ordinary *ord_map)
 {
-  return ord_map->to_file;
+  linemap_assert (ord_map->reason != LC_GEN);
+  return ord_map->data;
 }
 
+inline const char *
+ORDINARY_MAP_GENERATED_DATA (const line_map_ordinary *ord_map)
+{
+  linemap_assert (ord_map->reason == LC_GEN);
+  return ord_map->data;
+}
+
+inline unsigned int
+ORDINARY_MAP_GENERATED_DATA_LEN (const line_map_ordinary *ord_map)
+{
+  linemap_assert (ord_map->reason == LC_GEN);
+  return ord_map->data_len;
+}
+
+/* Sometimes we don't need to care which kind it is.  */
+inline const char *
+ORDINARY_MAP_FILE_NAME_OR_DATA (const line_map_ordinary *ord_map)
+{
+  return ord_map->data;
+}
+
+/* If we just want to know whether two maps point to the same
+   file/buffer or not.  */
+bool
+ORDINARY_MAPS_SAME_FILE_P (const line_map_ordinary *map1,
+			   const line_map_ordinary *map2);
+
 /* Get the cpp macro whose expansion gave birth to macro map MAP.  */
 
 inline cpp_hashnode *
@@ -1097,17 +1139,19 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
    map that records locations of tokens that are not part of macro
    replacement-lists present at a macro expansion point.
 
-   The text pointed to by TO_FILE must have a lifetime
-   at least as long as the lifetime of SET.  An empty
-   TO_FILE means standard input.  If reason is LC_LEAVE, and
-   TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
-   natural values considering the file we are returning to.
+   The text pointed to by DATA must have a lifetime at least as long as the
+   lifetime of SET.  If reason is LC_LEAVE, and DATA is NULL, then DATA, TO_LINE
+   and SYSP are given their natural values considering the file we are returning
+   to.  If reason is LC_GEN, then DATA is the actual content, and DATA_LEN>0 is
+   the length of it.  Otherwise DATA is a file name and DATA_LEN need not be
+   specified.  If DATA_LEN is specified for a file name, it should be the length
+   of the file name, including the terminating null.
 
-   A call to this function can relocate the previous set of
-   maps, so any stored line_map pointers should not be used.  */
+   A call to this function can relocate the previous set of maps, so any stored
+   line_map pointers should not be used.  */
 extern const line_map *linemap_add
   (class line_maps *, enum lc_reason, unsigned int sysp,
-   const char *to_file, linenum_type to_line);
+   const char *data, linenum_type to_line, unsigned int data_len = 0);
 
 /* Create a macro map.  A macro map encodes source locations of tokens
    that are part of a macro replacement-list, at a macro expansion
@@ -1257,7 +1301,7 @@ linemap_position_for_loc_and_offset (class line_maps *set,
 inline const char *
 LINEMAP_FILE (const line_map_ordinary *ord_map)
 {
-  return ord_map->to_file;
+  return ORDINARY_MAP_FILE_NAME (ord_map);
 }
 
 /* Return the line number this map started encoding location from.  */
@@ -1277,6 +1321,13 @@ LINEMAP_SYSP (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
+/* For a normal ordinary map, this is the same as ORDINARY_MAP_FILE_NAME;
+   but for an LC_GEN map, it returns the file name from which the data
+   originated, instead of asserting.  */
+const char *
+ORDINARY_MAP_CONTAINING_FILE_NAME (line_maps *set,
+				   const line_map_ordinary *ord_map);
+
 const struct line_map *first_map_in_common (line_maps *set,
 					    location_t loc0,
 					    location_t loc1,
@@ -1316,6 +1367,11 @@ typedef struct
 
   /* In a system header?. */
   bool sysp;
+
+  /* If generated data, the data and its length.  The data may contain embedded
+   nulls and need not be null-terminated.  */
+  unsigned int generated_data_len;
+  const char *generated_data;
 } expanded_location;
 
 class range_label;
@@ -2104,12 +2160,14 @@ struct linemap_stats
   long adhoc_table_entries_used;
 };
 
-/* Return the highest location emitted for a given file for which
-   there is a line map in SET.  FILE_NAME is the file name to
-   consider.  If the function returns TRUE, *LOC is set to the highest
-   location emitted for that file.  */
+/* Return the highest location emitted for a given file or generated data buffer
+   for which there is a line map in SET.  If the function returns TRUE, *LOC is
+   set to the highest location emitted for that file.  The const char* arg is
+   either a file name or a generated data buffer, as indicated by
+   IS_DATA.  */
 bool linemap_get_file_highest_location (class line_maps * set,
-					const char *file_name,
+					const char *fname_or_data,
+					bool is_data,
 					location_t *loc);
 
 /* Compute and return statistics about the memory consumption of some
diff --git a/libcpp/line-map.cc b/libcpp/line-map.cc
index 50e8043255e..937b4551122 100644
--- a/libcpp/line-map.cc
+++ b/libcpp/line-map.cc
@@ -48,6 +48,35 @@ static location_t linemap_macro_loc_to_exp_point (line_maps *,
 extern unsigned num_expanded_macros_counter;
 extern unsigned num_macro_tokens_counter;
 
+/* For a normal ordinary map, this is the same as ORDINARY_MAP_FILE_NAME;
+   but for an LC_GEN map, it returns the file name from which the data
+   originated, instead of asserting.  */
+const char *
+ORDINARY_MAP_CONTAINING_FILE_NAME (line_maps *set,
+				   const line_map_ordinary *ord_map)
+{
+  while (ORDINARY_MAP_GENERATED_DATA_P (ord_map))
+    {
+      ord_map = linemap_included_from_linemap (set, ord_map);
+      if (!ord_map)
+	return "-";
+    }
+  return ORDINARY_MAP_FILE_NAME (ord_map);
+}
+
+/* If we just want to know whether two maps point to the same
+   file/buffer or not.  */
+bool
+ORDINARY_MAPS_SAME_FILE_P (const line_map_ordinary *map1,
+			   const line_map_ordinary *map2)
+{
+  const bool is_data = ORDINARY_MAP_GENERATED_DATA_P (map1);
+  return is_data == ORDINARY_MAP_GENERATED_DATA_P (map2)
+    && (is_data
+	? map1->data == map2->data
+	: !filename_cmp (map1->data, map2->data));
+}
+
 /* Destructor for class line_maps.
    Ensure non-GC-managed memory is released.  */
 
@@ -411,8 +440,9 @@ linemap_check_files_exited (line_maps *set)
   for (const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (set);
        ! MAIN_FILE_P (map);
        map = linemap_included_from_linemap (set, map))
-    fprintf (stderr, "line-map.cc: file \"%s\" entered but not left\n",
-	     ORDINARY_MAP_FILE_NAME (map));
+    fprintf (stderr, "line-map.cc: file \"%s%s\" entered but not left\n",
+	     ORDINARY_MAP_CONTAINING_FILE_NAME (set, map),
+	     ORDINARY_MAP_GENERATED_DATA_P (map) ? "<generated>" : "");
 }
 
 /* Create NUM zero-initialized maps of type MACRO_P.  */
@@ -504,22 +534,26 @@ LAST_SOURCE_LINE_LOCATION (const line_map_ordinary *map)
 	  + map->start_location);
 }
 
-/* Add a mapping of logical source line to physical source file and
-   line number.
+/* Add a mapping of logical source line to physical source file and
+   line number.  This function creates an "ordinary map", which is a
+   map that records locations of tokens that are not part of macro
+   replacement-lists present at a macro expansion point.
 
-   The text pointed to by TO_FILE must have a lifetime
-   at least as long as the final call to lookup_line ().  An empty
-   TO_FILE means standard input.  If reason is LC_LEAVE, and
-   TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
-   natural values considering the file we are returning to.
+   The text pointed to by DATA must have a lifetime at least as long as the
+   lifetime of SET.  If reason is LC_LEAVE, and DATA is NULL, then DATA, TO_LINE
+   and SYSP are given their natural values considering the file we are returning
+   to.  If reason is LC_GEN, then DATA is the actual content, and DATA_LEN>0 is
+   the length of it.  Otherwise DATA is a file name and DATA_LEN need not be
+   specified.  If DATA_LEN is specified for a file name, it should be the length
+   of the file name, including the terminating null.
 
-   FROM_LINE should be monotonic increasing across calls to this
-   function.  A call to this function can relocate the previous set of
-   maps, so any stored line_map pointers should not be used.  */
+   A call to this function can relocate the previous set of maps, so any stored
+   line_map pointers should not be used.  */
 
 const struct line_map *
 linemap_add (line_maps *set, enum lc_reason reason,
-	     unsigned int sysp, const char *to_file, linenum_type to_line)
+	     unsigned int sysp, const char *data, linenum_type to_line,
+	     unsigned int data_len)
 {
   /* Generate a start_location above the current highest_location.
      If possible, make the low range bits be zero.  */
@@ -535,13 +569,25 @@ linemap_add (line_maps *set, enum lc_reason reason,
 		      >= MAP_START_LOCATION (LINEMAPS_LAST_ORDINARY_MAP (set))));
 
   /* When we enter the file for the first time reason cannot be
-     LC_RENAME.  */
-  linemap_assert (!(set->depth == 0 && reason == LC_RENAME));
+     LC_RENAME.  To keep things simple, don't track LC_RENAME for
+     LC_GEN maps, but just keep their reason as always LC_GEN.  */
+  if (reason == LC_RENAME)
+    {
+      linemap_assert (set->depth != 0);
+      const auto prev = LINEMAPS_LAST_ORDINARY_MAP (set);
+      linemap_assert (prev);
+      if (prev->reason == LC_GEN)
+	{
+	  reason = LC_GEN;
+	  data = prev->data;
+	  data_len = prev->data_len;
+	}
+    }
 
   /* If we are leaving the main file, return a NULL map.  */
   if (reason == LC_LEAVE
       && MAIN_FILE_P (LINEMAPS_LAST_ORDINARY_MAP (set))
-      && to_file == NULL)
+      && data == NULL)
     {
       set->depth--;
       return NULL;
@@ -557,8 +603,9 @@ linemap_add (line_maps *set, enum lc_reason reason,
     = linemap_check_ordinary (new_linemap (set, start_location));
   map->reason = reason;
 
-  if (to_file && *to_file == '\0' && reason != LC_RENAME_VERBATIM)
-    to_file = "<stdin>";
+  if (data && *data == '\0' && reason != LC_RENAME_VERBATIM
+      && reason != LC_GEN)
+    data = "<stdin>";
 
   if (reason == LC_RENAME_VERBATIM)
     reason = LC_RENAME;
@@ -577,20 +624,31 @@ linemap_add (line_maps *set, enum lc_reason reason,
 	 that comes right before MAP in the same file.  */
       from = linemap_included_from_linemap (set, map - 1);
 
-      /* A TO_FILE of NULL is special - we use the natural values.  */
-      if (to_file == NULL)
+      /* A DATA of NULL is special - we use the natural values.  */
+      if (data == NULL)
 	{
-	  to_file = ORDINARY_MAP_FILE_NAME (from);
+	  data = ORDINARY_MAP_FILE_NAME_OR_DATA (from);
 	  to_line = SOURCE_LINE (from, from[1].start_location);
 	  sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (from);
 	}
       else
-	linemap_assert (filename_cmp (ORDINARY_MAP_FILE_NAME (from),
-				      to_file) == 0);
+	linemap_assert (ORDINARY_MAP_GENERATED_DATA_P (from)
+			? (ORDINARY_MAP_GENERATED_DATA (from) == data)
+			: (filename_cmp (ORDINARY_MAP_FILE_NAME (from), data)
+			   == 0));
     }
 
   map->sysp = sysp;
-  map->to_file = to_file;
+  map->data = data;
+
+  if (reason == LC_GEN)
+    {
+      gcc_assert (data_len);
+      map->data_len = data_len;
+    }
+  else
+    map->data_len = (data_len > 0 ? data_len : strlen (data) + 1);
+
   map->to_line = to_line;
   LINEMAPS_ORDINARY_CACHE (set) = LINEMAPS_ORDINARY_USED (set) - 1;
   /* Do not store range_bits here.  That's readjusted in
@@ -606,7 +664,7 @@ linemap_add (line_maps *set, enum lc_reason reason,
      pure_location_p.  */
   linemap_assert (pure_location_p (set, start_location));
 
-  if (reason == LC_ENTER)
+  if (reason == LC_ENTER || reason == LC_GEN)
     {
       if (set->depth == 0)
 	map->included_from = 0;
@@ -617,7 +675,7 @@ linemap_add (line_maps *set, enum lc_reason reason,
 	      & ~((1 << map[-1].m_column_and_range_bits) - 1))
 	     + map[-1].start_location);
       set->depth++;
-      if (set->trace_includes)
+      if (set->trace_includes && reason == LC_ENTER)
 	trace_include (set, map);
     }
   else if (reason == LC_RENAME)
@@ -863,8 +921,9 @@ linemap_line_start (line_maps *set, linenum_type to_line,
 	        (const_cast <line_map *>
 		  (linemap_add (set, LC_RENAME,
 				ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
-				ORDINARY_MAP_FILE_NAME (map),
-				to_line)));
+				ORDINARY_MAP_FILE_NAME_OR_DATA (map),
+				to_line,
+				map->data_len)));
       map->m_column_and_range_bits = column_bits;
       map->m_range_bits = range_bits;
       r = (MAP_START_LOCATION (map)
@@ -1025,7 +1084,7 @@ linemap_position_for_loc_and_offset (line_maps *set,
        cannot encode the location there.  */
     if ((map + 1)->reason != LC_RENAME
 	|| line < ORDINARY_MAP_STARTING_LINE_NUMBER (map + 1)
-	|| 0 != strcmp (LINEMAP_FILE (map + 1), LINEMAP_FILE (map)))
+	|| !ORDINARY_MAPS_SAME_FILE_P (map, map + 1))
       return loc;
 
   column += column_offset;
@@ -1283,7 +1342,7 @@ linemap_get_expansion_filename (line_maps *set,
 
   linemap_macro_loc_to_exp_point (set, location, &map);
 
-  return LINEMAP_FILE (map);
+  return ORDINARY_MAP_CONTAINING_FILE_NAME (set, map);
 }
 
 /* Return the name of the macro associated to MACRO_MAP.  */
@@ -1853,8 +1912,12 @@ linemap_expand_location (line_maps *set,
 	abort ();
 
       const line_map_ordinary *ord_map = linemap_check_ordinary (map);
-
-      xloc.file = LINEMAP_FILE (ord_map);
+      xloc.file = ORDINARY_MAP_CONTAINING_FILE_NAME (set, ord_map);
+      if (ORDINARY_MAP_GENERATED_DATA_P (ord_map))
+	{
+	  xloc.generated_data = ORDINARY_MAP_GENERATED_DATA (ord_map);
+	  xloc.generated_data_len = ORDINARY_MAP_GENERATED_DATA_LEN (ord_map);
+	}
       xloc.line = SOURCE_LINE (ord_map, loc);
       xloc.column = SOURCE_COLUMN (ord_map, loc);
       xloc.sysp = LINEMAP_SYSP (ord_map) != 0;
@@ -1873,7 +1936,7 @@ linemap_dump (FILE *stream, class line_maps *set, unsigned ix, bool is_macro)
 {
   const char *const lc_reasons_v[LC_HWM]
       = { "LC_ENTER", "LC_LEAVE", "LC_RENAME", "LC_RENAME_VERBATIM",
-	  "LC_ENTER_MACRO", "LC_MODULE" };
+	  "LC_ENTER_MACRO", "LC_MODULE", "LC_GEN" };
   const line_map *map;
   unsigned reason;
 
@@ -1903,11 +1966,15 @@ linemap_dump (FILE *stream, class line_maps *set, unsigned ix, bool is_macro)
       const line_map_ordinary *includer_map
 	= linemap_included_from_linemap (set, ord_map);
 
-      fprintf (stream, "File: %s:%d\n", ORDINARY_MAP_FILE_NAME (ord_map),
+      fprintf (stream, "File: %s:%d\n",
+	       ORDINARY_MAP_GENERATED_DATA_P (ord_map) ? "<generated>"
+	       : ORDINARY_MAP_FILE_NAME (ord_map),
 	       ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map));
       fprintf (stream, "Included from: [%d] %s\n",
 	       includer_map ? int (includer_map - set->info_ordinary.maps) : -1,
-	       includer_map ? ORDINARY_MAP_FILE_NAME (includer_map) : "None");
+	       includer_map ? ORDINARY_MAP_CONTAINING_FILE_NAME (set,
+								 includer_map)
+	       : "None");
     }
   else
     {
@@ -1931,7 +1998,7 @@ linemap_dump_location (line_maps *set,
 {
   const line_map_ordinary *map;
   location_t location;
-  const char *path = "", *from = "";
+  const char *path = "", *path_suffix = "", *from = "";
   int l = -1, c = -1, s = -1, e = -1;
 
   if (IS_ADHOC_LOC (loc))
@@ -1948,7 +2015,9 @@ linemap_dump_location (line_maps *set,
     linemap_assert (location < RESERVED_LOCATION_COUNT);
   else
     {
-      path = LINEMAP_FILE (map);
+      path = ORDINARY_MAP_CONTAINING_FILE_NAME (set, map);
+      if (ORDINARY_MAP_GENERATED_DATA_P (map))
+	path_suffix = "<generated>";
       l = SOURCE_LINE (map, location);
       c = SOURCE_COLUMN (map, location);
       s = LINEMAP_SYSP (map) != 0;
@@ -1959,24 +2028,27 @@ linemap_dump_location (line_maps *set,
 	{
 	  const line_map_ordinary *from_map
 	    = linemap_included_from_linemap (set, map);
-	  from = from_map ? LINEMAP_FILE (from_map) : "<NULL>";
+	  from = from_map ? ORDINARY_MAP_CONTAINING_FILE_NAME (set, from_map)
+	    : "<NULL>";
 	}
     }
 
   /* P: path, L: line, C: column, S: in-system-header, M: map address,
      E: macro expansion?, LOC: original location, R: resolved location   */
-  fprintf (stream, "{P:%s;F:%s;L:%d;C:%d;S:%d;M:%p;E:%d,LOC:%d,R:%d}",
-	   path, from, l, c, s, (void*)map, e, loc, location);
+  fprintf (stream, "{P:%s%s;F:%s;L:%d;C:%d;S:%d;M:%p;E:%d,LOC:%d,R:%d}",
+	   path, path_suffix, from, l, c, s, (void*)map, e, loc, location);
 }
 
-/* Return the highest location emitted for a given file for which
-   there is a line map in SET.  FILE_NAME is the file name to
-   consider.  If the function returns TRUE, *LOC is set to the highest
-   location emitted for that file.  */
+/* Return the highest location emitted for a given file or generated data buffer
+   for which there is a line map in SET.  If the function returns TRUE, *LOC is
+   set to the highest location emitted for that file.  The const char* arg is
+   either a file name or a generated data buffer, as indicated by
+   IS_DATA.  */
 
 bool
 linemap_get_file_highest_location (line_maps *set,
-				   const char *file_name,
+				   const char *fname_or_data,
+				   bool is_data,
 				   location_t *loc)
 {
   /* If the set is empty or no ordinary map has been created then
@@ -1984,13 +2056,23 @@ linemap_get_file_highest_location (line_maps *set,
   if (set == NULL || set->info_ordinary.used == 0)
     return false;
 
-  /* Now look for the last ordinary map created for FILE_NAME.  */
+  /* Now look for the last ordinary map created for this file.  */
   int i;
   for (i = set->info_ordinary.used - 1; i >= 0; --i)
     {
-      const char *fname = set->info_ordinary.maps[i].to_file;
-      if (fname && !filename_cmp (fname, file_name))
-	break;
+      const auto map = set->info_ordinary.maps + i;
+      if (is_data)
+	{
+	  if (ORDINARY_MAP_GENERATED_DATA_P (map)
+	      && ORDINARY_MAP_GENERATED_DATA (map) == fname_or_data)
+	    break;
+	}
+      else if (!ORDINARY_MAP_GENERATED_DATA_P (map))
+	{
+	  const auto this_fname = ORDINARY_MAP_FILE_NAME (map);
+	  if (this_fname && !filename_cmp (this_fname, fname_or_data))
+	    break;
+	}
     }
 
   if (i < 0)

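The accessor pattern in the hunks above (a single DATA pointer whose meaning
depends on the map's reason, reached only through asserting accessors) can be
sketched in isolation. This is a minimal stand-alone model, not libcpp code:
the type and function names are hypothetical, plain assert stands in for
linemap_assert, and strcmp stands in for filename_cmp.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-ins for the libcpp types; not the real layout.
enum map_reason { REASON_ENTER, REASON_RENAME, REASON_GEN };

struct ordinary_map
{
  map_reason reason;
  const char *data;      // File name, or the buffer itself for REASON_GEN.
  unsigned int data_len; // Buffer length, or strlen (name) + 1 for a file.
};

inline bool
generated_data_p (const ordinary_map &m)
{
  return m.reason == REASON_GEN;
}

// Asserting accessor: only valid for maps that refer to a file.
inline const char *
file_name (const ordinary_map &m)
{
  assert (!generated_data_p (m));
  return m.data;
}

// Asserting accessor: only valid for maps that refer to generated data.
inline const char *
generated_data (const ordinary_map &m)
{
  assert (generated_data_p (m));
  return m.data;
}

// Two maps refer to the same source if both are files with equal names,
// or both are generated and share the very same buffer.
inline bool
same_file_p (const ordinary_map &a, const ordinary_map &b)
{
  const bool is_data = generated_data_p (a);
  if (is_data != generated_data_p (b))
    return false;
  return is_data ? a.data == b.data : std::strcmp (a.data, b.data) == 0;
}
```

Generated buffers are compared by pointer identity, as in
ORDINARY_MAPS_SAME_FILE_P above, since such a buffer may contain embedded
nulls and a string comparison would stop at the first one.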

* Re: [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion
  2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
@ 2022-11-21 17:50   ` Jeff Law
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Law @ 2022-11-21 17:50 UTC (permalink / raw)
  To: Lewis Hyatt, gcc-patches


On 11/4/22 07:44, Lewis Hyatt via Gcc-patches wrote:
> In directives.cc, do_pragma() contains logic to handle a case such as the new
> testcase pragma-omp-unknown.c, where an unknown pragma was the result of macro
> expansion (for pragma namespaces that permit expansion). This no longer works
> correctly as shown by the testcase, fixed by adding PREV_WHITE to the flags on
> the second token to prevent an unwanted paste.  Also fixed the memory leak,
> since the temporary tokens are pushed on their own context, nothing prevents
> freeing of the buffer that holds them when the context is eventually popped.
>
> libcpp/ChangeLog:
>
> 	* directives.cc (do_pragma): Fix memory leak in token buffer.  Fix
> 	unwanted paste between two tokens.
>
> gcc/testsuite/ChangeLog:
>
> 	* c-c++-common/gomp/pragma-omp-unknown.c: New test.

OK

jeff

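The effect of the PREV_WHITE fix described in the quoted message can be
illustrated with a toy model. This is only a sketch under stated assumptions:
the flag name mirrors libcpp's PREV_WHITE, but the token type, the respell
helper, and the token spellings are hypothetical and are not taken from
directives.cc or from the new testcase.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy flag modelled on libcpp's PREV_WHITE: "whitespace preceded this token".
const unsigned TOY_PREV_WHITE = 1u << 0;

// Hypothetical token type; real cpp_token carries much more state.
struct toy_token
{
  std::string spelling;
  unsigned flags;
};

// Turn a token sequence back into text.  A space is emitted before a token
// only when its TOY_PREV_WHITE flag is set, so two adjacent tokens without
// the flag run together, which is the unwanted paste.
inline std::string
respell (const std::vector<toy_token> &toks)
{
  std::string out;
  for (const toy_token &t : toks)
    {
      if ((t.flags & TOY_PREV_WHITE) && !out.empty ())
        out += ' ';
      out += t.spelling;
    }
  return out;
}
```

In this model, setting the flag on the second token is enough to keep the two
spellings apart when they are re-read as text.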



* Re: [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
  2022-11-17 21:21     ` Lewis Hyatt
@ 2023-01-05 22:34       ` Lewis Hyatt
  0 siblings, 0 replies; 18+ messages in thread
From: Lewis Hyatt @ 2023-01-05 22:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

On Thu, Nov 17, 2022 at 4:21 PM Lewis Hyatt <lhyatt@gmail.com> wrote:
>
> On Sat, Nov 05, 2022 at 12:23:28PM -0400, David Malcolm wrote:
> > On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> > [...snip...]
> > >
> > > diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> > > index 5890c18bdc3..2935d7fb236 100644
> > > --- a/gcc/c-family/c-common.cc
> > > +++ b/gcc/c-family/c-common.cc
> > > @@ -9183,11 +9183,14 @@ try_to_locate_new_include_insertion_point (const char *file, location_t loc)
> > >        const line_map_ordinary *ord_map
> > >         = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
> > >
> > > +      if (ord_map->reason == LC_GEN)
> > > +       continue;
> > > +
> > >        if (const line_map_ordinary *from
> > >           = linemap_included_from_linemap (line_table, ord_map))
> > >         /* We cannot use pointer equality, because with preprocessed
> > >            input all filename strings are unique.  */
> > > -       if (0 == strcmp (from->to_file, file))
> > > +       if (from->reason != LC_GEN && 0 == strcmp (from->to_file, file))
> > >           {
> > >             last_include_ord_map = from;
> > >             last_ord_map_after_include = NULL;
> >
> > [...snip...]
> >
> > I'm not a fan of having the "to_file" field change meaning based on
> > whether reason is LC_GEN.
> >
> > How involved would it be to split line_map_ordinary into two
> > subclasses, so that we'd have this hierarchy (with indentation showing
> > inheritance):
> >
> > line_map
> >   line_map_ordinary
> >     line_map_ordinary_file
> >     line_map_ordinary_generated
> >   line_map_macro
> >
> > Alternatively, how about renaming "to_file" to be "data" (or "m_data"),
> > to emphasize that it might not be a filename, and that we have to check
> > everywhere we access that field.
> >
> > Please can all those checks for LC_GEN go into an inline function so we
> > can write e.g.
> >   map->generated_p ()
> > or somesuch.
> >
> > If I'm reading things right, patch 6 adds the sole usage of this in
> > destringize_and_run.  Would we ever want to discriminate between
> > different kinds of generated buffers?
> >
> > [...snip...]
> >
> > > @@ -796,10 +798,13 @@ diagnostic_report_current_module (diagnostic_context *context, location_t where)
> > >                  N_("of module"),
> > >                  N_("In module imported at"),   /* 6 */
> > >                  N_("imported at"),
> > > +                N_("In buffer generated from"),   /* 8 */
> > >                 };
> >
> > We use the wording "destringized" in:
> >
> > so maybe this should be "In buffer destringized from" ???  (I'm not
> > sure)
> >
> > [...snip...]
> >
> > > diff --git a/gcc/input.cc b/gcc/input.cc
> > > index 483cb6e940d..3cf5480551d 100644
> > > --- a/gcc/input.cc
> > > +++ b/gcc/input.cc
> >
> > [..snip...]
> >
> > > @@ -58,7 +64,7 @@ public:
> > >    ~file_cache_slot ();
> >
> > My initial thought reading the input.cc part of this patch was that I
> > want it to be very clear when a file_cache_slot is for a real file vs
> > when we're replaying generated data.  I'd hoped that this could have
> > been expressed via inheritance, but we preallocate all the cache slots
> > once in an array in file_cache's ctor and the slots get reused over
> > time.  So instead of that, can we please have some kind of:
> >
> >    bool file_slot_p () const;
> >    bool generated_slot_p () const;
> >
> > or somesuch, so that we can have clear assertions and conditionals
> > about the current state of a slot (I think the discriminating condition
> > is that generated_data_len > 0, right?)
> >
> > If I'm reading things right, it looks like file_cache_slot::m_file_path
> > does double duty after this patch, and is either a filename, or a
> > pointer to the generated data.  If so, please can the patch rename it,
> > and have all usage guarded appropriately.  Can it be a union? (or does
> > the ctor prevent that?)
> >
> > [...snip...]
> >
> > > @@ -445,16 +461,23 @@ file_cache::evicted_cache_tab_entry (unsigned *highest_use_count)
> > >     num_file_slots files are cached.  */
> > >
> > >  file_cache_slot*
> > > -file_cache::add_file (const char *file_path)
> > > +file_cache::add_file (const char *file_path, unsigned int generated_data_len)
> >
> > Can we split this into two functions: one for files, and one for
> > generated data?  (add_file vs add_generated_data?)
> >
> > >  {
> > >
> > > -  FILE *fp = fopen (file_path, "r");
> > > -  if (fp == NULL)
> > > -    return NULL;
> > > +  FILE *fp;
> > > +  if (generated_data_len)
> > > +    fp = NULL;
> > > +  else
> > > +    {
> > > +      fp = fopen (file_path, "r");
> > > +      if (fp == NULL)
> > > +       return NULL;
> > > +    }
> > >
> > >    unsigned highest_use_count = 0;
> > >    file_cache_slot *r = evicted_cache_tab_entry (&highest_use_count);
> > > -  if (!r->create (in_context, file_path, fp, highest_use_count))
> > > +  if (!r->create (in_context, file_path, fp, highest_use_count,
> > > +                 generated_data_len))
> > >      return NULL;
> > >    return r;
> > >  }
> >
> > [...snip...]
> >
> > > @@ -535,11 +571,12 @@ file_cache::~file_cache ()
> > >     it.  */
> > >
> > >  file_cache_slot*
> > > -file_cache::lookup_or_add_file (const char *file_path)
> > > +file_cache::lookup_or_add_file (const char *file_path,
> > > +                               unsigned int generated_data_len)
> >
> > Likewise, could this be split into:
> >   lookup_or_add_file
> > and
> >   lookup_or_add_generated
> > or somesuch?
> >
> > >  {
> > >    file_cache_slot *r = lookup_file (file_path);
> >
> > The patch doesn't seem to touch file_cache::lookup_file.  Is the
> > current implementation of that ideal (it looks like we're going to be
> > doing strcmp of generated buffers, when presumably for those we could
> > simply be doing pointer comparisons).
> >
> > Maybe rename it to lookup_slot?
> >
> > >    if (r == NULL)
> > > -    r = add_file (file_path);
> > > +    r = add_file (file_path, generated_data_len);
> > >    return r;
> > >  }
> > >
> > > @@ -547,7 +584,8 @@ file_cache::lookup_or_add_file (const char *file_path)
> > >     diagnostic.  */
> > >
> > >  file_cache_slot::file_cache_slot ()
> > > -: m_use_count (0), m_file_path (NULL), m_fp (NULL), m_data (0),
> > > +: m_use_count (0), m_file_path (NULL), m_fp (NULL),
> > > +  m_data (0), m_data_active (0),
> > >    m_alloc_offset (0), m_size (0), m_nb_read (0), m_line_start_idx (0),
> > >    m_line_num (0), m_total_lines (0), m_missing_trailing_newline (true)
> > >  {
> >
> > [...snip...]
> >
> >
> >
> > > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> > > index 50207cacc12..eb281809cbd 100644
> > > --- a/libcpp/include/line-map.h
> > > +++ b/libcpp/include/line-map.h
> > > @@ -75,6 +75,8 @@ enum lc_reason
> > >    LC_RENAME_VERBATIM,  /* Likewise, but "" != stdin.  */
> > >    LC_ENTER_MACRO,      /* Begin macro expansion.  */
> > >    LC_MODULE,           /* A (C++) Module.  */
> > > +  LC_GEN,              /* Internally generated source.  */
> > > +
> > >    /* FIXME: add support for stringize and paste.  */
> > >    LC_HWM /* High Water Mark.  */
> > >  };
> > > @@ -437,7 +439,11 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
> > >
> > >    /* Pointer alignment boundary on both 32 and 64-bit systems.  */
> > >
> > > -  const char *to_file;
> > > +  /* This GTY markup is in case this is an LC_GEN map, in which case
> > > +     to_file actually points to the generated data, which we do not
> > > +     want to require to be free of null bytes.  */
> > > +  const char * GTY((string_length ("%h.to_file_len"))) to_file;
> > > +  unsigned int to_file_len;
> > >    linenum_type to_line;
> >
> > What's the intended interaction between this, the garbage-collector,
> > and PCH?  Is to_file to be allocated in the GC-managed heap, or can it
> > be outside of it?  Looking at patch 6 I see that this seems to be
> > allocated (in destringize_and_run) by _cpp_unaligned_alloc.  I don't
> > remember off the top of my head if that's valid.
> >
> > >
> > >    /* Location from whence this line map was included.  For regular
> > > @@ -1101,13 +1107,15 @@ extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
> > >     at least as long as the lifetime of SET.  An empty
> > >     TO_FILE means standard input.  If reason is LC_LEAVE, and
> > >     TO_FILE is NULL, then TO_FILE, TO_LINE and SYSP are given their
> > > -   natural values considering the file we are returning to.
> > > +   natural values considering the file we are returning to.  If reason
> > > +   is LC_GEN, then TO_FILE is not a file name, but rather the actual
> > > +   content, and TO_FILE_LEN>0 is the length of it.
> > >
> > >     A call to this function can relocate the previous set of
> > >     maps, so any stored line_map pointers should not be used.  */
> > >  extern const line_map *linemap_add
> > >    (class line_maps *, enum lc_reason, unsigned int sysp,
> > > -   const char *to_file, linenum_type to_line);
> > > +   const char *to_file, linenum_type to_line, unsigned int to_file_len = 0);
> > >
> > >  /* Create a macro map.  A macro map encodes source locations of tokens
> > >     that are part of a macro replacement-list, at a macro expansion
> > > @@ -1304,7 +1312,8 @@ linemap_location_before_p (class line_maps *set,
> > >
> > >  typedef struct
> > >  {
> > > -  /* The name of the source file involved.  */
> > > +  /* The name of the source file involved, or NULL if
> > > +     generated_data is non-NULL.  */
> > >    const char *file;
> > >
> > >    /* The line-location in the source file.  */
> > > @@ -1316,6 +1325,10 @@ typedef struct
> > >
> > >    /* In a system header?. */
> > >    bool sysp;
> > > +
> > > +  /* If generated data, the data and its length.  */
> > > +  unsigned int generated_data_len;
> > > +  const char *generated_data;
> > >  } expanded_location;
> >
> > Probably worth noting that generated_data can contain NUL bytes, and
> > isn't necessarily NUL-terminated.
> >
> >
> > Thanks again for the patch; hope this is constructive
> > Dave
> >
>
> Hi Dave-
>
> Thanks again for taking a look at this one, sorry it's so long. I redid this
> patch 4/6 taking into account all of your suggestions. It's attached here.
> Now, on the linemap side of things, I renamed the member variable from TO_FILE
> to DATA, and created inline accessor functions to get at it.  The inline
> accessors will assert that the linemap is of the correct type.  I checked all
> the call sites and adjusted as needed.  On the input.cc side of things, I
> switched it to use inheritance.  The logic for finding and caching lines
> resides in a base class, while the two derived classes handle retrieving the
> data from the necessary source (a file, or an in-memory buffer).  I think it's
> much nicer now, please let me know what you think? Thanks!
>
> BTW, the remaining patches downstream of this one do not need to be modified,
> except that the new testcase for 5c/6 (the SARIF output patch) needs one line
> changed since the output of -fdump-internal-locations now distinguishes LC_GEN
> maps as well. I can resend that and/or any of the others once the dust
> settles with this one if that's helpful.
>
> -Lewis

Hello-

I wanted to check in on this one: would you still be able to take a look at
it sometime, and do you like the new approach better?
It is at https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606616.html,
but I am going to re-submit the 4 remaining patches to the list shortly so
they are all in one place, with more logical numbering. The other 3 are
unchanged from before, other than the one-line change to SARIF output.
Thanks!


-Lewis

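The input.cc design described in the quoted reply (line finding and caching
in a base class, with two derived classes that fetch the bytes from a file or
from an in-memory buffer) might look roughly like the following. This is a
simplified sketch, not the code from the revised patch: all class and member
names are hypothetical, and the real cache reads files incrementally rather
than all at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Base class: owns the line index and lookup.  Derived classes only say
// where the bytes come from.
class line_source
{
public:
  virtual ~line_source () {}

  // Fetch 1-based line LINE_NUM without its newline; false if unavailable.
  bool get_line (std::size_t line_num, std::string &line)
  {
    if (!m_loaded)
      {
        if (!load (m_bytes))
          return false;
        index_lines ();
        m_loaded = true;
      }
    if (line_num == 0 || line_num > m_starts.size ())
      return false;
    const std::size_t begin = m_starts[line_num - 1];
    std::size_t end;
    if (line_num < m_starts.size ())
      end = m_starts[line_num] - 1;  // Offset of this line's '\n'.
    else
      {
        end = m_bytes.size ();
        if (end > begin && m_bytes[end - 1] == '\n')
          --end;
      }
    line.assign (m_bytes, begin, end - begin);
    return true;
  }

private:
  virtual bool load (std::string &out) = 0;

  // Record the offset at which each line starts.
  void index_lines ()
  {
    if (!m_bytes.empty ())
      m_starts.push_back (0);
    for (std::size_t i = 0; i + 1 < m_bytes.size (); ++i)
      if (m_bytes[i] == '\n')
        m_starts.push_back (i + 1);
  }

  bool m_loaded = false;
  std::string m_bytes;
  std::vector<std::size_t> m_starts;
};

// Bytes come from a file on disk.
class file_line_source : public line_source
{
public:
  explicit file_line_source (const std::string &path) : m_path (path) {}

private:
  bool load (std::string &out) override
  {
    std::FILE *fp = std::fopen (m_path.c_str (), "rb");
    if (!fp)
      return false;
    char buf[4096];
    std::size_t n;
    while ((n = std::fread (buf, 1, sizeof buf, fp)) > 0)
      out.append (buf, n);
    std::fclose (fp);
    return true;
  }

  std::string m_path;
};

// Bytes come from generated data, which may contain embedded nulls.
class buffer_line_source : public line_source
{
public:
  buffer_line_source (const char *data, std::size_t len)
    : m_data (data), m_len (len) {}

private:
  bool load (std::string &out) override
  {
    out.assign (m_data, m_len);
    return true;
  }

  const char *m_data;
  std::size_t m_len;
};
```

Because the buffer variant carries an explicit length, a line with an
embedded null is returned whole rather than being cut off at the null byte.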

end of thread

Thread overview: 18+ messages
2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
2022-11-04 15:53   ` David Malcolm
2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
2022-11-04 15:55   ` David Malcolm
2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
2022-11-21 17:50   ` Jeff Law
2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2022-11-05 16:23   ` David Malcolm
2022-11-05 17:28     ` Lewis Hyatt
2022-11-17 21:21     ` Lewis Hyatt
2023-01-05 22:34       ` Lewis Hyatt
2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
2022-11-04 16:42   ` David Malcolm
2022-11-04 21:05     ` Lewis Hyatt
2022-11-05  1:54       ` [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string David Malcolm
2022-11-05  1:55       ` [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context David Malcolm
2022-11-04 13:44 ` [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt
