public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Lewis Hyatt <lhyatt@gmail.com>
To: gcc-patches@gcc.gnu.org
Cc: Lewis Hyatt <lhyatt@gmail.com>, David Malcolm <dmalcolm@redhat.com>
Subject: [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens
Date: Fri,  4 Nov 2022 09:44:08 -0400	[thread overview]
Message-ID: <cover.1667514153.git.lhyatt@gmail.com> (raw)

Hello-

In the past couple years there has been a ton of progress in fixing bugs
related to _Pragma, especially its use in the type of macros that many
projects like to implement for manipulating GCC diagnostic pragmas more
easily. For GCC 13 I have been going through the remaining open PRs, fixing a
couple and adding testcases for several that were already fixed. I felt that
made it a good time to overhaul one of the last remaining issues with _Pragma
processing, which is that we do not currently assign good locations to the
tokens involved. The locations are very important, however, because that is
how GCC diagnostic pragmas will ultimately determine whether a given warning
should or should not apply at a given point. Currently, the tokens inside a
_Pragma string are all assigned the same location as the _Pragma token itself,
which is sufficient to make diagnostic pragmas work correctly. It does produce
somewhat inferior diagnostics, though, since we do not point the user to which
part of the _Pragma string caused the problem; and if the _Pragma string was
expanded from a macro, we do not even point them to the string at all.

Further, the assignment of the fake location to the tokens inside the _Pragma
string takes place after all the tokens have been lexed -- consequently, if a
diagnostic is issued by libcpp during that process, it doesn't benefit from the
patched-up location and instead uses a bogus location. As a quick example,
compiling:

    =====
    _Pragma("GCC diagnostic ignored \"oops")
    =====

produces:

    =====
    file:1:24: warning: missing terminating " character
        1 | _Pragma("GCC diagnostic ignored \"oops")
          |                        ^
    =====

It is surprisingly involved to make that caret point to something
reasonable. The reason it points to the middle of nowhere is that the current
implementation of _Pragma in directives.cc:destringize_and_run() does not
touch the line_maps instance at all, and so does not inform it where the
tokens are coming from. But the line_maps API in fact does not provide any way
to handle this case, so this needs to be added first. With all the changes in
this patch set, we would output instead:

    ======
    In buffer generated from file:1:
    <generated>:1:24: warning: missing terminating " character
        1 | GCC diagnostic ignored "oops
          |                        ^
    file:1:1: note: in <_Pragma directive>
        1 | _Pragma("GCC diagnostic ignored \"oops")
          | ^~~~~~~
    ======

Treating the _Pragma like a macro expansion makes everything consistent and
solves a ton of problems; all the locations involved will just make sense from
the user's point of view.

Patches 1-3 are tiny bug fixes that I came across while working on the new
testcases. I was a bit surprised that #1 and #3 especially did not have PRs
open, but I guess these small glitches have gone unnoticed so far.

Patch 4 is the largest one. It adds a new reason=LC_GEN for ordinary line
maps. These maps are just like normal ones, except the file name pointer
points not to a file name, but to the actual data in memory instead. This is
how we can issue diagnostics for code that did not appear in the user's input,
such as the de-stringized _Pragma string. The changes needed in libcpp to
support this concept are pretty small and straightforward. Most of the changes
outside of libcpp are in input.cc and diagnostic-show-locus.cc, which need to
learn how to obtain code from LC_GEN maps, and also a lot of the changes are
in selftests that are pretty sensitive to the internal implementation.

Patch 5 is a continuation of 4 that supports LC_GEN maps in less commonly used
places, such as the new SARIF output format, that also need to know how to
read source back from in-memory buffers in addition to files.

Patch 6 updates the implementation of _Pragma handling to use LC_GEN maps and
to create virtual locations for the tokens as in the example above. I have
also added support for the argument of the _Pragma to be a raw string, as
requested by PR83473, since this was easy to do while I was there.

1/6: diagnostics: Fix macro tracking for ad-hoc locations
2/6: diagnostics: Use an inline function rather than hardcoding <built-in>
     string
3/6: libcpp: Fix paste error with unknown pragma after macro expansion
4/6: diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers
5/6: diagnostics: Support generated data in additional contexts
6/6: diagnostics: libcpp: Assign real locations to the tokens inside
     _Pragma strings

Bootstrap and regtest all languages on x86-64 Linux looks good.

I realize it's near the end of stage 1 now. It would still be great and I
would appreciate very much if this patch could get reviewed please? For GCC 13,
there have been several _Pragma-related bugs fixed (especially PR53431), and
addressing this location issue would tie it together nicely. Thanks very much!

-Lewis

             reply	other threads:[~2022-11-04 13:44 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 13:44 Lewis Hyatt [this message]
2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
2022-11-04 15:53   ` David Malcolm
2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
2022-11-04 15:55   ` David Malcolm
2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
2022-11-21 17:50   ` Jeff Law
2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2022-11-05 16:23   ` David Malcolm
2022-11-05 17:28     ` Lewis Hyatt
2022-11-17 21:21     ` Lewis Hyatt
2023-01-05 22:34       ` Lewis Hyatt
2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
2022-11-04 16:42   ` David Malcolm
2022-11-04 21:05     ` Lewis Hyatt
2022-11-05  1:54       ` [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string David Malcolm
2022-11-05  1:55       ` [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context David Malcolm
2022-11-04 13:44 ` [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1667514153.git.lhyatt@gmail.com \
    --to=lhyatt@gmail.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).