public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Lewis Hyatt <lhyatt@gmail.com>, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 5/6] diagnostics: Support generated data in additional contexts
Date: Fri, 04 Nov 2022 12:42:29 -0400	[thread overview]
Message-ID: <b60b7a4a698554fdcfcd635b7de2fe87d12b70d2.camel@redhat.com> (raw)
In-Reply-To: <bef0a344e3c76fa7deb8d3fbb9fcfb8cd2257f97.1667514153.git.lhyatt@gmail.com>

On Fri, 2022-11-04 at 09:44 -0400, Lewis Hyatt via Gcc-patches wrote:
> Add awareness that diagnostic locations may be in generated buffers
> rather
> than an actual file to other places in the diagnostics code that may
> care,
> most notably SARIF output (which needs to obtain its own snapshots of
> the code
> involved). For edit context output, which outputs fixit hints as
> diffs, for
> now just make sure we ignore generated data buffers. At the moment,
> there is
> no ability for a fixit hint to be generated in such a buffer.
> 
> Because SARIF uses JSON as well, also add the ability to the
> json::string
> class to handle a buffer with nulls in the middle (since we place no
> restriction on LC_GEN content) by providing the option to specify the
> data
> length.

Please can you split this patch into three parts:
- the SARIF part
- the json changes
- the edit-context.cc changes (I think this at least counts as an
"obvious" change with respect to the other changes in the kit, though
I'm still working my way through patch 4 in the kit).

Please add a DejaGnu testcase to the SARIF part, with a diagnostic that
references a generated data buffer; see
  gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-*.c 
for examples of SARIF testcases.

Please add a selftest to the json change so that we have a unit test of
constructing a json::string with an embedded NUL, and how we serialize
such a string (probably to json.cc's test_writing_strings)

Thanks
Dave

> 
> gcc/ChangeLog:
> 
>         * diagnostic-format-sarif.cc (sarif_builder::xloc_to_fb): New
> function.
>         (sarif_builder::maybe_make_physical_location_object): Support
>         generated data locations.
>         (sarif_builder::make_artifact_location_object): Likewise.
>         (sarif_builder::maybe_make_region_object_for_context):
> Likewise.
>         (sarif_builder::make_artifact_object): Likewise.
>         (sarif_builder::maybe_make_artifact_content_object):
> Likewise.
>         (get_source_lines): Likewise.
>         * edit-context.cc (edit_context::apply_fixit): Ignore
> generated
>         locations if one should make its way this far.
>         * json.cc (string::string): Support non-null-terminated
> string.
>         (string::print): Likewise.
>         * json.h (class string): Likewise.
> ---
>  gcc/diagnostic-format-sarif.cc | 86 +++++++++++++++++++++-----------
> --
>  gcc/edit-context.cc            |  4 ++
>  gcc/json.cc                    | 17 +++++--
>  gcc/json.h                     |  5 +-
>  4 files changed, 75 insertions(+), 37 deletions(-)
> 
> diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-
> sarif.cc
> index 7110db4edd6..c2d18a1a16e 100644
> --- a/gcc/diagnostic-format-sarif.cc
> +++ b/gcc/diagnostic-format-sarif.cc
> @@ -125,7 +125,10 @@ private:
>    json::array *maybe_make_kinds_array (diagnostic_event::meaning m)
> const;
>    json::object *maybe_make_physical_location_object (location_t
> loc);
>    json::object *make_artifact_location_object (location_t loc);
> -  json::object *make_artifact_location_object (const char
> *filename);
> +
> +  typedef std::pair<const char *, unsigned int> filename_or_buffer;
> +  json::object *make_artifact_location_object (filename_or_buffer
> fb);
> +
>    json::object *make_artifact_location_object_for_pwd () const;
>    json::object *maybe_make_region_object (location_t loc) const;
>    json::object *maybe_make_region_object_for_context (location_t
> loc) const;
> @@ -146,16 +149,17 @@ private:
>    json::object *make_reporting_descriptor_object_for_cwe_id (int
> cwe_id) const;
>    json::object *
>    make_reporting_descriptor_reference_object_for_cwe_id (int
> cwe_id);
> -  json::object *make_artifact_object (const char *filename);
> -  json::object *maybe_make_artifact_content_object (const char
> *filename) const;
> -  json::object *maybe_make_artifact_content_object (const char
> *filename,
> -                                                   int start_line,
> +  json::object *make_artifact_object (filename_or_buffer fb);
> +  json::object *
> +  maybe_make_artifact_content_object (filename_or_buffer fb) const;
> +  json::object *maybe_make_artifact_content_object
> (expanded_location xloc,
>                                                     int end_line)
> const;
>    json::object *make_fix_object (const rich_location &rich_loc);
>    json::object *make_artifact_change_object (const rich_location
> &richloc);
>    json::object *make_replacement_object (const fixit_hint &hint)
> const;
>    json::object *make_artifact_content_object (const char *text)
> const;
>    int get_sarif_column (expanded_location exploc) const;
> +  static filename_or_buffer xloc_to_fb (expanded_location xloc);
>  
>    diagnostic_context *m_context;
>  
> @@ -166,7 +170,11 @@ private:
>       diagnostic group.  */
>    sarif_result *m_cur_group_result;
>  
> -  hash_set <const char *> m_filenames;
> +  /* If the second member is >0, then this is a buffer of generated
> content,
> +     with that length, not a filename.  */
> +  hash_set <pair_hash <nofree_ptr_hash <const char>,
> +                      int_hash <unsigned int, -1U> >
> +           > m_filenames;
>    bool m_seen_any_relative_paths;
>    hash_set <free_string_hash> m_rule_id_set;
>    json::array *m_rules_arr;
> @@ -588,6 +596,15 @@ sarif_builder::make_location_object (const
> diagnostic_event &event)
>    return location_obj;
>  }
>  
> +/* Populate a filename_or_buffer pair from an expanded location.  */
> +sarif_builder::filename_or_buffer
> +sarif_builder::xloc_to_fb (expanded_location xloc)
> +{
> +  if (xloc.generated_data_len)
> +    return filename_or_buffer (xloc.generated_data,
> xloc.generated_data_len);
> +  return filename_or_buffer (xloc.file, 0);
> +}
> +
>  /* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for
> LOC,
>     or return NULL;
>     Add any filename to the m_artifacts.  */
> @@ -603,7 +620,7 @@
> sarif_builder::maybe_make_physical_location_object (location_t loc)
>    /* "artifactLocation" property (SARIF v2.1.0 section 3.29.3).  */
>    json::object *artifact_loc_obj = make_artifact_location_object
> (loc);
>    phys_loc_obj->set ("artifactLocation", artifact_loc_obj);
> -  m_filenames.add (LOCATION_FILE (loc));
> +  m_filenames.add (xloc_to_fb (expand_location (loc)));
>  
>    /* "region" property (SARIF v2.1.0 section 3.29.4).  */
>    if (json::object *region_obj = maybe_make_region_object (loc))
> @@ -627,7 +644,7 @@
> sarif_builder::maybe_make_physical_location_object (location_t loc)
>  json::object *
>  sarif_builder::make_artifact_location_object (location_t loc)
>  {
> -  return make_artifact_location_object (LOCATION_FILE (loc));
> +  return make_artifact_location_object (xloc_to_fb (expand_location
> (loc)));
>  }
>  
>  /* The ID value for use in "uriBaseId" properties (SARIF v2.1.0
> section 3.4.4)
> @@ -639,10 +656,12 @@ sarif_builder::make_artifact_location_object
> (location_t loc)
>     or return NULL.  */
>  
>  json::object *
> -sarif_builder::make_artifact_location_object (const char *filename)
> +sarif_builder::make_artifact_location_object (filename_or_buffer fb)
>  {
>    json::object *artifact_loc_obj = new json::object ();
>  
> +  const auto filename = (fb.second ? special_fname_generated () :
> fb.first);
> +
>    /* "uri" property (SARIF v2.1.0 section 3.4.3).  */
>    artifact_loc_obj->set ("uri", new json::string (filename));
>  
> @@ -795,9 +814,7 @@
> sarif_builder::maybe_make_region_object_for_context (location_t loc)
> const
>  
>    /* "snippet" property (SARIF v2.1.0 section 3.30.13).  */
>    if (json::object *artifact_content_obj
> -        = maybe_make_artifact_content_object (exploc_start.file,
> -                                              exploc_start.line,
> -                                              exploc_finish.line))
> +       = maybe_make_artifact_content_object (exploc_start,
> exploc_finish.line))
>      region_obj->set ("snippet", artifact_content_obj);
>  
>    return region_obj;
> @@ -1248,24 +1265,24 @@ sarif_builder::maybe_make_cwe_taxonomy_object
> () const
>  /* Make an artifact object (SARIF v2.1.0 section 3.24).  */
>  
>  json::object *
> -sarif_builder::make_artifact_object (const char *filename)
> +sarif_builder::make_artifact_object (filename_or_buffer fb)
>  {
>    json::object *artifact_obj = new json::object ();
>  
>    /* "location" property (SARIF v2.1.0 section 3.24.2).  */
> -  json::object *artifact_loc_obj = make_artifact_location_object
> (filename);
> +  json::object *artifact_loc_obj = make_artifact_location_object
> (fb);
>    artifact_obj->set ("location", artifact_loc_obj);
>  
>    /* "contents" property (SARIF v2.1.0 section 3.24.8).  */
>    if (json::object *artifact_content_obj
> -       = maybe_make_artifact_content_object (filename))
> +       = maybe_make_artifact_content_object (fb))
>      artifact_obj->set ("contents", artifact_content_obj);
>  
>    /* "sourceLanguage" property (SARIF v2.1.0 section 3.24.10).  */
>    if (m_context->m_client_data_hooks)
>      if (const char *source_lang
>         = m_context->m_client_data_hooks-
> >maybe_get_sarif_source_language
> -           (filename))
> +           (fb.first))
>        artifact_obj->set ("sourceLanguage", new json::string
> (source_lang));
>  
>    return artifact_obj;
> @@ -1331,16 +1348,21 @@ maybe_read_file (const char *filename)
>     full contents of FILENAME.  */
>  
>  json::object *
> -sarif_builder::maybe_make_artifact_content_object (const char
> *filename) const
> +sarif_builder::maybe_make_artifact_content_object
> (filename_or_buffer fb) const
>  {
> -  char *text_utf8 = maybe_read_file (filename);
> -  if (!text_utf8)
> -    return NULL;
> -
> -  json::object *artifact_content_obj = new json::object ();
> -  artifact_content_obj->set ("text", new json::string (text_utf8));
> -  free (text_utf8);
> -
> +  json::object *artifact_content_obj = nullptr;
> +  if (fb.second)
> +    {
> +      artifact_content_obj = new json::object ();
> +      artifact_content_obj->set ("text", new json::string (fb.first,
> +                                                         
> fb.second));
> +    }
> +  else if (char *text_utf8 = maybe_read_file (fb.first))
> +    {
> +      artifact_content_obj = new json::object ();
> +      artifact_content_obj->set ("text", new json::string
> (text_utf8));
> +      free (text_utf8);
> +    }
>    return artifact_content_obj;
>  }
>  
> @@ -1348,15 +1370,14 @@
> sarif_builder::maybe_make_artifact_content_object (const char
> *filename) const
>     a freshly-allocated 0-terminated buffer containing them, or
> NULL.  */
>  
>  static char *
> -get_source_lines (const char *filename,
> -                 int start_line,
> +get_source_lines (expanded_location xloc,
>                   int end_line)
>  {
>    auto_vec<char> result;
>  
> -  for (int line = start_line; line <= end_line; line++)
> +  for (int line = xloc.line; line <= end_line; line++)
>      {
> -      char_span line_content = location_get_source_line (filename,
> line);
> +      char_span line_content = location_get_source_line (xloc,
> line);
>        if (!line_content.get_buffer ())
>         return NULL;
>        result.reserve (line_content.length () + 1);
> @@ -1370,14 +1391,13 @@ get_source_lines (const char *filename,
>  }
>  
>  /* Make an artifactContent object (SARIF v2.1.0 section 3.3) for the
> given
> -   run of lines within FILENAME (including the endpoints).  */
> +   run of lines starting at XLOC (including the endpoints).  */
>  
>  json::object *
> -sarif_builder::maybe_make_artifact_content_object (const char
> *filename,
> -                                                  int start_line,
> +sarif_builder::maybe_make_artifact_content_object (expanded_location
> xloc,
>                                                    int end_line)
> const
>  {
> -  char *text_utf8 = get_source_lines (filename, start_line,
> end_line);
> +  char *text_utf8 = get_source_lines (xloc, end_line);
>  
>    if (!text_utf8)
>      return NULL;
> diff --git a/gcc/edit-context.cc b/gcc/edit-context.cc
> index 6879ddd41b4..aa95bc0834f 100644
> --- a/gcc/edit-context.cc
> +++ b/gcc/edit-context.cc
> @@ -301,8 +301,12 @@ edit_context::apply_fixit (const fixit_hint
> *hint)
>      return false;
>    if (start.column == 0)
>      return false;
> +  if (start.generated_data)
> +    return false;
>    if (next_loc.column == 0)
>      return false;
> +  if (next_loc.generated_data)
> +    return false;
>  
>    edited_file &file = get_or_insert_file (start.file);
>    if (!m_valid)
> diff --git a/gcc/json.cc b/gcc/json.cc
> index 974f8c36825..3ebe8495e96 100644
> --- a/gcc/json.cc
> +++ b/gcc/json.cc
> @@ -190,6 +190,15 @@ string::string (const char *utf8)
>  {
>    gcc_assert (utf8);
>    m_utf8 = xstrdup (utf8);
> +  m_len = strlen (utf8);
> +}
> +
> +string::string (const char *utf8, size_t len)
> +{
> +  gcc_assert (utf8);
> +  m_utf8 = XNEWVEC (char, len);
> +  m_len = len;
> +  memcpy (m_utf8, utf8, len);
>  }
>  
>  /* Implementation of json::value::print for json::string.  */
> @@ -198,9 +207,9 @@ void
>  string::print (pretty_printer *pp) const
>  {
>    pp_character (pp, '"');
> -  for (const char *ptr = m_utf8; *ptr; ptr++)
> +  for (size_t i = 0; i != m_len; ++i)
>      {
> -      char ch = *ptr;
> +      char ch = m_utf8[i];
>        switch (ch)
>         {
>         case '"':
> @@ -224,7 +233,9 @@ string::print (pretty_printer *pp) const
>         case '\t':
>           pp_string (pp, "\\t");
>           break;
> -
> +       case '\0':
> +         pp_string (pp, "\\0");
> +         break;
>         default:
>           pp_character (pp, ch);
>         }
> diff --git a/gcc/json.h b/gcc/json.h
> index f272981259b..f7afd843dc5 100644
> --- a/gcc/json.h
> +++ b/gcc/json.h
> @@ -156,16 +156,19 @@ class integer_number : public value
>  class string : public value
>  {
>   public:
> -  string (const char *utf8);
> +  explicit string (const char *utf8);
> +  string (const char *utf8, size_t len);
>    ~string () { free (m_utf8); }
>  
>    enum kind get_kind () const final override { return JSON_STRING; }
>    void print (pretty_printer *pp) const final override;
>  
>    const char *get_string () const { return m_utf8; }
> +  size_t get_length () const { return m_len; }
>  
>   private:
>    char *m_utf8;
> +  size_t m_len;
>  };
>  
>  /* Subclass of value for the three JSON literals "true", "false",
> 


  reply	other threads:[~2022-11-04 16:42 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 13:44 [PATCH 0/6] diagnostics: libcpp: Overhaul locations for _Pragma tokens Lewis Hyatt
2022-11-04 13:44 ` [PATCH 1/6] diagnostics: Fix macro tracking for ad-hoc locations Lewis Hyatt
2022-11-04 15:53   ` David Malcolm
2022-11-04 13:44 ` [PATCH 2/6] diagnostics: Use an inline function rather than hardcoding <built-in> string Lewis Hyatt
2022-11-04 15:55   ` David Malcolm
2022-11-04 13:44 ` [PATCH 3/6] libcpp: Fix paste error with unknown pragma after macro expansion Lewis Hyatt
2022-11-21 17:50   ` Jeff Law
2022-11-04 13:44 ` [PATCH 4/6] diagnostics: libcpp: Add LC_GEN linemaps to support in-memory buffers Lewis Hyatt
2022-11-05 16:23   ` David Malcolm
2022-11-05 17:28     ` Lewis Hyatt
2022-11-17 21:21     ` Lewis Hyatt
2023-01-05 22:34       ` Lewis Hyatt
2022-11-04 13:44 ` [PATCH 5/6] diagnostics: Support generated data in additional contexts Lewis Hyatt
2022-11-04 16:42   ` David Malcolm [this message]
2022-11-04 21:05     ` Lewis Hyatt
2022-11-05  1:54       ` [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string David Malcolm
2022-11-05  1:55       ` [PATCH 5a/6] diagnostics: Handle generated data locations in edit_context David Malcolm
2022-11-04 13:44 ` [PATCH 6/6] diagnostics: libcpp: Assign real locations to the tokens inside _Pragma strings Lewis Hyatt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b60b7a4a698554fdcfcd635b7de2fe87d12b70d2.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=lhyatt@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).