public inbox for gcc-patches@gcc.gnu.org
From: David Malcolm <dmalcolm@redhat.com>
To: Lewis Hyatt <lhyatt@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 5b/6] diagnostics: Remove null-termination requirement for json::string
Date: Fri, 04 Nov 2022 21:54:55 -0400
Message-ID: <6869af802df401a086e4da6397410906b9765c62.camel@redhat.com>
In-Reply-To: <20221104210550.GA92497@ldh-imac.local>

On Fri, 2022-11-04 at 17:05 -0400, Lewis Hyatt wrote:
> [PATCH 5b/6] diagnostics: Remove null-termination requirement for
> json::string
> 
> json::string currently handles null-terminated data and so can't work with
> data that may contain embedded null bytes or that is not null-terminated.
> Supporting such data will make json::string more robust in some contexts,
> such as SARIF output, which uses it to output user source code that may
> contain embedded null bytes.
> 
> gcc/ChangeLog:
> 
> 	* json.h (class string): Add M_LEN member to store the length of
> 	the data.  Add constructor taking an explicit length.
> 	* json.cc (string::string): Implement the new constructor.
> 	(string::print): Support printing strings that are not
> 	null-terminated.  Escape embedded null bytes on output.
> 	(test_writing_strings): Test the new null-byte-related features of
> 	json::string.
> 

[...snip...]
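
For illustration only, here is a minimal sketch of what printing with an
explicit length and escaping embedded null bytes could look like.  It is
not the code from the patch (the json.cc hunk is snipped above):
quote_json_string is a hypothetical helper, it builds a std::string
rather than writing to a pretty_printer, and it assumes the \u0000 form
that RFC 8259 prescribes for control characters, which may differ from
the escape the patch actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not GCC's json::string::print: quote the LEN
// bytes starting at UTF8 as a JSON string literal.  The loop is bounded
// by LEN, so the input need not be null-terminated and may contain
// embedded null bytes.
std::string
quote_json_string (const char *utf8, size_t len)
{
  std::string out = "\"";
  for (size_t i = 0; i < len; i++)
    {
      unsigned char ch = static_cast<unsigned char> (utf8[i]);
      switch (ch)
        {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b"; break;
        case '\f': out += "\\f"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:
          if (ch < 0x20)
            {
              // Remaining control characters, including NUL, are
              // written as \uXXXX (assumed form, see note above).
              char buf[8];
              std::snprintf (buf, sizeof buf, "\\u%04x",
                             static_cast<unsigned> (ch));
              out += buf;
            }
          else
            out += static_cast<char> (ch);
          break;
        }
    }
  out += '"';
  return out;
}
```

Calling quote_json_string ("a\0b", 3) visits all three bytes, whereas a
loop that stops at the first null byte would only see "a".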

> diff --git a/gcc/json.h b/gcc/json.h
> index f272981259b..f7afd843dc5 100644
> --- a/gcc/json.h
> +++ b/gcc/json.h
> @@ -156,16 +156,19 @@ class integer_number : public value
>  class string : public value
>  {
>   public:
> -  string (const char *utf8);
> +  explicit string (const char *utf8);
> +  string (const char *utf8, size_t len);
>    ~string () { free (m_utf8); }
>  
>    enum kind get_kind () const final override { return JSON_STRING; }
>    void print (pretty_printer *pp) const final override;
>  
>    const char *get_string () const { return m_utf8; }

I was worried that json::string::get_string used to return a
NUL-terminated string but no longer guarantees termination, and that
this might break a caller.  But I checked, and it seems that this
accessor doesn't get used anywhere in our source tree.

> +  size_t get_length () const { return m_len; }

Does anything actually use this?

Perhaps it might make sense to delete the get_string accessor, and if
we ever need one, replace it with an accessor that returns a char_span?
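
To make the suggestion concrete, a rough sketch of an accessor that
returns pointer and length together might look like the following.  All
names here are hypothetical: byte_span is a stand-in so that the example
is self-contained (it is not GCC's char_span), json_string_sketch is a
simplified analogue of json::string, and storing the bytes without a
terminator is an assumption about the implementation, not something
shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a pointer-plus-length view.
struct byte_span
{
  const char *ptr;
  size_t len;
};

// Hypothetical, simplified analogue of json::string.
class json_string_sketch
{
 public:
  json_string_sketch (const char *utf8, size_t len)
    : m_utf8 (static_cast<char *> (std::malloc (len ? len : 1))),
      m_len (len)
  {
    if (!m_utf8)
      std::abort ();
    // Copy exactly LEN bytes; no terminator is appended (assumption).
    std::memcpy (m_utf8, utf8, len);
  }
  ~json_string_sketch () { std::free (m_utf8); }

  // The class owns its buffer, so forbid copying in this sketch.
  json_string_sketch (const json_string_sketch &) = delete;
  json_string_sketch &operator= (const json_string_sketch &) = delete;

  // Callers get the length along with the pointer, so they cannot
  // silently rely on null termination.
  byte_span get_span () const { return byte_span {m_utf8, m_len}; }

 private:
  char *m_utf8;
  size_t m_len;
};
```

With the bytes "a\0b" and a length of 3, get_span ().len is 3, while
strlen on the same data would report 1; a bare const char * accessor
invites exactly that kind of truncation.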

>  
>   private:
>    char *m_utf8;
> +  size_t m_len;
>  };
>  

Thanks for adding the unit test.

The 5b patch is OK for trunk.

Dave


