public inbox for jit@gcc.gnu.org
 help / color / mirror / Atom feed
From: Andrea Corallo <andrea.corallo@arm.com>
To: David Malcolm <dmalcolm@redhat.com>
Cc: "jit@gcc.gnu.org" <jit@gcc.gnu.org>,
	 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	 nd <nd@arm.com>
Subject: Re: [PATCH][gcc] libgccjit: handle long literals in playback::context::new_string_literal
Date: Sat, 07 Mar 2020 21:43:58 +0000	[thread overview]
Message-ID: <878skb263l.fsf@arm.com> (raw)
In-Reply-To: <cef891b1dde427f0c856e99a32ded3a1799bc32d.camel@redhat.com> (David Malcolm's message of "Thu, 05 Mar 2020 22:18:46 -0500")

David Malcolm <dmalcolm@redhat.com> writes:

> On Mon, 2019-09-02 at 09:16 +0000, Andrea Corallo wrote:
>> Hi all,
>> yesterday I've found an interesting bug in libgccjit.
>> Seems we have an hard limitation of 200 characters for literal
>> strings.
>> Attempting to create longer strings lead to ICE during pass_expand
>> while performing a sanity check in get_constant_size.
>>
>> Tracking down the issue seems the code we have was inspired from
>> c-family/c-common.c:c_common_nodes_and_builtins were
>> array_domain_type
>> is actually defined with a size of 200.
>> The comment that follows that point sounded premonitory :) :)
>>
>> /* Make a type for arrays of characters.
>>    With luck nothing will ever really depend on the length of this
>>    array type.  */
>>
>> At least in the current implementation the type is set by
>> fix_string_type were the actual string length is taken in account.
>>
>> I attach a patch updating the logic accordingly and a new testcase
>>  for that.
>>
>> make check-jit is passing clean.
>>
>> Best Regards
>>   Andrea
>
> Sorry about the long delay in reviewing this patch.
>
>> gcc/jit/ChangeLog
>> 2019-??-??  Andrea Corallo  <andrea.corallo@arm.com>
>>
>>         * jit-playback.h
>>         (gcc::jit::recording::context m_recording_ctxt): Remove
>                      ^^^^^^^^^
> 		     "playback" here
>
>>         m_char_array_type_node field.
>
> [...]
>
>> @@ -670,9 +669,12 @@ playback::rvalue *
>>  playback::context::
>>  new_string_literal (const char *value)
>>  {
>> -  tree t_str = build_string (strlen (value), value);
>> -  gcc_assert (m_char_array_type_node);
>> -  TREE_TYPE (t_str) = m_char_array_type_node;
>> +  /* Compare with c-family/c-common.c: fix_string_type.  */
>> +  size_t len = strlen (value);
>> +  tree i_type = build_index_type (size_int (len));
>> +  tree a_type = build_array_type (char_type_node, i_type);
>> +  tree t_str = build_string (len, value);
>> +  TREE_TYPE (t_str) = a_type;
>
> This code works with string lengths and string sizes which always
> requires a little extra care.  I'd like to see at least a comment
> discussing this, as it's not immediately clear to me that we're
> correctly handling the NUL terminator here.
>
> Consider the string "foo".
> This has strlen == 3, and its size is 4 chars.
>
> build_string's comment says "Note that for a C string literal, LEN
> should include the trailing NUL."
>
> However, build_string appears to allocate one byte more than LEN, and
> write a NUL in that final position.
>
> fix_string_type has:
>
>   int length = TREE_STRING_LENGTH (value);
>
> where tree.h has:
>
>   /* In C terms, this is sizeof, not strlen.  */
>   #define TREE_STRING_LENGTH(NODE) (STRING_CST_CHECK (NODE)-
>>string.length)
>
> and:
>   nchars = length / charsz;
>
> where for our purposes charsz == 1 and so nchars == length.
>
> fix_string_type has:
>
>   i_type = build_index_type (size_int (nchars - 1));
>
> So I *think* the patch is correct, but there ought to be a comment at
> least.
>
> Or maybe use TREE_STRING_LENGTH (t_str) to more closely follow
> fix_string_type?
>

Mmmm when I wrote the patch I looked into the build_string code missing
the comment and that was my understanding too.  But looking better now
in the compiler I think the patch is *not* correct.  The trouble is that
if we call build_string (3, "foo") (as we would do now) we set
TREE_STRING_LENGTH to 3 and this appears to be wrong.

Actually build_string can be called using the strlen as argument (see
for example the various definition of DEF_ATTR_STRING) but apparently
this is not how is done for C strings in the frontends.  Given that this
is what the comment suggests and that modifying the patch with the +1
keeps it functional I think doing what the front-end does is the right
thing.

> Also, please add an assertion and comment to the testcase to assert
> that the very long string is indeed longer than the previous limit of
> 200.

Ack

> Thanks
> Dave

Thanks for reviewing

  Andrea

  reply	other threads:[~2020-03-07 21:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-01  0:00 Andrea Corallo
2019-01-01  0:00 ` Andrea Corallo
2019-01-01  0:00   ` Andrea Corallo
2019-01-01  0:00     ` Andrea Corallo
2020-01-01  0:00 ` David Malcolm
2020-03-07 21:43   ` Andrea Corallo [this message]
2020-03-09 22:20     ` [PATCH v2][gcc] " Andrea Corallo
2020-03-18 22:52       ` Andrea Corallo
2020-03-21  1:41       ` David Malcolm
2020-03-23 18:04         ` [committed][gcc] " Andrea Corallo
     [not found] <1579772431.117590.ezmlm@gcc.gnu.org>
2020-01-01  0:00 ` [PATCH][gcc] " Andrea Corallo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878skb263l.fsf@arm.com \
    --to=andrea.corallo@arm.com \
    --cc=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jit@gcc.gnu.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).