From: Jason Merrill <jason@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Marek Polacek <polacek@redhat.com>,
"Joseph S. Myers" <joseph@codesourcery.com>,
gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libcpp: Implement C++23 P2290R3 - Delimited escape sequences [PR106645]
Date: Wed, 17 Aug 2022 22:22:03 -0400 [thread overview]
Message-ID: <720d71cc-0883-f8e4-2321-2e2594cf93aa@redhat.com> (raw)
In-Reply-To: <Yv1bUmjit6zH+Jw0@tucnak>
On 8/17/22 14:19, Jakub Jelinek wrote:
> On Wed, Aug 17, 2022 at 04:47:19PM -0400, Jason Merrill via Gcc-patches wrote:
>>> + length = 32;
>>
>> /* Magic value to indicate no digits seen. */
>
> Indeed, will add the comment.
>
>>> + delimited = true;
>>> + if (loc_reader)
>>> + char_range->m_finish = loc_reader->get_next ().m_finish;
>>> + }
>>> + }
>>> else if (str[-1] == 'U')
>>> length = 8;
>>> else
>>> @@ -1107,6 +1118,8 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>>> result = 0;
>>> do
>>> {
>>> + if (str == limit)
>>> + break;
>>> c = *str;
>>> if (!ISXDIGIT (c))
>>> break;
>>> @@ -1116,9 +1129,41 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>>> gcc_assert (char_range);
>>> char_range->m_finish = loc_reader->get_next ().m_finish;
>>> }
>>> + if (delimited)
>>> + {
>>> + if (!result)
>>> + /* Accept arbitrary number of leading zeros. */
>>> + length = 16;
>>> + else if (length == 8)
>>> + {
>>> + /* Make sure we detect overflows. */
>>> + result |= 0x8000000;
>>> + ++length;
>>> + }
>>
>> 16 above so that this case happens after we read 8 digits after leading
>> zeroes?
>
> Another magic value less than the no digits seen one and >8,
> so that it can count 8 digits with the first non-zero one after
> which to or in the overflow flag. The intent is not to break the loop
> if there are further digits, just that there will be overflow.
> Another option would be those overflow |= n ^ (n << 4 >> 4);
> tests that convert_hex does and just making sure length is never decremented
> (except we need a way to distinguish between \u{} and at least one digit).
This way is fine, could just use more comment.
>>> + if (loc_reader)
>>> + char_range->m_finish = loc_reader->get_next ().m_finish;
>>
>> Here and in other functions, the pattern of increment the input pointer and
>> update m_finish seems like it should be a macro?
>
> Perhaps or inline function. Before my patch, there are 5 such ifs
> (some with char_range.m_finish and others char_range->m_finish),
> the patch adds another 7 such spots.
Either way is fine.
>>> @@ -2119,15 +2255,23 @@ _cpp_interpret_identifier (cpp_reader *p
>>> cppchar_t value = 0;
>>> size_t bufleft = len - (bufp - buf);
>>> int rval;
>>> + bool delimited = false;
>>> idp += 2;
>>> + if (length == 4 && id[idp] == '{')
>>> + {
>>> + delimited = true;
>>> + idp++;
>>> + }
>>> while (length && idp < len && ISXDIGIT (id[idp]))
>>> {
>>> value = (value << 4) + hex_value (id[idp]);
>>> idp++;
>>> - length--;
>>> + if (!delimited)
>>> + length--;
>>> }
>>> - idp--;
>>> + if (!delimited)
>>> + idp--;
>>
>> Don't we need to check that the first non-xdigit is a }?
>
> The comments and my understanding of the code say that we first
> check what is a valid identifier and the above is only called on
> a valid identifier. So, if it would be delimited \u{ not terminated
> with }, then it would fail forms_identifier_p and wouldn't be included
> in the range. Thus e.g. the ISXDIGIT (id[id]) test is probably not needed
> unless delimited is true because we've checked earlier that it has 4 or 8
> hex digits.
> But sure, if you want a id[idp] == '}' test or assertion, it can be
> added.
OK, a comment mentioning this should be sufficient.
Jason
next prev parent reply other threads:[~2022-08-18 2:22 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-17 7:17 Jakub Jelinek
2022-08-17 20:47 ` Jason Merrill
2022-08-17 21:19 ` Jakub Jelinek
2022-08-18 2:22 ` Jason Merrill [this message]
2022-08-18 8:17 ` [PATCH] libcpp, v2: " Jakub Jelinek
2022-08-19 0:34 ` Jason Merrill
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=720d71cc-0883-f8e4-2321-2e2594cf93aa@redhat.com \
--to=jason@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
--cc=joseph@codesourcery.com \
--cc=polacek@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).