public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Marek Polacek <polacek@redhat.com>,
	"Joseph S. Myers" <joseph@codesourcery.com>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libcpp: Implement C++23 P2290R3 - Delimited escape sequences [PR106645]
Date: Wed, 17 Aug 2022 22:22:03 -0400	[thread overview]
Message-ID: <720d71cc-0883-f8e4-2321-2e2594cf93aa@redhat.com> (raw)
In-Reply-To: <Yv1bUmjit6zH+Jw0@tucnak>

On 8/17/22 14:19, Jakub Jelinek wrote:
> On Wed, Aug 17, 2022 at 04:47:19PM -0400, Jason Merrill via Gcc-patches wrote:
>>> +	  length = 32;
>>
>> /* Magic value to indicate no digits seen.  */
> 
> Indeed, will add the comment.
> 
>>> +	  delimited = true;
>>> +	  if (loc_reader)
>>> +	    char_range->m_finish = loc_reader->get_next ().m_finish;
>>> +	}
>>> +    }
>>>      else if (str[-1] == 'U')
>>>        length = 8;
>>>      else
>>> @@ -1107,6 +1118,8 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>>>      result = 0;
>>>      do
>>>        {
>>> +      if (str == limit)
>>> +	break;
>>>          c = *str;
>>>          if (!ISXDIGIT (c))
>>>    	break;
>>> @@ -1116,9 +1129,41 @@ _cpp_valid_ucn (cpp_reader *pfile, const
>>>    	  gcc_assert (char_range);
>>>    	  char_range->m_finish = loc_reader->get_next ().m_finish;
>>>    	}
>>> +      if (delimited)
>>> +	{
>>> +	  if (!result)
>>> +	    /* Accept arbitrary number of leading zeros.  */
>>> +	    length = 16;
>>> +	  else if (length == 8)
>>> +	    {
>>> +	      /* Make sure we detect overflows.  */
>>> +	      result |= 0x8000000;
>>> +	      ++length;
>>> +	    }
>>
>> 16 above so that this case happens after we read 8 digits after leading
>> zeroes?
> 
> Another magic value less than the no digits seen one and >8,
> so that it can count 8 digits with the first non-zero one after
> which to or in the overflow flag.  The intent is not to break the loop
> if there are further digits, just that there will be overflow.
> Another option would be those overflow |= n ^ (n << 4 >> 4);
> tests that convert_hex does and just making sure length is never decremented
> (except we need a way to distinguish between \u{} and at least one digit).

This way is fine, could just use more comment.

>>> +      if (loc_reader)
>>> +	char_range->m_finish = loc_reader->get_next ().m_finish;
>>
>> Here and in other functions, the pattern of increment the input pointer and
>> update m_finish seems like it should be a macro?
> 
> Perhaps or inline function.  Before my patch, there are 5 such ifs
> (some with char_range.m_finish and others char_range->m_finish),
> the patch adds another 7 such spots.

Either way is fine.

>>> @@ -2119,15 +2255,23 @@ _cpp_interpret_identifier (cpp_reader *p
>>>    	cppchar_t value = 0;
>>>    	size_t bufleft = len - (bufp - buf);
>>>    	int rval;
>>> +	bool delimited = false;
>>>    	idp += 2;
>>> +	if (length == 4 && id[idp] == '{')
>>> +	  {
>>> +	    delimited = true;
>>> +	    idp++;
>>> +	  }
>>>    	while (length && idp < len && ISXDIGIT (id[idp]))
>>>    	  {
>>>    	    value = (value << 4) + hex_value (id[idp]);
>>>    	    idp++;
>>> -	    length--;
>>> +	    if (!delimited)
>>> +	      length--;
>>>    	  }
>>> -	idp--;
>>> +	if (!delimited)
>>> +	  idp--;
>>
>> Don't we need to check that the first non-xdigit is a }?
> 
> The comments and my understanding of the code say that we first
> check what is a valid identifier and the above is only called on
> a valid identifier.  So, if it would be delimited \u{ not terminated
> with }, then it would fail forms_identifier_p and wouldn't be included
> in the range.  Thus e.g. the ISXDIGIT (id[id]) test is probably not needed
> unless delimited is true because we've checked earlier that it has 4 or 8
> hex digits.
> But sure, if you want a id[idp] == '}' test or assertion, it can be
> added.

OK, a comment mentioning this should be sufficient.

Jason


  reply	other threads:[~2022-08-18  2:22 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17  7:17 Jakub Jelinek
2022-08-17 20:47 ` Jason Merrill
2022-08-17 21:19   ` Jakub Jelinek
2022-08-18  2:22     ` Jason Merrill [this message]
2022-08-18  8:17       ` [PATCH] libcpp, v2: " Jakub Jelinek
2022-08-19  0:34         ` Jason Merrill

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=720d71cc-0883-f8e4-2321-2e2594cf93aa@redhat.com \
    --to=jason@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=joseph@codesourcery.com \
    --cc=polacek@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).