public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Lewis Hyatt <lhyatt@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Jason Merrill <jason@redhat.com>, Nathan Sidwell <nathan@acm.org>,
	gcc-patches@gcc.gnu.org
Subject: Re: Ping^3: [PATCH] libcpp: Handle extended characters in user-defined literal suffix [PR103902]
Date: Fri, 10 Feb 2023 11:58:03 -0500	[thread overview]
Message-ID: <CAA_5UQ5aHu_mMTM6tFNuk3FXmS+HK=MK7uj5f4c9cd6VBdj68A@mail.gmail.com> (raw)
In-Reply-To: <Y+ZxJ/w2ffHc/lES@tucnak>

On Fri, Feb 10, 2023 at 11:30 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Sep 26, 2022 at 06:27:25PM -0400, Lewis Hyatt via Gcc-patches wrote:
> > May I please ping this patch again? Joseph suggested that it would be best if
> > a C++ maintainer has a look at it. This is one of just a few places left where
> > we don't handle UTF-8 properly in libcpp, it would be really nice to get them
> > fixed up if there is time to review this patch. Thanks!
>
> CCing them.
>
> Just some nits from me, but I agree C++ maintainers are the best reviewers
> for this.

Thanks so much for looking it over, I really appreciate it. I'll be
sure to incorporate all your feedback along with those from the full
review.

Is this for stage 1 at this point BTW?

One note, the patch as-is doesn't quite apply to master branch
nowadays, it just needs a small tweak since warn_about_normalization()
has acquired a new argument in the meantime. If it's helpful I can
resend it with this addressed, as well as the rest of your comments?

Finally one comment here:

> > +      if (const auto sr = scan_cur_identifier (pfile))
> > +     {
> > +       /* If a string format macro, say from inttypes.h, is placed touching
> > +          a string literal it could be parsed as a C++11 user-defined
> > +          string literal thus breaking the program.  User-defined literals
> > +          outside of namespace std must start with a single underscore, so
> > +          assume anything of that form really is a UDL suffix.  We don't
> > +          need to worry about UDLs defined inside namespace std because
> > +          their names are reserved, so cannot be used as macro names in
> > +          valid programs.  */
> > +       if ((suffix_begin[0] != '_' || suffix_begin[1] == '_')
> > +           && cpp_macro_p (sr.node))
>
> What is the advantage of dropping is_macro_not_literal_suffix and
> hand-inlining it in two different spots?
> Couldn't even the actual warning be moved into an inline function?

The is_macro() function was doing two jobs, first lexing the
identifier and looking it up in the hash table, and then calling
cpp_macro_p(). This was a bit duplicative because the identifier was
then immediately lexed again after the check. Since lexing it became
more complicated with UTF-8 support, I changed it not to duplicate
that effort and instead scan_cur_identifer() does the job once. With
that done, all that's left for is_macro() to do is just the one line
check so I got rid of it. However, I agree that the check about
suffix_begin is not really trivial and so factoring this out into one
place instead of two makes sense. I'll try to move the whole warning
into its own function in the next iteration.

> Otherwise it looks reasonable to me, but I'd still prefer Jason or Nathan
> to review this.
>
>         Jakub
>

Thanks again.

-Lewis

  reply	other threads:[~2023-02-10 16:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14 21:26 Lewis Hyatt
2022-06-15 19:06 ` Lewis Hyatt
2022-09-26 22:27   ` Ping^3: " Lewis Hyatt
2023-02-10 16:30     ` Jakub Jelinek
2023-02-10 16:58       ` Lewis Hyatt [this message]
2023-02-10 17:10         ` Jakub Jelinek
2023-02-15 18:39     ` Jason Merrill
2023-02-15 23:18       ` Lewis Hyatt
2023-03-02 23:07         ` [PATCH v2] " Lewis Hyatt
2023-07-18 20:33           ` Jason Merrill

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAA_5UQ5aHu_mMTM6tFNuk3FXmS+HK=MK7uj5f4c9cd6VBdj68A@mail.gmail.com' \
    --to=lhyatt@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=jason@redhat.com \
    --cc=nathan@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).