From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jason@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id 4FB553857C71
 for <gcc-patches@gcc.gnu.org>; Fri, 27 Aug 2021 21:33:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4FB553857C71
Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com
 [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-390-X5vJB4R6N12ls2rTafILWA-1; Fri, 27 Aug 2021 17:33:15 -0400
X-MC-Unique: X5vJB4R6N12ls2rTafILWA-1
Received: by mail-qv1-f71.google.com with SMTP id
 u8-20020a0cee88000000b00363b89e1c50so1154949qvr.16
 for <gcc-patches@gcc.gnu.org>; Fri, 27 Aug 2021 14:33:15 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:subject:to:references:cc:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=2xDmht4Pw3voJDBIm8r3kof5P600SBshack+G8UuaF4=;
 b=G8tLLCDHC0z90g7zfaZYPYRZZMNAYVt+7FnZWTTsQGvC260WNekw1JS8vvEFqYrxSq
 yc0EBO/BDGy3KO0KewFRs05REd4bTqu7Fh+E0yA8lKg4l2Z5f6ojErA1Jv7ErCZDVU4J
 QKXAErqpzYu2c9sfoSpfcW35GxCwSIS6PFLV54jBUeHbip6w/ei7EukZ3H8ZnrPHop3f
 nN1TZ2D1X94tah7NWtZCdwmu7bg7ga/7EwbvW1fKvqYBlYQMlrcDjdeIsF2c4MUGapYt
 EXFOjeg6fG3REeiWIeRnA+GX+6znVzTzfsqlzwZAtJNtp1ZVWAZ52x9zG+B4V1ARHJ7u
 o7gw==
X-Gm-Message-State: AOAM532H4yhpFG7kQdATobCF9kxB5R5tOt3n4Ybi62ymbc7YT6k4B3al
 5HM8/CErdpCzhqeufkAQCp9InDGkZn+c/ZCuq8YLX10cga7WSWt3Q8L/Cwth1SbwIYqICG+D5/D
 N3SgcJeWGBPYybRCBZg==
X-Received: by 2002:ac8:6886:: with SMTP id m6mr10380017qtq.255.1630099993903; 
 Fri, 27 Aug 2021 14:33:13 -0700 (PDT)
X-Google-Smtp-Source: ABdhPJyNpDASFk/KSoV259m9MFsPUYBOztJiMWvRJm1zJBn7NoTuLweekm7jSBnY3TyW6QINcMW/EA==
X-Received: by 2002:ac8:6886:: with SMTP id m6mr10379997qtq.255.1630099993451; 
 Fri, 27 Aug 2021 14:33:13 -0700 (PDT)
Received: from [192.168.1.149]
 (130-44-159-43.s11817.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com.
 [130.44.159.43])
 by smtp.gmail.com with ESMTPSA id f2sm4186611qth.11.2021.08.27.14.33.12
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Fri, 27 Aug 2021 14:33:12 -0700 (PDT)
From: Jason Merrill <jason@redhat.com>
Subject: Re: [PATCH] c++: error message for dependent template members
 [PR70417]
To: Anthony Sharp <anthonysharp15@gmail.com>, gcc-patches@gcc.gnu.org
References: <CA+Lh_A=-T4sdnCNmTrPw+M1apohZJ0VcYq0aJTmXO4iWGENduQ@mail.gmail.com>
Cc: Marek Polacek <polacek@redhat.com>
Message-ID: <1d85c433-5176-1edb-b65a-e5ab1386387d@redhat.com>
Date: Fri, 27 Aug 2021 17:33:11 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.13.0
MIME-Version: 1.0
In-Reply-To: <CA+Lh_A=-T4sdnCNmTrPw+M1apohZJ0VcYq0aJTmXO4iWGENduQ@mail.gmail.com>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, NICE_REPLY_A,
 RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 27 Aug 2021 21:33:37 -0000

On 8/20/21 12:56 PM, Anthony Sharp via Gcc-patches wrote:
> Hi, hope everyone is well. I have a patch here for issue 70417
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC
> noob, and this is probably the hardest thing I have ever coded in my
> life, so please forgive any initial mistakes!
> 
> TLDR
> 
> This patch introduces a helpful error message when the user attempts
> to put a template-id after a member access token (., -> or ::) in a
> dependent context without having put the "template" keyword after the
> access token.

Great, thanks!

> CONTEXT
> 
> In C++, when referencing a class member (using ., -> or ::) in a
> dependent context, if that member is a template, the template keyword
> is required after the access token. For example:
> 
> template<class T> void MyDependentTemplate(T t)
> {
>    t.DoSomething<T>(); /* Error - t is dependent. "<" is treated as a
> less-than sign because DoSomething is assumed to be a non-template
> identifier. */
>    t.template DoSomething<T>(); // Good.
> }
> 
> PATCH EXPLANATION
> 
> In order to throw an appropriate error message we need to identify
> and/or create a point in the compiler where the following conditions
> are all simultaneously satisfied:
> 
> A) a class member access token (., ->, ::)
> B) a dependent context
> C) the thing being accessed is a template-id
> D) No template keyword
> 
> A, B and D are relatively straightforward and I won't go into detail
> about how that was achieved. The real challenge is C - figuring out
> whether a name is a template-id.
> 
> Usually, we would look up the name and use that to determine whether
> the name is a template or not. But, we cannot do that. We are in a
> dependent context, so lookup cannot (in theory) find anything at the
> point the access expression is parsed.
> 
> We (maybe) could defer checking until the template is actually
> instantiated. But this is not the approach I have gone for, since this
> to me sounded unnecessarily awkward and clunky. In my mind this also
> seems like a syntax error, and IMO it seems more natural that syntax
> errors should get caught as soon as they are parsed.
> 
> So, the solution I went for was to figure out whether the name was a
> template by looking at the surrounding tokens. The key to this lies in
> being able to differentiate between the start and end of a template
> parameter list (< and >) and greater-than and less-than tokens (and
> perhaps also things like << or >>). If the final > (if we find one at
> all) does indeed mark the end of a class or function template, then it
> will be followed by something like (, ::, or even just ;. On the other
> hand, a greater-than operator would be followed by a name.
> 
> PERFORMANCE IMPACT
> 
> My concern with this approach was that checking this would actually
> slow the compiler down. In the end it seemed the opposite was true. By
> parsing the tokens to determine whether the name looks like a
> template-id, we can cut out a lot of the template-checking code that
> gets run trying to find a template when there is none, making compile
> time generally faster. That being said, I'm not sure if the
> improvement will be that substantial, so I did not feel it necessary
> to introduce the template checking method everywhere where we could
> possibly have run into a template.
> 
> I ran an ABAB test with the following code repeated enough times to
> fill up about 10,000 lines:
> 
> ai = 1;
> bi = 2;
> ci = 3;
> c.x = 4;
> (&c)->x = 5;
> mytemplateclass<Y>::y = 6;
> c.template class_func<Y>();
> (&c)->template class_func<Y>();
> mytemplateclass<Y>::template class_func_static<Y>();
> c2.class_func<myclass>();
> (&c2)->class_func<myclass>();
> myclass::class_func_static<myclass>();
> ai = 6;
> bi = 5;
> ci = 4;
> c.x = 3;
> (&c)->x = 2;
> mytemplateclass<Y>::y = 1;
> c.template class_func<Y>();
> (&c)->template class_func<Y>();
> mytemplateclass<Y>::template class_func_static<Y>();
> c2.class_func<myclass>();
> (&c2)->class_func<myclass>();
> myclass::class_func_static<myclass>();
> 
> It basically just contains a mixture of class access expressions plus
> some regular assignments for good measure. The compiler was compiled
> without any optimisations (and so slower than usual). In hindsight,
> perhaps this test case assumes the worst-ish-case scenario since it
> contains a lot of templates; most of the benefits of this patch occur
> when parsing non-templates.
> 
> The compiler (clean) and the compiler with the patch applied (patched)
> compiled the above code 20 times each in ABAB fashion. On average, the
> clean compiler took 2.26295 seconds and the patched compiler took
> 2.25145 seconds (about 0.508% faster). Whether this improvement
> remains or disappears when the compiler is built with optimisations
> turned on I haven't tested. My main concern was just making sure it
> didn't get any slower.
> 
> AWKWARD TEST CASES
> 
> You will see from the patch that it required the updating of a few
> testcases (as well as one place within the compiler). I believe this
> is because up until now, the compiler allowed the omission of the
> template keyword in some places it shouldn't have. Take decltype34.C
> for example:
> 
> struct impl
> {
>    template <class T> static T create();
> };
> 
> template<class T, class U, class =
> decltype(impl::create<T>()->impl::create<U>())>
> struct tester{};
> 
> GCC currently accepts this code. My patch causes it to be rejected.

> For what it's worth, Clang also rejects this code:
> 
> 1824786093/source.cpp:6:70: error: use 'template' keyword to treat
> 'create' as a dependent template name
> template<class T, class U, class =
> decltype(impl::create<T>()->impl::create<U>())>
> 
>                                        ^ template
> 
> I think Clang is correct since from what I understand it should be
> written as impl::create<T>()->impl::template create<U>(). Here,
> create<T>() is dependent, so it makes sense that we would need
> "template" before the scope operator. Then again, I could well be
> wrong. The rules around dependent template names are pretty crazy.

This is basically core issue 1835, http://wg21.link/cwg1835

This was changed for C++23 by the paper "Declarations and where to find 
them", http://wg21.link/p1787

Previously, e.g. in C++20, 
https://timsong-cpp.github.io/cppwp/n4861/basic.lookup.classref#4 said 
that when we see ->impl, since we can't look up the name in the 
(dependent) object type, we do unqualified lookup and find ::impl, so 
create<U> is not in a dependent scope, and the testcase is fine.

The 1835 change clarifies that we only do unqualified lookup of the 
second impl if the object type is non-dependent, so the second create is 
in a dependent scope, and we need the ::template.

But in either case, whether create<U> is in a dependent scope depends on 
how we resolve impl::, we don't need to remember further back in the 
expression.  So your dependent_expression_p parameter seems like the 
wrong approach.  Note that when we're looking up the name after ->, the 
type of the object expression is in parser->context->object_type.

1835 also makes the following testcase valid, which we don't currently 
accept, but clang does:

template <class T> void f(T t) { t.foo::bar(); }
struct foo { void bar(); };
int main() { f(foo()); }

For C++20 and down, I want to start accepting this testcase, but also 
still accept decltype34.C to avoid breaking existing code.  But that's a 
separate issue; I don't think your patch should affect decltype34.

The cases you fixed in symbol-summary.h are indeed mistakes, but not 
ill-formed, so giving an error on them is wrong.  For example, here is a 
well-formed program that is rejected with your patch:

template <class T, int N> void f(T t) { t.m<N>(0); }
struct A { int m; } a;
int main() { f<A,42>(a); }

Comparing the result of < by > is very unlikely to be what the user 
actually meant, but it is potentially valid.  So we should accept f with 
a warning about the dubious expression.  People can silence the warning 
if they really mean this by parenthesizing the LHS of the < operator.

Another valid program, using ::

int x;
template <class T, int N> void f(T t) { t.m<N>::x; }
struct A { int m; } a;
int main() { f<A,42>(a); }

Now to reviewing the actual patch:

> +   yet).  Return 1 if it does look like a template-id.  Return 2
> +   if not.  */

Let's use -1 for definitely not.

> +  /* Could be a ~ referencing the destructor of a class template.
> +     Temporarily eat it up so it's not in our way.  */
> +  if (next_token->type == CPP_COMPL)
> +    {
> +      cp_lexer_save_tokens (parser->lexer);
> +      cp_lexer_consume_token (parser->lexer);
> +      next_token = cp_lexer_peek_token (parser->lexer);
> +      saved = true;
> +    }

There's no ambiguity in the case of a destructor, as you note in your 
comments in cp_parser_id_expression.  How about returning early if we see ~?

> +  /* Find offset of final ">".  */
> +  for (; depth > 0; i++)
> +    {
> +      switch (cp_lexer_peek_nth_token (parser->lexer, i)->type)
> +	 {
> +	   case CPP_LESS:
> +	     depth++;
> +	     break;
> +	   case CPP_GREATER:
> +	     depth--;
> +	     break;
> +	   case CPP_RSHIFT:
> +	     depth-=2;
> +	     break;
> +	   case CPP_EOF:
> +	   case CPP_SEMICOLON:
> +	     goto no;
> +	   default:
> +	     break;
> +	 }
> +    }

This code doesn't handle skipping matched ()/{}/[] in the 
template-argument-list.  You probably want to involve 
cp_parser_skip_to_end_of_template_parameter_list somehow.

> +no:
> +  if (saved)
> +    cp_lexer_rollback_tokens (parser->lexer);
> +  return 2;
> +
> +yes:
> +  if (saved)
> +    cp_lexer_rollback_tokens (parser->lexer);
> +  return 1;

Now that we're writing C++, I'd prefer to avoid this kind of pattern in 
favor of RAII, such as saved_token_sentinel.  If this is still relevant 
after addressing the above comments.

> +   If DEPENDENT_EXPRESSION_P is true, this is a dependent id-expression
> +   that is not the current instantiation.  */

As mentioned above, let's not add this parameter.

> +      error_at (next_token->location,
> +		"expected \"template\" keyword before dependent template "
> +		"name");

As mentioned above, this should be a warning.

> -	/* If it worked, we're done.  */
> -	if (cp_parser_parse_definitely (parser))
> -	  return id;
> +  if (looks_like_template != 2)
> +    {
> +      /* We don't know yet whether or not this will be a
> +	 template-id.  */

The indentation here is off.

> +			  int looks_like_template)

Let's give these parms a default argument of 0.  And call them 
looks_like_template_id to be clearer.

> -  /* Avoid performing name lookup if there is no possibility of
> -     finding a template-id.  */
> -  if ((token->type != CPP_NAME && token->keyword != RID_OPERATOR)
> -      || (token->type == CPP_NAME
> -	  && !cp_parser_nth_token_starts_template_argument_list_p
> -	       (parser, 2)))
> +  /* Only if we have not already checked whether this looks like a
> +     template.  */
> +  if (looks_like_template == 0)
>      {
> -      cp_parser_error (parser, "expected template-id");
> -      return error_mark_node;
> +      /* Avoid performing name lookup if there is no possibility of
> +	 finding a template-id.  */

You could add the looks_like_template_id check to the existing condition 
so we don't need to reindent the comment or body.

> +++ b/gcc/symbol-summary.h
> @@ -191,7 +191,7 @@ public:
>    template<typename Arg, bool (*f)(const T &, Arg)>
>    void traverse (Arg a) const
>    {
> -    m_map.traverse <f> (a);
> +    m_map.template traverse <f> (a);

I've gone ahead and applied this fix as a separate patch.

Jason