From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 4FB553857C71 for ; Fri, 27 Aug 2021 21:33:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4FB553857C71 Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-390-X5vJB4R6N12ls2rTafILWA-1; Fri, 27 Aug 2021 17:33:15 -0400 X-MC-Unique: X5vJB4R6N12ls2rTafILWA-1 Received: by mail-qv1-f71.google.com with SMTP id u8-20020a0cee88000000b00363b89e1c50so1154949qvr.16 for ; Fri, 27 Aug 2021 14:33:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:subject:to:references:cc:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=2xDmht4Pw3voJDBIm8r3kof5P600SBshack+G8UuaF4=; b=G8tLLCDHC0z90g7zfaZYPYRZZMNAYVt+7FnZWTTsQGvC260WNekw1JS8vvEFqYrxSq yc0EBO/BDGy3KO0KewFRs05REd4bTqu7Fh+E0yA8lKg4l2Z5f6ojErA1Jv7ErCZDVU4J QKXAErqpzYu2c9sfoSpfcW35GxCwSIS6PFLV54jBUeHbip6w/ei7EukZ3H8ZnrPHop3f nN1TZ2D1X94tah7NWtZCdwmu7bg7ga/7EwbvW1fKvqYBlYQMlrcDjdeIsF2c4MUGapYt EXFOjeg6fG3REeiWIeRnA+GX+6znVzTzfsqlzwZAtJNtp1ZVWAZ52x9zG+B4V1ARHJ7u o7gw== X-Gm-Message-State: AOAM532H4yhpFG7kQdATobCF9kxB5R5tOt3n4Ybi62ymbc7YT6k4B3al 5HM8/CErdpCzhqeufkAQCp9InDGkZn+c/ZCuq8YLX10cga7WSWt3Q8L/Cwth1SbwIYqICG+D5/D N3SgcJeWGBPYybRCBZg== X-Received: by 2002:ac8:6886:: with SMTP id m6mr10380017qtq.255.1630099993903; Fri, 27 Aug 2021 14:33:13 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyNpDASFk/KSoV259m9MFsPUYBOztJiMWvRJm1zJBn7NoTuLweekm7jSBnY3TyW6QINcMW/EA== X-Received: by 2002:ac8:6886:: with SMTP id m6mr10379997qtq.255.1630099993451; Fri, 27 Aug 2021 14:33:13 -0700 (PDT) Received: from [192.168.1.149] (130-44-159-43.s11817.c3-0.arl-cbr1.sbo-arl.ma.cable.rcncustomer.com. [130.44.159.43]) by smtp.gmail.com with ESMTPSA id f2sm4186611qth.11.2021.08.27.14.33.12 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 27 Aug 2021 14:33:12 -0700 (PDT) From: Jason Merrill Subject: Re: [PATCH] c++: error message for dependent template members [PR70417] To: Anthony Sharp , gcc-patches@gcc.gnu.org References: Cc: Marek Polacek Message-ID: <1d85c433-5176-1edb-b65a-e5ab1386387d@redhat.com> Date: Fri, 27 Aug 2021 17:33:11 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Aug 2021 21:33:37 -0000 On 8/20/21 12:56 PM, Anthony Sharp via Gcc-patches wrote: > Hi, hope everyone is well. I have a patch here for issue 70417 > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC > noob, and this is probably the hardest thing I have ever coded in my > life, so please forgive any initial mistakes! > > TLDR > > This patch introduces a helpful error message when the user attempts > to put a template-id after a member access token (., -> or ::) in a > dependent context without having put the "template" keyword after the > access token. Great, thanks! > CONTEXT > > In C++, when referencing a class member (using ., -> or ::) in a > dependent context, if that member is a template, the template keyword > is required after the access token. For example: > > template void MyDependentTemplate(T t) > { > t.DoSomething(); /* Error - t is dependent. "<" is treated as a > less-than sign because DoSomething is assumed to be a non-template > identifier. */ > t.template DoSomething(); // Good. > } > > PATCH EXPLANATION > > In order to throw an appropriate error message we need to identify > and/or create a point in the compiler where the following conditions > are all simultaneously satisfied: > > A) a class member access token (., ->, ::) > B) a dependent context > C) the thing being accessed is a template-id > D) No template keyword > > A, B and D are relatively straightforward and I won't go into detail > about how that was achieved. The real challenge is C - figuring out > whether a name is a template-id. > > Usually, we would look up the name and use that to determine whether > the name is a template or not. But, we cannot do that. We are in a > dependent context, so lookup cannot (in theory) find anything at the > point the access expression is parsed. > > We (maybe) could defer checking until the template is actually > instantiated. But this is not the approach I have gone for, since this > to me sounded unnecessarily awkward and clunky. In my mind this also > seems like a syntax error, and IMO it seems more natural that syntax > errors should get caught as soon as they are parsed. > > So, the solution I went for was to figure out whether the name was a > template by looking at the surrounding tokens. The key to this lies in > being able to differentiate between the start and end of a template > parameter list (< and >) and greater-than and less-than tokens (and > perhaps also things like << or >>). If the final > (if we find one at > all) does indeed mark the end of a class or function template, then it > will be followed by something like (, ::, or even just ;. On the other > hand, a greater-than operator would be followed by a name. > > PERFORMANCE IMPACT > > My concern with this approach was that checking this would actually > slow the compiler down. In the end it seemed the opposite was true. By > parsing the tokens to determine whether the name looks like a > template-id, we can cut out a lot of the template-checking code that > gets run trying to find a template when there is none, making compile > time generally faster. That being said, I'm not sure if the > improvement will be that substantial, so I did not feel it necessary > to introduce the template checking method everywhere where we could > possibly have run into a template. > > I ran an ABAB test with the following code repeated enough times to > fill up about 10,000 lines: > > ai = 1; > bi = 2; > ci = 3; > c.x = 4; > (&c)->x = 5; > mytemplateclass::y = 6; > c.template class_func(); > (&c)->template class_func(); > mytemplateclass::template class_func_static(); > c2.class_func(); > (&c2)->class_func(); > myclass::class_func_static(); > ai = 6; > bi = 5; > ci = 4; > c.x = 3; > (&c)->x = 2; > mytemplateclass::y = 1; > c.template class_func(); > (&c)->template class_func(); > mytemplateclass::template class_func_static(); > c2.class_func(); > (&c2)->class_func(); > myclass::class_func_static(); > > It basically just contains a mixture of class access expressions plus > some regular assignments for good measure. The compiler was compiled > without any optimisations (and so slower than usual). In hindsight, > perhaps this test case assumes the worst-ish-case scenario since it > contains a lot of templates; most of the benefits of this patch occur > when parsing non-templates. > > The compiler (clean) and the compiler with the patch applied (patched) > compiled the above code 20 times each in ABAB fashion. On average, the > clean compiler took 2.26295 seconds and the patched compiler took > 2.25145 seconds (about 0.508% faster). Whether this improvement > remains or disappears when the compiler is built with optimisations > turned on I haven't tested. My main concern was just making sure it > didn't get any slower. > > AWKWARD TEST CASES > > You will see from the patch that it required the updating of a few > testcases (as well as one place within the compiler). I believe this > is because up until now, the compiler allowed the omission of the > template keyword in some places it shouldn't have. Take decltype34.C > for example: > > struct impl > { > template static T create(); > }; > > template decltype(impl::create()->impl::create())> > struct tester{}; > > GCC currently accepts this code. My patch causes it to be rejected. > For what it's worth, Clang also rejects this code: > > 1824786093/source.cpp:6:70: error: use 'template' keyword to treat > 'create' as a dependent template name > template decltype(impl::create()->impl::create())> > > ^ template > > I think Clang is correct since from what I understand it should be > written as impl::create()->impl::template create(). Here, > create() is dependent, so it makes sense that we would need > "template" before the scope operator. Then again, I could well be > wrong. The rules around dependent template names are pretty crazy. This is basically core issue 1835, http://wg21.link/cwg1835 This was changed for C++23 by the paper "Declarations and where to find them", http://wg21.link/p1787 Previously, e.g. in C++20, https://timsong-cpp.github.io/cppwp/n4861/basic.lookup.classref#4 said that when we see ->impl, since we can't look up the name in the (dependent) object type, we do unqualified lookup and find ::impl, so create is not in a dependent scope, and the testcase is fine. The 1835 change clarifies that we only do unqualified lookup of the second impl if the object type is non-dependent, so the second create is in a dependent scope, and we need the ::template. But in either case, whether create is in a dependent scope depends on how we resolve impl::, we don't need to remember further back in the expression. So your dependent_expression_p parameter seems like the wrong approach. Note that when we're looking up the name after ->, the type of the object expression is in parser->context->object_type. 1835 also makes the following testcase valid, which we don't currently accept, but clang does: template void f(T t) { t.foo::bar(); } struct foo { void bar(); }; int main() { f(foo()); } For C++20 and down, I want to start accepting this testcase, but also still accept decltype34.C to avoid breaking existing code. But that's a separate issue; I don't think your patch should affect decltype34. The cases you fixed in symbol-summary.h are indeed mistakes, but not ill-formed, so giving an error on them is wrong. For example, here is a well-formed program that is rejected with your patch: template void f(T t) { t.m(0); } struct A { int m; } a; int main() { f(a); } Comparing the result of < by > is very unlikely to be what the user actually meant, but it is potentially valid. So we should accept f with a warning about the dubious expression. People can silence the warning if they really mean this by parenthesizing the LHS of the < operator. Another valid program, using :: int x; template void f(T t) { t.m::x; } struct A { int m; } a; int main() { f(a); } Now to reviewing the actual patch: > + yet). Return 1 if it does look like a template-id. Return 2 > + if not. */ Let's use -1 for definitely not. > + /* Could be a ~ referencing the destructor of a class template. > + Temporarily eat it up so it's not in our way. */ > + if (next_token->type == CPP_COMPL) > + { > + cp_lexer_save_tokens (parser->lexer); > + cp_lexer_consume_token (parser->lexer); > + next_token = cp_lexer_peek_token (parser->lexer); > + saved = true; > + } There's no ambiguity in the case of a destructor, as you note in your comments in cp_parser_id_expression. How about returning early if we see ~? > + /* Find offset of final ">". */ > + for (; depth > 0; i++) > + { > + switch (cp_lexer_peek_nth_token (parser->lexer, i)->type) > + { > + case CPP_LESS: > + depth++; > + break; > + case CPP_GREATER: > + depth--; > + break; > + case CPP_RSHIFT: > + depth-=2; > + break; > + case CPP_EOF: > + case CPP_SEMICOLON: > + goto no; > + default: > + break; > + } > + } This code doesn't handle skipping matched ()/{}/[] in the template-argument-list. You probably want to involve cp_parser_skip_to_end_of_template_parameter_list somehow. > +no: > + if (saved) > + cp_lexer_rollback_tokens (parser->lexer); > + return 2; > + > +yes: > + if (saved) > + cp_lexer_rollback_tokens (parser->lexer); > + return 1; Now that we're writing C++, I'd prefer to avoid this kind of pattern in favor of RAII, such as saved_token_sentinel. If this is still relevant after addressing the above comments. > + If DEPENDENT_EXPRESSION_P is true, this is a dependent id-expression > + that is not the current instantiation. */ As mentioned above, let's not add this parameter. > + error_at (next_token->location, > + "expected \"template\" keyword before dependent template " > + "name"); As mentioned above, this should be a warning. > - /* If it worked, we're done. */ > - if (cp_parser_parse_definitely (parser)) > - return id; > + if (looks_like_template != 2) > + { > + /* We don't know yet whether or not this will be a > + template-id. */ The indentation here is off. > + int looks_like_template) Let's give these parms a default argument of 0. And call them looks_like_template_id to be clearer. > - /* Avoid performing name lookup if there is no possibility of > - finding a template-id. */ > - if ((token->type != CPP_NAME && token->keyword != RID_OPERATOR) > - || (token->type == CPP_NAME > - && !cp_parser_nth_token_starts_template_argument_list_p > - (parser, 2))) > + /* Only if we have not already checked whether this looks like a > + template. */ > + if (looks_like_template == 0) > { > - cp_parser_error (parser, "expected template-id"); > - return error_mark_node; > + /* Avoid performing name lookup if there is no possibility of > + finding a template-id. */ You could add the looks_like_template_id check to the existing condition so we don't need to reindent the comment or body. > +++ b/gcc/symbol-summary.h > @@ -191,7 +191,7 @@ public: > template > void traverse (Arg a) const > { > - m_map.traverse (a); > + m_map.template traverse (a); I've gone ahead and applied this fix as a separate patch. Jason