From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qv1-xf2e.google.com (mail-qv1-xf2e.google.com [IPv6:2607:f8b0:4864:20::f2e]) by sourceware.org (Postfix) with ESMTPS id F24C63858CD1 for ; Fri, 14 Jul 2023 21:14:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F24C63858CD1 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=acm.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-qv1-xf2e.google.com with SMTP id 6a1803df08f44-63c70dc7ed2so6286056d6.0 for ; Fri, 14 Jul 2023 14:14:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1689369279; x=1691961279; h=content-transfer-encoding:in-reply-to:from:references:to :content-language:subject:user-agent:mime-version:date:message-id :sender:from:to:cc:subject:date:message-id:reply-to; bh=f/N5P7ZjlS/X5es6+F7VBXmVWrolhtBDBOKf1LsGY6M=; b=BGNfr6yL//DzMuCJdiy5rBfx/1Ex9ooSnxI5dr0znrBZynsosFIi05y4xDPFWk9KJE KDPq3izJOEA7M6PNJdyGybIz+GIYSieooUFZb7SNhqIn+VpbOki+lAitTBbCh+Iyp/5i cVJdLH4MzfZf1xiSyfm6Ljn6FtFS4AVmCgVJEzoErwSIA6skVpk8oSmbd8s53ycqZYTP DqqimDm4qbC+40MHUfIL0nEekPPjkPD1O0kyLOrk4acomeBqxcIvUVBXZmBBF6u6WFSt zwfTq1HG4m1DfEdbzdcKXQC2o1gXEu6BF6EnDj8qM9XqQEv864FTw2n0Y1smi5RdPZMd bLZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689369279; x=1691961279; h=content-transfer-encoding:in-reply-to:from:references:to :content-language:subject:user-agent:mime-version:date:message-id :sender:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=f/N5P7ZjlS/X5es6+F7VBXmVWrolhtBDBOKf1LsGY6M=; b=ejA3LYjrFTW65Ki+ha/kO+yna/ToMfF3EFub/v359c75ESZ7qLOZv7RV0CtWcsWCYT RCjGNBS240I2UYrHzxwlBsIzs0kIELs3isd5ti2se3Jihd/qBLZ8JcAeGxjSgvQRL5C+ uR2Rz3mURDHUfCr5GrtBH+ppsGCGuV3NbUMljaXZA3kLmhTyEguzJqoR1zdGtqqOkusO kuInHKzWi+V7duSa5oVy3Wkl5hqPyTYSuzTzZhojrwEejFNCT3mlgbw0xP15KAcPCp/A 0ICYwZcoywdI3DjZdesBgBIxCTj9Sk3XdSZfjWISjvinLruw0tmO2GDHhafwmuzw6pMn /7OQ== X-Gm-Message-State: ABy/qLaCp6Yub1+dxt9kS6MLWhSR4JFyfHQ01u8+eS9XSppU4ioXU8R2 3qnUXiEexIEWqYqi5IA5r0qvpa6CbS8= X-Google-Smtp-Source: APBJJlFBAARSWh6tLJBsFEnlBB2iXNlbvg4H834zRyiV9X5cS+jl8aBq71G351BgYsGX5e2yAMqbTQ== X-Received: by 2002:a0c:f013:0:b0:632:35e3:ea36 with SMTP id z19-20020a0cf013000000b0063235e3ea36mr5777753qvk.63.1689369278872; Fri, 14 Jul 2023 14:14:38 -0700 (PDT) Received: from ?IPV6:2601:19c:5282:9160::1? ([2601:19c:5282:9160::1]) by smtp.googlemail.com with ESMTPSA id q17-20020a0ce211000000b006362d4eeb6esm4207473qvl.144.2023.07.14.14.14.37 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 14 Jul 2023 14:14:38 -0700 (PDT) Sender: Nathan Sidwell Message-ID: <798b12e4-8305-d800-86a2-1d0c5b87fc2a@acm.org> Date: Fri, 14 Jul 2023 17:14:31 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0 Subject: Re: [WIP RFC] Add support for keyword-based attributes Content-Language: en-US To: gcc-patches@gcc.gnu.org, joseph@codesourcery.com, polacek@redhat.com, jason@redhat.com, richard.sandiford@arm.com References: From: Nathan Sidwell In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3038.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,GIT_PATCH_0,HEADER_FROM_DIFFERENT_DOMAINS,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 7/14/23 11:56, Richard Sandiford wrote: > Summary: We'd like to be able to specify some attributes using > keywords, rather than the traditional __attribute__ or [[...]] > syntax. Would that be OK? > > In more detail: > > We'd like to add some new target-specific attributes for Arm SME. > These attributes affect semantics and code generation and so they > can't simply be ignored. > > Traditionally we've done this kind of thing by adding GNU attributes, > via TARGET_ATTRIBUTE_TABLE in GCC's case. The problem is that both > GCC and Clang have traditionally only warned about unrecognised GNU > attributes, rather than raising an error. Older compilers might > therefore be able to look past some uses of the new attributes and > still produce object code, even though that object code is almost > certainly going to be wrong. (The compilers will also emit a default-on > warning, but that might go unnoticed when building a big project.) > > There are some existing attributes that similarly affect semantics > in ways that cannot be ignored. vector_size is one obvious example. > But that doesn't make it a good thing. :) > > Also, C++ says this for standard [[...]] attributes: > > For an attribute-token (including an attribute-scoped-token) > not specified in this document, the behavior is implementation-defined; > any such attribute-token that is not recognized by the implementation > is ignored. > > which doubles down on the idea that attributes should not be used > for necessary semantic information. There;s been quite a bit of discussion about the practicalities of that. As you say, there are existing, std-specified attributes, [[no_unique_address]] for instance, that affect user-visible object layout if ignored. Further, my understanding is that implementation-specific attributes are permitted to affect program semantics -- they're implementatin extensions. IMHO, attributes are the accepted mechanism for what you're describing. Compilers already have a way of dealing with them -- both parsing and, in general, representing them. I would be wary of inventing a different mechanism. Have you approached C or C++ std bodies for input? > > One of the attributes we'd like to add provides a new way of compiling > existing code. The attribute doesn't require SME to be available; > it just says that the code must be compiled so that it can run in either > of two modes. This is probably the most dangerous attribute of the set, > since compilers that ignore it would just produce normal code. That > code might work in some test scenarios, but it would fail in others. > > The feeling from the Clang community was therefore that these SME > attributes should use keywords instead, so that the keywords trigger > an error with older compilers. > > However, it seemed wrong to define new SME-specific grammar rules, > since the underlying problem is pretty generic. We therefore > proposed having a type of keyword that can appear exactly where > a standard [[...]] attribute can appear and that appertains to > exactly what a standard [[...]] attribute would appertain to. > No divergence or cherry-picking is allowed. > > For example: > > [[arm::foo]] > > would become: > > __arm_foo > > and: > > [[arm::bar(args)]] > > would become: > > __arm_bar(args) > > It wouldn't be possible to retrofit arguments to a keyword that > previously didn't take arguments, since that could lead to parsing > ambiguities. So when a keyword is first added, a binding decision > would need to be made whether the keyword always takes arguments > or is always standalone. > > For that reason, empty argument lists are allowed for keywords, > even though they're not allowed for [[...]] attributes. > > The argument-less version was accepted into Clang, and I have a follow-on > patch for handling arguments. Would the same thing be OK for GCC, > in both the C and C++ frontends? > > The patch below is a proof of concept for the C frontend. It doesn't > bootstrap due to warnings about uninitialised fields. And it doesn't > have tests. But I did test it locally with various combinations of > attribute_spec and it seemed to work as expected. > > The impact on the C frontend seems to be pretty small. It looks like > the impact on the C++ frontend would be a bit bigger, but not much. > > The patch contains a logically unrelated change: c-common.h set aside > 16 keywords for address spaces, but of the in-tree ports, the maximum > number of keywords used is 6 (for amdgcn). The patch therefore changes > the limit to 8 and uses 8 keywords for the new attributes. This keeps > the number of reserved ids <= 256. > > A real, non-proof-of-concept patch series would: > > - Change the address-space keywords separately, and deal with any fallout. > > - Clean up the way that attributes are specified, so that it isn't > necessary to update all definitions when adding a new field. > > - Allow more precise attribute requirements, such as "function decl only". > > - Add tests :) > > WDYT? Does this approach look OK in principle, or is it a non-starter? > > If it is a non-starter, the fallback would be to predefine macros > that expand to [[...]] or __attribute__. Having the keywords gives > more precise semantics and better error messages though. > > Thanks, > Richard > --- > gcc/attribs.cc | 30 +++++++++++- > gcc/c-family/c-common.h | 13 ++---- > gcc/c/c-parser.cc | 88 +++++++++++++++++++++++++++++++++-- > gcc/config/aarch64/aarch64.cc | 1 + > gcc/tree-core.h | 19 ++++++++ > 5 files changed, 135 insertions(+), 16 deletions(-) > > diff --git a/gcc/attribs.cc b/gcc/attribs.cc > index b8cb55b97df..706cd81329c 100644 > --- a/gcc/attribs.cc > +++ b/gcc/attribs.cc > @@ -752,6 +752,11 @@ decl_attributes (tree *node, tree attributes, int flags, > > if (spec->decl_required && !DECL_P (*anode)) > { > + if (spec->is_keyword) > + { > + error ("%qE does not apply to types", name); > + continue; > + } > if (flags & ((int) ATTR_FLAG_DECL_NEXT > | (int) ATTR_FLAG_FUNCTION_NEXT > | (int) ATTR_FLAG_ARRAY_NEXT)) > @@ -775,6 +780,11 @@ decl_attributes (tree *node, tree attributes, int flags, > the decl's type in place here. */ > if (spec->type_required && DECL_P (*anode)) > { > + if (spec->is_keyword) > + { > + error ("%qE only applies to types", name); > + continue; > + } > anode = &TREE_TYPE (*anode); > flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE; > } > @@ -782,6 +792,11 @@ decl_attributes (tree *node, tree attributes, int flags, > if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE > && TREE_CODE (*anode) != METHOD_TYPE) > { > + if (spec->is_keyword) > + { > + error ("%qE only applies to function types", name); > + continue; > + } > if (TREE_CODE (*anode) == POINTER_TYPE > && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode))) > { > @@ -821,7 +836,12 @@ decl_attributes (tree *node, tree attributes, int flags, > && (flags & (int) ATTR_FLAG_TYPE_IN_PLACE) > && COMPLETE_TYPE_P (*anode)) > { > - warning (OPT_Wattributes, "type attributes ignored after type is already defined"); > + if (spec->is_keyword) > + error ("cannot apply %qE to a type that has already been" > + " defined", name); > + else > + warning (OPT_Wattributes, "type attributes ignored after type" > + " is already defined"); > continue; > } > > @@ -895,7 +915,13 @@ decl_attributes (tree *node, tree attributes, int flags, > *anode = cur_and_last_decl[0]; > if (ret == error_mark_node) > { > - warning (OPT_Wattributes, "%qE attribute ignored", name); > + if (spec->is_keyword) > + /* This error is only a last resort, Hopefully the > + handler will report a better error in most cases, > + return NULL_TREE, and set no_add_attrs. */ > + error ("invalid %qE attribute", name); > + else > + warning (OPT_Wattributes, "%qE attribute ignored", name); > no_add_attrs = true; > } > else > diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h > index b5ef5ff6b2c..4443ccb874d 100644 > --- a/gcc/c-family/c-common.h > +++ b/gcc/c-family/c-common.h > @@ -222,17 +222,9 @@ enum rid > RID_ADDR_SPACE_5, > RID_ADDR_SPACE_6, > RID_ADDR_SPACE_7, > - RID_ADDR_SPACE_8, > - RID_ADDR_SPACE_9, > - RID_ADDR_SPACE_10, > - RID_ADDR_SPACE_11, > - RID_ADDR_SPACE_12, > - RID_ADDR_SPACE_13, > - RID_ADDR_SPACE_14, > - RID_ADDR_SPACE_15, > > RID_FIRST_ADDR_SPACE = RID_ADDR_SPACE_0, > - RID_LAST_ADDR_SPACE = RID_ADDR_SPACE_15, > + RID_LAST_ADDR_SPACE = RID_ADDR_SPACE_7, > > /* __intN keywords. The _N_M here doesn't correspond to the intN > in the keyword; use the bitsize in int_n_t_data_t[M] for that. > @@ -251,6 +243,9 @@ enum rid > RID_FIRST_INT_N = RID_INT_N_0, > RID_LAST_INT_N = RID_INT_N_3, > > + RID_FIRST_KEYWORD_ATTR, > + RID_LAST_KEYWORD_ATTR = RID_FIRST_KEYWORD_ATTR + 7, > + > RID_MAX, > > RID_FIRST_MODIFIER = RID_STATIC, > diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc > index 24a6eb6e459..b02b082b140 100644 > --- a/gcc/c/c-parser.cc > +++ b/gcc/c/c-parser.cc > @@ -104,6 +104,17 @@ set_c_expr_source_range (c_expr *expr, > } > > > +/* Register keyword attribute AS with reserved identifier code RID. */ > + > +void > +c_register_keyword_attribute (const attribute_spec *as, int rid) > +{ > + tree id = get_identifier (as->name); > + C_SET_RID_CODE (id, rid); > + C_IS_RESERVED_WORD (id) = 1; > + ridpointers [rid] = id; > +} > + > /* Initialization routine for this file. */ > > void > @@ -180,6 +191,16 @@ c_parse_init (void) > C_IS_RESERVED_WORD (id) = 1; > ridpointers [RID_OMP_ALL_MEMORY] = id; > } > + > + int rid = RID_FIRST_KEYWORD_ATTR; > + if (const attribute_spec *attrs = targetm.attribute_table) > + for (const attribute_spec *attr = attrs; attr->name; ++attr) > + if (attr->is_keyword) > + { > + gcc_assert (rid <= RID_LAST_KEYWORD_ATTR); > + c_register_keyword_attribute (attr, rid); > + rid += 1; > + } > } > > /* A parser structure recording information about the state and > @@ -4951,7 +4972,8 @@ c_parser_gnu_attribute_any_word (c_parser *parser) > { > /* ??? See comment above about what keywords are accepted here. */ > bool ok; > - switch (c_parser_peek_token (parser)->keyword) > + int rid = c_parser_peek_token (parser)->keyword; > + switch (rid) > { > case RID_STATIC: > case RID_UNSIGNED: > @@ -4994,7 +5016,9 @@ c_parser_gnu_attribute_any_word (c_parser *parser) > ok = true; > break; > default: > - ok = false; > + /* Accept these now so that we can reject them with a specific > + error later. */ > + ok = (rid >= RID_FIRST_KEYWORD_ATTR && rid <= RID_LAST_KEYWORD_ATTR); > break; > } > if (!ok) > @@ -5156,6 +5180,10 @@ c_parser_gnu_attribute (c_parser *parser, tree attrs, > return NULL_TREE; > > attr_name = canonicalize_attr_name (attr_name); > + const attribute_spec *as = lookup_attribute_spec (attr_name); > + if (as && as->is_keyword) > + error ("%qE cannot be used in %<__attribute__%> constructs", attr_name); > + > c_parser_consume_token (parser); > > tree attr; > @@ -5330,6 +5358,42 @@ c_parser_balanced_token_sequence (c_parser *parser) > } > } > > +/* Parse a keyword attribute. This is simply: > + > + keyword > + > + if the attribute never takes arguments, otherwise it is: > + > + keyword ( balanced-token-sequence[opt] ) > +*/ > + > +static tree > +c_parser_keyword_attribute (c_parser *parser) > +{ > + c_token *token = c_parser_peek_token (parser); > + gcc_assert (token->type == CPP_KEYWORD); > + tree name = canonicalize_attr_name (token->value); > + c_parser_consume_token (parser); > + > + tree attribute = build_tree_list (name, NULL_TREE); > + const attribute_spec *as = lookup_attribute_spec (name); > + gcc_assert (as && as->is_keyword); > + if (as->max_length > 0) > + { > + matching_parens parens; > + if (!parens.require_open (parser)) > + return error_mark_node; > + /* Allow zero-length arguments regardless of as->min_length, since > + the usual error "parentheses must be omitted if attribute argument > + list is empty" suggests an invalid change. We'll reject incorrect > + argument lists later. */ > + TREE_VALUE (attribute) > + = c_parser_attribute_arguments (parser, false, false, false, true); > + parens.require_close (parser); > + } > + return attribute; > +} > + > /* Parse standard (C2X) attributes (including GNU attributes in the > gnu:: namespace). > > @@ -5396,8 +5460,11 @@ c_parser_std_attribute (c_parser *parser, bool for_tm) > ns = NULL_TREE; > attribute = build_tree_list (build_tree_list (ns, name), NULL_TREE); > > - /* Parse the arguments, if any. */ > const attribute_spec *as = lookup_attribute_spec (TREE_PURPOSE (attribute)); > + if (as && as->is_keyword) > + error ("%qE cannot be used in %<[[...]]%> constructs", name); > + > + /* Parse the arguments, if any. */ > if (c_parser_next_token_is_not (parser, CPP_OPEN_PAREN)) > goto out; > { > @@ -5456,7 +5523,13 @@ c_parser_std_attribute (c_parser *parser, bool for_tm) > static tree > c_parser_std_attribute_specifier (c_parser *parser, bool for_tm) > { > - location_t loc = c_parser_peek_token (parser)->location; > + auto first_token = c_parser_peek_token (parser); > + location_t loc = first_token->location; > + if (first_token->type == CPP_KEYWORD > + && first_token->keyword >= RID_FIRST_KEYWORD_ATTR > + && first_token->keyword <= RID_LAST_KEYWORD_ATTR) > + return c_parser_keyword_attribute (parser); > + > if (!c_parser_require (parser, CPP_OPEN_SQUARE, "expected %<[%>")) > return NULL_TREE; > if (!c_parser_require (parser, CPP_OPEN_SQUARE, "expected %<[%>")) > @@ -5571,7 +5644,12 @@ c_parser_check_balanced_raw_token_sequence (c_parser *parser, unsigned int *n) > static bool > c_parser_nth_token_starts_std_attributes (c_parser *parser, unsigned int n) > { > - if (!(c_parser_peek_nth_token (parser, n)->type == CPP_OPEN_SQUARE > + auto token_n = c_parser_peek_nth_token (parser, n); > + if (token_n->type == CPP_KEYWORD > + && token_n->keyword >= RID_FIRST_KEYWORD_ATTR > + && token_n->keyword <= RID_LAST_KEYWORD_ATTR) > + return true; > + if (!(token_n->type == CPP_OPEN_SQUARE > && c_parser_peek_nth_token (parser, n + 1)->type == CPP_OPEN_SQUARE)) > return false; > /* In C, '[[' must start attributes. In Objective-C, we need to > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index 560e5431636..50698439104 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -2802,6 +2802,7 @@ static const struct attribute_spec aarch64_attribute_table[] = > { "arm_sve_vector_bits", 1, 1, false, true, false, true, > aarch64_sve::handle_arm_sve_vector_bits_attribute, > NULL }, > + { "__arm_streaming", 0, 0, false, true, true, true, NULL, NULL, true }, > { "Advanced SIMD type", 1, 1, false, true, false, true, NULL, NULL }, > { "SVE type", 3, 3, false, true, false, true, NULL, NULL }, > { "SVE sizeless type", 0, 0, false, true, false, true, NULL, NULL }, > diff --git a/gcc/tree-core.h b/gcc/tree-core.h > index 668808a29d0..e6700509b9e 100644 > --- a/gcc/tree-core.h > +++ b/gcc/tree-core.h > @@ -2167,6 +2167,25 @@ struct attribute_spec { > /* An array of attribute exclusions describing names of other attributes > that this attribute is mutually exclusive with. */ > const exclusions *exclude; > + > + /* Whether the attribute is a C/C++ "regular keyword attribute". > + When true for an attribute "foo", this means that: > + > + - The keyword foo can appear exactly where the standard attribute syntax > + [[...]] can appear, but without the same minimum language requirements. > + The link is intended to be automatic: there should be no exceptions. > + > + - The attribute appertains to whatever a standard attribute in the > + same location would appertain to. There is no "sliding" from decls > + to types, as sometimes happens for GNU-style attributes. > + > + - When MAX_LENGTH > 0, the keyword is followed by an argument list. > + This argument list is parsed in the same way as arguments to [[...]] > + attributes, except that the list can be empty if MIN_LENGTH == 0. > + > + - In C and C++, the attribute cannot appear in __attribute__ or [[...]]; > + it can only appear as a simple keyword. */ > + bool is_keyword; > }; > > /* These functions allow a front-end to perform a manual layout of a -- Nathan Sidwell