Date: Mon, 13 Nov 2023 18:57:02 -0500
From: Marek Polacek
To: iain@sandoe.co.uk
Cc: gcc-patches@gcc.gnu.org, joseph@codesourcery.com, jason@redhat.com
Subject: Re: [PATCH 2/4] c-family, C: handle clang attributes [PR109877].
References: <20231113060244.90554-1-iain@sandoe.co.uk> <20231113060244.90554-2-iain@sandoe.co.uk> <20231113060244.90554-3-iain@sandoe.co.uk>
In-Reply-To: <20231113060244.90554-3-iain@sandoe.co.uk>

On Sun, Nov 12, 2023 at 08:02:42PM -1000, Iain Sandoe wrote:
> This adds the ability to defer the validation of numeric attribute
> arguments until the sequence is parsed if the attribute being
> handled is one known to be 'clang form'.
>
> We do this by considering the arguments to be strings regardless
> of content and defer the interpretation of those strings until the
> argument processing.

I don't see any tests here nor in the C++ part of the patch. Is it possible
to add some (I suppose for now only attribute availability)?

FWIW, for chaining attributes it's best to use attr_chainon since that
handles error_mark_node. Unfortunately that's currently only in cp/.

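The point about attr_chainon can be illustrated with a small stand-alone model. This is only a sketch: `Node`, `error_mark`, `chain_on` and `attr_chain_on` are hypothetical stand-ins, not GCC's `tree`, `error_mark_node`, `chainon` or `attr_chainon`, and the behaviour shown (return the error sentinel whenever either operand is the sentinel) is an assumption about what "handles error_mark_node" means in practice.

```cpp
#include <cassert>

// Hypothetical stand-in for a TREE_LIST node; not GCC's real `tree`.
struct Node {
  const char *name;
  Node *next;
};

// Hypothetical stand-in for error_mark_node: one unique sentinel object.
static Node error_mark_storage = {"<error>", nullptr};
static Node *const error_mark = &error_mark_storage;

// Models a plain chaining helper: append `tail` to the end of `head`.
// It has no idea that one of the operands might be the error sentinel.
static Node *chain_on(Node *head, Node *tail) {
  if (!head)
    return tail;
  if (!tail)
    return head;
  Node *last = head;
  while (last->next)
    last = last->next;
  assert(last != tail && "would create a circular chain");
  last->next = tail;
  return head;
}

// Models an error-aware variant (assumed behaviour): if either side is
// the sentinel, propagate the sentinel instead of linking it into a list.
static Node *attr_chain_on(Node *attrs, Node *attr) {
  if (attrs == error_mark || attr == error_mark)
    return error_mark;
  return chain_on(attrs, attr);
}
```

With the plain helper, a failed parse that returned the sentinel would be appended to (or become the head of) a live list; the error-aware variant keeps the sentinel out of the chain and lets the caller see the failure.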
>     PR c++/109877
>
> gcc/c-family/ChangeLog:
>
>     * c-lex.cc (c_lex_with_flags): Allow for the case where
>     we wish to defer interpretation of numeric values until
>     parse time.
>     * c-pragma.h (C_LEX_NUMBER_AS_STRING): New.
>
> gcc/c/ChangeLog:
>
>     * c-parser.cc (struct c_parser): Provide a flag to notify
>     that argument parsing should return attribute arguments
>     as string constants.
>     (c_lex_one_token): Act to defer numeric value validation.
>     (c_parser_clang_attribute_arguments): New.
>     (c_parser_gnu_attribute): Allow for clang-form GNU-style
>     attributes.
>
> Signed-off-by: Iain Sandoe
> ---
>  gcc/c-family/c-lex.cc   |  15 ++++++
>  gcc/c-family/c-pragma.h |   3 ++
>  gcc/c/c-parser.cc       | 109 ++++++++++++++++++++++++++++++++++++++--
>  3 files changed, 122 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
> index 06c2453c89a..d535f5b460c 100644
> --- a/gcc/c-family/c-lex.cc
> +++ b/gcc/c-family/c-lex.cc
> @@ -533,6 +533,21 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
>
>      case CPP_NUMBER:
>        {
> +        /* If the user wants number-like entities to be returned as a raw
> +           string, then don't try to classify them, which emits unwanted
> +           diagnostics. */
> +        if (lex_flags & C_LEX_NUMBER_AS_STRING)
> +          {
> +            /* build_string adds a trailing NUL at [len]. */
> +            tree num_string = build_string (tok->val.str.len + 1,
> +                                            (const char *) tok->val.str.text);
> +            TREE_TYPE (num_string) = char_array_type_node;
> +            *value = num_string;
> +            /* We will effectively note this as CPP_N_INVALID, because we
> +               made no checks here. */
> +            break;
> +          }
> +
>          const char *suffix = NULL;
>          unsigned int flags = cpp_classify_number (parse_in, tok, &suffix, *loc);
>
> diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
> index 98177913053..11cde74f9f0 100644
> --- a/gcc/c-family/c-pragma.h
> +++ b/gcc/c-family/c-pragma.h
> @@ -276,6 +276,9 @@ extern void pragma_lex_discard_to_eol ();
>  #define C_LEX_STRING_NO_JOIN 2 /* Do not concatenate strings
>                                    nor translate them into execution
>                                    character set. */
> +#define C_LEX_NUMBER_AS_STRING 4 /* Do not classify a number, but
> +                                    instead return it as a raw
> +                                    string. */
>
>  /* This is not actually available to pragma parsers. It's merely a
>     convenient location to declare this function for c-lex, after
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index 703f9570dbc..aaaa16cc05d 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -217,6 +217,9 @@ struct GTY(()) c_parser {
>       should translate them to the execution character set (false
>       inside attributes). */
>    BOOL_BITFIELD translate_strings_p : 1;
> +  /* True if we want to lex arbitrary number-like sequences as their
> +     string representation. */
> +  BOOL_BITFIELD lex_number_as_string : 1;
>
>    /* Objective-C specific parser/lexer information. */
>
> @@ -308,10 +311,10 @@ c_lex_one_token (c_parser *parser, c_token *token, bool raw = false)
>
>    if (raw || vec_safe_length (parser->raw_tokens) == 0)
>      {
> +      int lex_flags = parser->lex_joined_string ? 0 : C_LEX_STRING_NO_JOIN;
> +      lex_flags |= parser->lex_number_as_string ? C_LEX_NUMBER_AS_STRING : 0;
>        token->type = c_lex_with_flags (&token->value, &token->location,
> -                                      &token->flags,
> -                                      (parser->lex_joined_string
> -                                       ? 0 : C_LEX_STRING_NO_JOIN));
> +                                      &token->flags, lex_flags);
>        token->id_kind = C_ID_NONE;
>        token->keyword = RID_MAX;
>        token->pragma_kind = PRAGMA_NONE;
> @@ -5210,6 +5213,98 @@ c_parser_gnu_attribute_any_word (c_parser *parser)
>    return attr_name;
>  }
>
> +/* Handle parsing clang-form attribute arguments, where we need to adjust
> +   the parsing rules to relate to a specific attribute. */
> +
> +static tree
> +c_parser_clang_attribute_arguments (c_parser *parser, tree /*attr_id*/)

Why the second parameter if you don't use it?

> +{
> +  /* We can, if required, alter the parsing on the basis of the attribute.
> +     At present, we handle the availability attr, where ach entry can be :

"each"

> +     identifier
> +     identifier=N.MM.Z
> +     identifier="string"
> +     followed by ',' or ) for the last entry*/

". */"

> +
> +  tree attr_args = NULL_TREE;
> +  if (c_parser_next_token_is (parser, CPP_NAME)
> +      && c_parser_peek_token (parser)->id_kind == C_ID_ID
> +      && c_parser_peek_2nd_token (parser)->type == CPP_COMMA)
> +    {
> +      tree platf = c_parser_peek_token (parser)->value;
> +      c_parser_consume_token (parser);
> +      attr_args = tree_cons (NULL_TREE, platf, NULL_TREE);
> +    }
> +  else
> +    {
> +      c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +                                 "expected a platform name followed by %<,%>");
> +      return error_mark_node;
> +    }
> +  c_parser_consume_token (parser); /* consume the ',' */
> +  do
> +    {
> +      tree name = NULL_TREE;
> +      tree value = NULL_TREE;
> +
> +      if (c_parser_next_token_is (parser, CPP_NAME)
> +          && c_parser_peek_token (parser)->id_kind == C_ID_ID)
> +        {
> +          name = c_parser_peek_token (parser)->value;
> +          c_parser_consume_token (parser);
> +        }
> +      else
> +        {
> +          c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +                                     "expected an attribute keyword");
> +          return error_mark_node;
> +        }
> +      if (c_parser_next_token_is (parser, CPP_EQ))
> +        {
> +          c_parser_consume_token (parser); /* eat the '=' */
> +          /* We need to bludgeon the lexer into not trying to interpret the
> +             xx.yy.zz form, since that just looks like a malformed float.
> +             Also, as a result of macro processing, we can have strig literals

"string"

> +             that are in multiple pieces so, for this specific part of the
> +             parse, we need to join strings. */
> +          bool saved_join_state = parser->lex_joined_string;
> +          parser->lex_number_as_string = 1;
> +          parser->lex_joined_string = 1;
> +          /* So look at the next token, expecting a string, or something that
> +             looks initially like a number, but might be a version number. */
> +          c_parser_peek_token (parser);
> +          /* Done with the funky number parsing. */
> +          parser->lex_number_as_string = 0;
> +          parser->lex_joined_string = saved_join_state;
> +          if (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN)
> +              && c_parser_next_token_is_not (parser, CPP_COMMA))
> +            {
> +              value = c_parser_peek_token (parser)->value;
> +              /* ???: check for error mark and early-return? */

It might be useful to have a test for this invalid case.

> +              c_parser_consume_token (parser);
> +            }
> +          else
> +            {
> +              c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +                                         "expected a value");
> +              return error_mark_node;
> +            }
> +        }
> +      else if (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN)
> +               && c_parser_next_token_is_not (parser, CPP_COMMA))
> +        {
> +          c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +                                     "expected %<,%> or %<=%>");
> +          return error_mark_node;
> +        }
> +      if (c_parser_next_token_is (parser, CPP_COMMA))
> +        c_parser_consume_token (parser); /* Just skip the comma. */
> +      tree t = tree_cons (value, name, NULL);
> +      chainon (attr_args, t);
> +    } while (c_parser_next_token_is_not (parser, CPP_CLOSE_PAREN));
> +  return attr_args;
> +}
> +
>  /* Parse attribute arguments. This is a common form of syntax
>     covering all currently valid GNU and standard attributes.
>
> @@ -5375,9 +5470,13 @@ c_parser_gnu_attribute (c_parser *parser, tree attrs,
>        attrs = chainon (attrs, attr);
>        return attrs;
>      }
> -  c_parser_consume_token (parser);
> +  c_parser_consume_token (parser); /* The '('. */
>
> -  tree attr_args
> +  tree attr_args;
> +  if (attribute_clang_form_p (attr_name))
> +    attr_args = c_parser_clang_attribute_arguments (parser, attr_name);
> +  else
> +    attr_args
>      = c_parser_attribute_arguments (parser,
>                                      attribute_takes_identifier_p (attr_name),
>                                      false,
> --
> 2.39.2 (Apple Git-143)
>

Marek
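
As an illustration of the "defer the interpretation" idea from the commit message: once the lexer hands over a spelling such as 10.12.1 as a raw string, some later step has to interpret it. The quoted patch does not show that step, so the following is an assumed, stand-alone sketch. `Version` and `parse_version` are hypothetical names, and the accepted forms (one to three dot-separated decimal components) are an assumption based on the N.MM.Z note in the comment above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type; nothing like this appears in the patch.
struct Version {
  unsigned major_num;
  unsigned minor_num;
  unsigned patch_num;
};

// Interpret a raw "N", "N.MM" or "N.MM.Z" spelling that was kept as a
// string instead of being classified as a number.  Returns false (and
// leaves *out untouched) for any other spelling.
static bool parse_version(const std::string &raw, Version *out) {
  unsigned parts[3] = {0, 0, 0};
  std::size_t count = 0;
  std::size_t i = 0;
  while (i < raw.size()) {
    if (count == 3)
      return false;  // a fourth component, e.g. "1.2.3.4"
    if (!std::isdigit(static_cast<unsigned char>(raw[i])))
      return false;  // empty or non-numeric component
    unsigned value = 0;
    while (i < raw.size() && std::isdigit(static_cast<unsigned char>(raw[i]))) {
      value = value * 10 + static_cast<unsigned>(raw[i] - '0');
      ++i;
    }
    parts[count++] = value;
    if (i == raw.size())
      break;
    if (raw[i] != '.')
      return false;  // e.g. the 'e' in "1e5"
    ++i;
    if (i == raw.size())
      return false;  // trailing '.', e.g. "10."
  }
  if (count == 0)
    return false;  // empty string
  assert(count <= 3);
  out->major_num = parts[0];
  out->minor_num = parts[1];
  out->patch_num = parts[2];
  return true;
}
```

Keeping the spelling as a string until this point means a malformed value can be diagnosed in terms of the attribute ("invalid version") rather than as a generic malformed floating constant. Overflow of very long components is deliberately not handled in this sketch.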