From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] Handle doc comment strings in lexer and parser
To: gcc-rust@gcc.gnu.org
References: <20210711201018.389798-1-mark@klomp.org>
From: Philip Herron
Date: Mon, 12 Jul 2021 09:09:09 +0100
In-Reply-To: <20210711201018.389798-1-mark@klomp.org>
List-Id: gcc-rust mailing list
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="hx5VOJ4Ra8xw1voEZu6XEefBqyer2IGkN"

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--hx5VOJ4Ra8xw1voEZu6XEefBqyer2IGkN
Content-Type: multipart/mixed; boundary="8cTF6R5AqY2EqNaCvIb8gG8BYr0t4Q7z4"; protected-headers="v1"

--8cTF6R5AqY2EqNaCvIb8gG8BYr0t4Q7z4
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Content-Language: en-US

On 11/07/2021 21:10, Mark Wielaard wrote:
> Remove (unused) comment related tokens and replace them with
> INNER_DOC_COMMENT and OUTER_DOC_COMMENT tokens, which keep the comment
> text as a string. These can be constructed with the new
> make_inner_doc_comment and make_outer_doc_comment methods.
>
> Make sure to not confuse doc strings with normal comments in the lexer
> when detecting shebang lines. Both single line //! and /*! */ blocks
> are turned into INNER_DOC_COMMENT tokens. And both single line /// and
> /** */ blocks are turned into OUTER_DOC_COMMENT tokens.
>
> Also fixes some issues with cr/lf line endings and keeping the line
> map correct when seeing \n in a comment.
>
> In the parser handle INNER_DOC_COMMENT and OUTER_DOC_COMMENT tokens
> where inner (#![]) and outer (#[]) attributes are handled. Add a method
> parse_doc_comment which turns the tokens into a "doc" Attribute with
> the string as literal expression.
>
> Add get_locus method to Attribute class for better error reporting.
>
> Tests are added for correctly placed and formatted doc strings, with
> or without cr/lf line endings, and for incorrectly formatted (isolated
> CRs) doc strings and badly placed inner doc strings. There are no tests
> for handling of the actual doc attributes yet. These could be tested
> once we add support for the #![warn(missing_docs)] attribute.
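The classification rules described in the commit message can be sketched as a small stand-alone function. This is only an illustration: `CommentKind`, `starts_with`, `classify_comment` and `check_examples` are hypothetical names that do not appear in the patch, and the real lexer inspects a character stream through `peek_input` rather than a `std::string_view`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical names for illustration only; the patch itself either skips
// the comment or emits an INNER_DOC_COMMENT / OUTER_DOC_COMMENT token.
enum class CommentKind { NotAComment, Plain, InnerDoc, OuterDoc };

// Returns true if `s` begins with `prefix`.
static bool starts_with (std::string_view s, std::string_view prefix)
{
  return s.substr (0, prefix.size ()) == prefix;
}

// Classify the comment that begins at the start of `s`.
CommentKind classify_comment (std::string_view s)
{
  if (starts_with (s, "//"))
    {
      if (starts_with (s, "//!"))
        return CommentKind::InnerDoc;
      // Exactly three slashes open an outer doc; four or more are plain.
      if (starts_with (s, "///") && !starts_with (s, "////"))
        return CommentKind::OuterDoc;
      return CommentKind::Plain;
    }
  if (starts_with (s, "/*"))
    {
      if (starts_with (s, "/*!"))
        return CommentKind::InnerDoc;
      // The empty blocks /**/ and /***/ are plain comments.
      if (starts_with (s, "/**/") || starts_with (s, "/***/"))
        return CommentKind::Plain;
      // Exactly two stars open an outer doc; three or more are plain.
      if (starts_with (s, "/**") && !starts_with (s, "/***"))
        return CommentKind::OuterDoc;
      return CommentKind::Plain;
    }
  return CommentKind::NotAComment;
}

// A few of the cases exercised by all_doc_comment_line_blocks.rs.
void check_examples ()
{
  assert (classify_comment ("//!! inner line doc!") == CommentKind::InnerDoc);
  assert (classify_comment ("//// line comment") == CommentKind::Plain);
  assert (classify_comment ("/**/") == CommentKind::Plain);
}
```

The empty-block cases are checked before the two-star rule because `/**/` would otherwise look like the start of an outer block doc.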
> ---
>  gcc/rust/ast/rust-ast.h                       |   2 +
>  gcc/rust/lex/rust-lex.cc                      | 214 ++++++++++++++++--
>  gcc/rust/lex/rust-token.h                     |  25 +-
>  gcc/rust/parse/rust-parse-impl.h              |  60 ++++-
>  gcc/rust/parse/rust-parse.h                   |   1 +
>  gcc/testsuite/rust/compile/bad_inner_doc.rs   |  15 ++
>  .../compile/doc_isolated_cr_block_comment.rs  |   3 +
>  .../doc_isolated_cr_inner_block_comment.rs    |   5 +
>  .../doc_isolated_cr_inner_line_comment.rs     |   5 +
>  .../compile/doc_isolated_cr_line_comment.rs   |   3 +
>  .../torture/all_doc_comment_line_blocks.rs    |  47 ++++
>  .../all_doc_comment_line_blocks_crlf.rs       |  47 ++++
>  .../torture/isolated_cr_block_comment.rs      |   2 +
>  .../torture/isolated_cr_line_comment.rs       |   2 +
>  14 files changed, 401 insertions(+), 30 deletions(-)
>  create mode 100644 gcc/testsuite/rust/compile/bad_inner_doc.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
>
> diff --git a/gcc/rust/ast/rust-ast.h b/gcc/rust/ast/rust-ast.h
> index 75b08f8aa66..3e3e185b9b5 100644
> --- a/gcc/rust/ast/rust-ast.h
> +++ b/gcc/rust/ast/rust-ast.h
> @@ -455,6 +455,8 @@ public:
>    // Returns whether the attribute is considered an "empty" attribute.
>    bool is_empty () const { return attr_input == nullptr && path.is_empty (); }
>
> +  Location get_locus () const { return locus; }
> +
>    /* e.g.:
>        #![crate_type = "lib"]
>        #[test]
> diff --git a/gcc/rust/lex/rust-lex.cc
b/gcc/rust/lex/rust-lex.cc
> index 617dd69a080..0b8a8eae651 100644
> --- a/gcc/rust/lex/rust-lex.cc
> +++ b/gcc/rust/lex/rust-lex.cc
> @@ -265,9 +265,16 @@ Lexer::build_token ()
>        int next_char = peek_input (n);
>        if (is_whitespace (next_char))
>          n++;
> -      else if (next_char == '/' && peek_input (n + 1) == '/')
> +      else if ((next_char == '/' && peek_input (n + 1) == '/'
> +                && peek_input (n + 2) != '!'
> +                && peek_input (n + 2) != '/')
> +               || (next_char == '/' && peek_input (n + 1) == '/'
> +                   && peek_input (n + 2) == '/'
> +                   && peek_input (n + 3) == '/'))
>          {
> +          // two // or four ////
>            // A single line comment
> +          // (but not an inner or outer doc comment)
>            n += 2;
>            next_char = peek_input (n);
>            while (next_char != '\n' && next_char != EOF)
> @@ -278,9 +285,30 @@ Lexer::build_token ()
>            if (next_char == '\n')
>              n++;
>          }
> -      else if (next_char == '/' && peek_input (n + 1) == '*')
> +      else if (next_char == '/' && peek_input (n + 1) == '*'
> +               && peek_input (n + 2) == '*'
> +               && peek_input (n + 3) == '/')
>          {
> +          /**/
> +          n += 4;
> +        }
> +      else if (next_char == '/' && peek_input (n + 1) == '*'
> +               && peek_input (n + 2) == '*' && peek_input (n + 3) == '*'
> +               && peek_input (n + 4) == '/')
> +        {
> +          /***/
> +          n += 5;
> +        }
> +      else if ((next_char == '/' && peek_input (n + 1) == '*'
> +                && peek_input (n + 2) != '*'
> +                && peek_input (n + 2) != '!')
> +               || (next_char == '/' && peek_input (n + 1) == '*'
> +                   && peek_input (n + 2) == '*'
> +                   && peek_input (n + 3) == '*'))
> +        {
> +          // one /* or three /***
>            // Start of a block comment
> +          // (but not an inner or outer doc comment)
>            n += 2;
>            int level = 1;
>            while (level > 0)
> @@ -339,6 +367,9 @@ Lexer::build_token ()
>            // tell line_table that new line starts
>            line_map->start_line (current_line, max_column_hint);
>            continue;
> +        case '\r': // cr
> +          // Ignore, we expect a newline (lf)
soon.
> +          continue;
>          case ' ': // space
>            current_column++;
>            continue;
> @@ -445,11 +476,14 @@ Lexer::build_token ()
>
>                return Token::make (DIV_EQ, loc);
>              }
> -          else if (peek_input () == '/')
> +          else if ((peek_input () == '/' && peek_input (1) != '!'
> +                    && peek_input (1) != '/')
> +                   || (peek_input () == '/' && peek_input (1) == '/'
> +                       && peek_input (2) == '/'))
>              {
> -              // TODO: single-line doc comments
> -
> +              // two // or four ////
>                // single line comment
> +              // (but not an inner or outer doc comment)
>                skip_input ();
>                current_column += 2;
>
> @@ -461,23 +495,85 @@ Lexer::build_token ()
>                    current_char = peek_input ();
>                  }
>                continue;
> -              break;
>              }
> -          else if (peek_input () == '*')
> +          else if (peek_input () == '/'
> +                   && (peek_input (1) == '!' || peek_input (1) == '/'))
>              {
> +              /* single line doc comment, inner or outer. */
> +              bool is_inner = peek_input (1) == '!';
> +              skip_input (1);
> +              current_column += 3;
> +
> +              std::string str;
> +              str.reserve (32);
> +              current_char = peek_input ();
> +              while (current_char != '\n')
> +                {
> +                  skip_input ();
> +                  if (current_char == '\r')
> +                    {
> +                      char next_char = peek_input ();
> +                      if (next_char == '\n')
> +                        {
> +                          current_char = '\n';
> +                          break;
> +                        }
> +                      rust_error_at (
> +                        loc, "Isolated CR %<\\r%> not allowed in doc comment");
> +                      current_char = next_char;
> +                      continue;
> +                    }
> +                  if (current_char == EOF)
> +                    {
> +                      rust_error_at (
> +                        loc, "unexpected EOF while looking for end of comment");
> +                      break;
> +                    }
> +                  str += current_char;
> +                  current_char = peek_input ();
> +                }
> +              skip_input ();
> +              current_line++;
> +              current_column = 1;
> +              // tell line_table that new line starts
> +              line_map->start_line (current_line, max_column_hint);
> +
> +              str.shrink_to_fit ();
> +              if (is_inner)
> +                return Token::make_inner_doc_comment (loc, std::move (str));
> +              else
> +                return Token::make_outer_doc_comment (loc, std::move (str));
> +            }
> +
else if (peek_input () == '*' && peek_input (1) == '*'
> +                   && peek_input (2) == '/')
> +            {
> +              /**/
> +              skip_input (2);
> +              current_column += 4;
> +              continue;
> +            }
> +          else if (peek_input () == '*' && peek_input (1) == '*'
> +                   && peek_input (2) == '*' && peek_input (3) == '/')
> +            {
> +              /***/
> +              skip_input (3);
> +              current_column += 5;
> +              continue;
> +            }
> +          else if ((peek_input () == '*' && peek_input (1) != '!'
> +                    && peek_input (1) != '*')
> +                   || (peek_input () == '*' && peek_input (1) == '*'
> +                       && peek_input (2) == '*'))
> +            {
> +              // one /* or three /***
>                // block comment
> +              // (but not an inner or outer doc comment)
>                skip_input ();
>                current_column += 2;
>
> -              // TODO: block doc comments
> -
> -              current_char = peek_input ();
> -
>                int level = 1;
>                while (level > 0)
>                  {
> -                  skip_input ();
> -                  current_column++; // for error-handling
>                    current_char = peek_input ();
>
>                    if (current_char == EOF)
> @@ -496,6 +592,7 @@ Lexer::build_token ()
>                        current_column += 2;
>
>                        level += 1;
> +                      continue;
>                      }
>
>                    // ignore until */ is found
> @@ -505,16 +602,101 @@ Lexer::build_token ()
>                        skip_input (1);
>
>                        current_column += 2;
> -                      // should only break inner loop here - seems to do so
> -                      // break;
>
>                        level -= 1;
> +                      continue;
>                      }
> +
> +                  if (current_char == '\n')
> +                    {
> +                      skip_input ();
> +                      current_line++;
> +                      current_column = 1;
> +                      // tell line_table that new line starts
> +                      line_map->start_line (current_line, max_column_hint);
> +                      continue;
> +                    }
> +
> +                  skip_input ();
> +                  current_column++;
>                  }
>
>                // refresh new token
>                continue;
> -              break;
> +            }
> +          else if (peek_input () == '*'
> +                   && (peek_input (1) == '!' || peek_input (1) == '*'))
> +            {
> +              // block doc comment, inner /*!
or outer /**
> +              bool is_inner = peek_input (1) == '!';
> +              skip_input (1);
> +              current_column += 3;
> +
> +              std::string str;
> +              str.reserve (96);
> +
> +              int level = 1;
> +              while (level > 0)
> +                {
> +                  current_char = peek_input ();
> +
> +                  if (current_char == EOF)
> +                    {
> +                      rust_error_at (
> +                        loc, "unexpected EOF while looking for end of comment");
> +                      break;
> +                    }
> +
> +                  // if /* found
> +                  if (current_char == '/' && peek_input (1) == '*')
> +                    {
> +                      // skip /* characters
> +                      skip_input (1);
> +                      current_column += 2;
> +
> +                      level += 1;
> +                      str += "/*";
> +                      continue;
> +                    }
> +
> +                  // ignore until */ is found
> +                  if (current_char == '*' && peek_input (1) == '/')
> +                    {
> +                      // skip */ characters
> +                      skip_input (1);
> +                      current_column += 2;
> +
> +                      level -= 1;
> +                      if (level > 0)
> +                        str += "*/";
> +                      continue;
> +                    }
> +
> +                  if (current_char == '\r' && peek_input (1) != '\n')
> +                    rust_error_at (
> +                      loc, "Isolated CR %<\\r%> not allowed in doc comment");
> +
> +                  if (current_char == '\n')
> +                    {
> +                      skip_input ();
> +                      current_line++;
> +                      current_column = 1;
> +                      // tell line_table that new line starts
> +                      line_map->start_line (current_line, max_column_hint);
> +                      str += '\n';
> +                      continue;
> +                    }
> +
> +                  str += current_char;
> +                  skip_input ();
> +                  current_column++;
> +                }
> +
> +              str.shrink_to_fit ();
> +              if (is_inner)
> +                return Token::make_inner_doc_comment (loc, std::move (str));
> +              else
> +                return Token::make_outer_doc_comment (loc, std::move (str));
>              }
>            else
>              {
> diff --git a/gcc/rust/lex/rust-token.h b/gcc/rust/lex/rust-token.h
> index 771910119b7..1c397c839fd 100644
> --- a/gcc/rust/lex/rust-token.h
> +++ b/gcc/rust/lex/rust-token.h
> @@ -151,15 +151,10 @@ enum PrimitiveCoreType
>    RS_TOKEN (RIGHT_SQUARE, "]")                                      \
>    /* Macros */                                                      \
>    RS_TOKEN (DOLLAR_SIGN, "$")                                       \
> -  /* Comments */                                                    \
> -  RS_TOKEN (LINE_COMMENT, "//")                                     \
> -  RS_TOKEN (INNER_LINE_DOC, "//!")                                  \
> -  RS_TOKEN (OUTER_LINE_DOC, "///")                                  \
> -  RS_TOKEN (BLOCK_COMMENT_START, "/*")                              \
> -  RS_TOKEN (BLOCK_COMMENT_END, "*/")                                \
> -  RS_TOKEN (INNER_BLOCK_DOC_START, "/*!")                           \
> -  RS_TOKEN (OUTER_BLOCK_DOC_START,                                  \
> -            "/**") /* have "weak" union and 'static keywords? */    \
> +  /* Doc Comments */                                                \
> +  RS_TOKEN (INNER_DOC_COMMENT, "#![doc]")                           \
> +  RS_TOKEN (OUTER_DOC_COMMENT, "#[doc]")                            \
> +  /* have "weak" union and 'static keywords? */                     \
>                                                                      \
>    RS_TOKEN_KEYWORD (ABSTRACT, "abstract") /* unused */              \
>    RS_TOKEN_KEYWORD (AS, "as")                                       \
> @@ -368,6 +363,18 @@ public:
>      return TokenPtr (new Token (BYTE_STRING_LITERAL, locus, std::move (str)));
>    }
>
> +  // Makes and returns a new TokenPtr of type INNER_DOC_COMMENT.
> +  static TokenPtr make_inner_doc_comment (Location locus, std::string &&str)
> +  {
> +    return TokenPtr (new Token (INNER_DOC_COMMENT, locus, std::move (str)));
> +  }
> +
> +  // Makes and returns a new TokenPtr of type OUTER_DOC_COMMENT.
> +  static TokenPtr make_outer_doc_comment (Location locus, std::string &&str)
> +  {
> +    return TokenPtr (new Token (OUTER_DOC_COMMENT, locus, std::move (str)));
> +  }
> +
>    // Makes and returns a new TokenPtr of type LIFETIME.
>    static TokenPtr make_lifetime (Location locus, std::string &&str)
>    {
> diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
> index a8597fa401e..eedc76db43e 100644
> --- a/gcc/rust/parse/rust-parse-impl.h
> +++ b/gcc/rust/parse/rust-parse-impl.h
> @@ -434,8 +434,9 @@ Parser::parse_inner_attributes ()
>    AST::AttrVec inner_attributes;
>
>    // only try to parse it if it starts with "#!"
not only "#"
> -  while (lexer.peek_token ()->get_id () == HASH
> -         && lexer.peek_token (1)->get_id () == EXCLAM)
> +  while ((lexer.peek_token ()->get_id () == HASH
> +          && lexer.peek_token (1)->get_id () == EXCLAM)
> +         || lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
>      {
>        AST::Attribute inner_attr = parse_inner_attribute ();
>
> @@ -457,11 +458,33 @@ Parser::parse_inner_attributes ()
>    return inner_attributes;
>  }
>
> +// Parse a inner or outer doc comment into an doc attribute
> +template
> +AST::Attribute
> +Parser::parse_doc_comment ()
> +{
> +  const_TokenPtr token = lexer.peek_token ();
> +  Location locus = token->get_locus ();
> +  AST::SimplePathSegment segment ("doc", locus);
> +  std::vector segments;
> +  segments.push_back (std::move (segment));
> +  AST::SimplePath attr_path (std::move (segments), false, locus);
> +  AST::LiteralExpr lit_expr (token->get_str (), AST::Literal::STRING,
> +                             PrimitiveCoreType::CORETYPE_STR, {}, locus);
> +  std::unique_ptr attr_input (
> +    new AST::AttrInputLiteral (std::move (lit_expr)));
> +  lexer.skip_token ();
> +  return AST::Attribute (std::move (attr_path), std::move (attr_input), locus);
> +}
> +
>  // Parse a single inner attribute.
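As a rough model of what `parse_doc_comment` builds, a doc comment token can be read as sugar for a `doc` attribute whose value is the comment text as a string literal. The sketch below renders that equivalence as Rust source text. `DocKind`, `DocToken`, `to_attribute_source` and `check_examples` are hypothetical names used for illustration and are not gccrs types, and the escaping is deliberately limited to quotes and backslashes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the INNER_DOC_COMMENT / OUTER_DOC_COMMENT tokens.
enum class DocKind { Inner, Outer };

struct DocToken
{
  DocKind kind;
  std::string text; // comment text without the //!, ///, /*! or /** marker
};

// Render the attribute form that the doc comment corresponds to:
// inner docs become #![doc = "..."], outer docs become #[doc = "..."].
std::string to_attribute_source (const DocToken &tok)
{
  std::string out = tok.kind == DocKind::Inner ? "#![doc = \"" : "#[doc = \"";
  for (char c : tok.text)
    {
      // Escape only the characters that would end or corrupt the literal.
      if (c == '"' || c == '\\')
        out += '\\';
      out += c;
    }
  out += "\"]";
  return out;
}

void check_examples ()
{
  assert (to_attribute_source ({DocKind::Outer, " outer doc allowed"})
          == "#[doc = \" outer doc allowed\"]");
  assert (to_attribute_source ({DocKind::Inner, " inner doc allowed"})
          == "#![doc = \" inner doc allowed\"]");
}
```

The leading space of the comment text is kept on purpose, since the lexer in the patch stores everything after the marker unchanged.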
>  template
>  AST::Attribute
>  Parser::parse_inner_attribute ()
>  {
> +  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
> +    return parse_doc_comment ();
> +
>    if (lexer.peek_token ()->get_id () != HASH)
>      {
>        Error error (lexer.peek_token ()->get_locus (),
> @@ -1019,7 +1042,15 @@ Parser::parse_item (bool called_from_statement)
>    switch (t->get_id ())
>      {
>      case END_OF_FILE:
> -      // not necessarily an error
> +      // not necessarily an error, unless we just read outer
> +      // attributes which needs to be attached
> +      if (!outer_attrs.empty ())
> +        {
> +          Rust::AST::Attribute attr = outer_attrs.back ();
> +          Error error (attr.get_locus (),
> +                       "expected item after outer attribute or doc comment");
> +          add_error (std::move (error));
> +        }
>        return nullptr;
>      case PUB:
>      case MOD:
> @@ -1091,7 +1122,11 @@ Parser::parse_outer_attributes ()
>  {
>    AST::AttrVec outer_attributes;
>
> -  while (lexer.peek_token ()->get_id () == HASH)
> +  while (lexer.peek_token ()->get_id ()
> +           == HASH /* Can also be #!, which catches errors. */
> +         || lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT
> +         || lexer.peek_token ()->get_id ()
> +              == INNER_DOC_COMMENT) /* For error handling.
*/
>      {
>        AST::Attribute outer_attr = parse_outer_attribute ();
>
> @@ -1121,6 +1156,20 @@ template
>  AST::Attribute
>  Parser::parse_outer_attribute ()
>  {
> +  if (lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT)
> +    return parse_doc_comment ();
> +
> +  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
> +    {
> +      Error error (
> +        lexer.peek_token ()->get_locus (),
> +        "inner doc (% or %) only allowed at start of item "
> +        "and before any outer attribute or doc (%<#[%>, % or %)");
> +      add_error (std::move (error));
> +      lexer.skip_token ();
> +      return AST::Attribute::create_empty ();
> +    }
> +
>    /* OuterAttribute -> '#' '[' Attr ']' */
>
>    if (lexer.peek_token ()->get_id () != HASH)
> @@ -1134,12 +1183,13 @@ Parser::parse_outer_attribute ()
>        if (id == EXCLAM)
>          {
>            // this is inner attribute syntax, so throw error
> +          // inner attributes were either already parsed or not allowed here.
>            Error error (
>              lexer.peek_token ()->get_locus (),
>              "token % found, indicating inner attribute definition. Inner "
>              "attributes are not possible at this location");
>            add_error (std::move (error));
> -        } // TODO: are there any cases where this wouldn't be an error?
> +        }
>        return AST::Attribute::create_empty ();
>      }
>
> diff --git a/gcc/rust/parse/rust-parse.h b/gcc/rust/parse/rust-parse.h
> index bde2613f03d..1cd85eae8c2 100644
> --- a/gcc/rust/parse/rust-parse.h
> +++ b/gcc/rust/parse/rust-parse.h
> @@ -107,6 +107,7 @@ private:
>    AST::Attribute parse_outer_attribute ();
>    AST::Attribute parse_attribute_body ();
>    std::unique_ptr parse_attr_input ();
> +  AST::Attribute parse_doc_comment ();
>
>    // Path-related
>    AST::SimplePath parse_simple_path ();
> diff --git a/gcc/testsuite/rust/compile/bad_inner_doc.rs b/gcc/testsuite/rust/compile/bad_inner_doc.rs
> new file mode 100644
> index 00000000000..cfd166ce3ec
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/bad_inner_doc.rs
> @@ -0,0 +1,15 @@
> +pub fn main ()
> +{
> +  //!
inner doc allowed
> +  let _x = 42;
> +  // { dg-error "inner doc" "" { target *-*-* } .+1 }
> +  //! inner doc disallowed
> +  mod module
> +  {
> +    /*! inner doc allowed */
> +    /// outer doc allowed
> +    // { dg-error "inner doc" "" { target *-*-* } .+1 }
> +    /*! but inner doc not here */
> +    mod x { }
> +  }
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
> new file mode 100644
> index 00000000000..0ada77f69cf
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
> @@ -0,0 +1,3 @@
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +/** doc cr
> comment */
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
> new file mode 100644
> index 00000000000..7db35341bee
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
> @@ -0,0 +1,5 @@
> +pub fn main ()
> +{
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +  /*! doc cr
> comment */
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
> new file mode 100644
> index 00000000000..d75da75e218
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
> @@ -0,0 +1,5 @@
> +pub fn main ()
> +{
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +  //!
doc cr
> comment
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
> new file mode 100644
> index 00000000000..7b6ef989c30
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
> @@ -0,0 +1,3 @@
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +/// doc cr
> comment
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
> new file mode 100644
> index 00000000000..ab38ac69610
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
> @@ -0,0 +1,47 @@
> +// comment line not a doc
> +/* comment block not a doc */
> +
> +//! inner line comment for most outer crate
> +/*! inner block comment for most outer crate */
> +
> +// comment line not a doc
> +/* comment block not a doc */
> +
> +/// outer doc line for module
> +/** outer doc block for module */
> +pub mod module
> +{
> +  //! inner line doc
> +  //!! inner line doc!
> +  /*! inner block doc */
> +  /*!! inner block doc! */
> +
> +  // line comment
> +  /// outer line doc
> +  //// line comment
> +
> +  /* block comment */
> +  /** outer block doc */
> +  /*** block comment */
> +
> +  mod block_doc_comments
> +  {
> +    /* /* */ /** */ /*! */ */
> +    /*! /* */ /** */ /*! */ */
> +    /** /* */ /** */ /*! */ */
> +    mod item { }
> +  }
> +
> +  pub mod empty
> +  {
> +    //!
> +    /*!*/
> +    //
> +
> +    ///
> +    mod doc { }
> +    /**/
> +    /***/
> +  }
> +}
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
> new file mode 100644
> index 00000000000..3ea2cd01c8c
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
> @@ -0,0 +1,47 @@
> +// comment line not a doc
> +/* comment block not a doc */
> +
> +//! inner line comment for most outer crate
> +/*! inner block comment for most outer crate */
> +
> +// comment line not a doc
> +/* comment block not a doc */
> +
> +/// outer doc line for module
> +/** outer doc block for module */
> +pub mod module
> +{
> +  //! inner line doc
> +  //!! inner line doc!
> +  /*! inner block doc */
> +  /*!! inner block doc! */
> +
> +  // line comment
> +  /// outer line doc
> +  //// line comment
> +
> +  /* block comment */
> +  /** outer block doc */
> +  /*** block comment */
> +
> +  mod block_doc_comments
> +  {
> +    /* /* */ /** */ /*! */ */
> +    /*! /* */ /** */ /*! */ */
> +    /** /* */ /** */ /*! */ */
> +    mod item { }
> +  }
> +
> +  pub mod empty
> +  {
> +    //!
> +    /*!*/
> +    //
> +
> +    ///
> +    mod doc { }
> +    /**/
> +    /***/
> +  }
> +}
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
> new file mode 100644
> index 00000000000..9a1e090f330
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
> @@ -0,0 +1,2 @@
> +/* comment cr
> is allowed */
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
> new file mode 100644
> index 00000000000..4e921a225c2
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
> @@ -0,0 +1,2 @@
> +// comment cr
> is allowed
> +pub fn main () { }

Hi Mark,

This patch looks good to me. When I tried to apply it in order to merge
it, I got the following:

```
$ git am '[PATCH] Handle doc comment strings in lexer and parser.eml'
Applying: Handle doc comment strings in lexer and parser
error: corrupt patch at line 531
Patch failed at 0001 Handle doc comment strings in lexer and parser
hint: Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
```

I am not sure whether I have done something wrong. Do you have any
pointers?
Thanks

--Phil

--8cTF6R5AqY2EqNaCvIb8gG8BYr0t4Q7z4--

--hx5VOJ4Ra8xw1voEZu6XEefBqyer2IGkN--
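The patch rejects an isolated CR (a `\r` not followed by `\n`) inside doc comments while still accepting CR/LF line endings. A minimal sketch of that check on an in-memory buffer is given below; `find_isolated_cr` and `check_examples` are hypothetical helpers, not gccrs functions. Whether the bare CR bytes in the new test files are related to the `git am` error reported above is an untested assumption; running a scan like this over both the saved mail and the original patch would be one way to compare them.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Returns the byte offset of every '\r' in `buf` that is not immediately
// followed by '\n'. CR/LF pairs are treated as ordinary line endings.
std::vector<std::size_t> find_isolated_cr (std::string_view buf)
{
  std::vector<std::size_t> offsets;
  for (std::size_t i = 0; i < buf.size (); ++i)
    {
      if (buf[i] != '\r')
        continue;
      bool followed_by_lf = i + 1 < buf.size () && buf[i + 1] == '\n';
      if (!followed_by_lf)
        offsets.push_back (i);
    }
  return offsets;
}

void check_examples ()
{
  // CR/LF line endings are fine.
  assert (find_isolated_cr ("/// doc\r\npub fn main () { }\r\n").empty ());
  // A bare CR in the middle of a doc comment is reported.
  assert (find_isolated_cr ("/// doc cr\r comment\n").size () == 1);
}
```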