public inbox for gcc-rust@gcc.gnu.org
* [PATCH] Handle doc comment strings in lexer and parser
@ 2021-07-11 20:10 Mark Wielaard
  2021-07-12  8:09 ` Philip Herron
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2021-07-11 20:10 UTC (permalink / raw)
  To: gcc-rust; +Cc: Mark Wielaard

Remove (unused) comment related tokens and replace them with
INNER_DOC_COMMENT and OUTER_DOC_COMMENT tokens, which keep the comment
text as a string. These can be constructed with the new
make_inner_doc_comment and make_outer_doc_comment methods.

Make sure not to confuse doc strings with normal comments in the lexer
when detecting shebang lines. Both single-line //! comments and
/*! */ blocks are turned into INNER_DOC_COMMENT tokens, and both
single-line /// comments and /** */ blocks are turned into
OUTER_DOC_COMMENT tokens.
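The rules above can be summarized as: the doc-comment marker is exactly
three characters, and a fourth repeated character (//// , /*** ) or an
empty block (/**/) demotes it back to a plain comment. As a standalone
sketch of that classification (the helper name and enum are mine, not
taken from the patch):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the lexer rules described above:
// "//!" and "/*!" start inner doc comments, "///" and "/**" start
// outer doc comments -- except that "////", "/**/" and "/***" are
// plain comments again.
enum CommentKind { PLAIN_COMMENT, INNER_DOC, OUTER_DOC };

CommentKind
classify_comment (const std::string &src)
{
  if (src.compare (0, 2, "//") == 0)
    {
      if (src.compare (0, 3, "//!") == 0)
	return INNER_DOC;
      // "///" is outer doc, but "////" (or longer) is a plain comment.
      if (src.compare (0, 3, "///") == 0 && src.compare (0, 4, "////") != 0)
	return OUTER_DOC;
      return PLAIN_COMMENT;
    }
  if (src.compare (0, 2, "/*") == 0)
    {
      if (src.compare (0, 3, "/*!") == 0)
	return INNER_DOC;
      // "/**" is outer doc, but the empty "/**/" and "/***" are plain.
      if (src.compare (0, 3, "/**") == 0 && src.compare (0, 4, "/**/") != 0
	  && src.compare (0, 4, "/***") != 0)
	return OUTER_DOC;
      return PLAIN_COMMENT;
    }
  return PLAIN_COMMENT;
}
```

Note that only the third character matters, so //!! and /*!! still count
as inner doc comments (as exercised by the all_doc_comment tests below).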

Also fix some issues with cr/lf line endings, and keep the line map
correct when seeing \n in a comment.
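The cr/lf rule amounts to: inside a doc comment a CR is only valid when
immediately followed by LF. A minimal sketch of that check (the helper
name is mine, not from the patch, which reports the error inline while
lexing):

```cpp
#include <cassert>
#include <string>

// Return true if the text contains a CR that is not part of a CRLF
// pair -- the condition the lexer diagnoses as "Isolated CR".
bool
has_isolated_cr (const std::string &text)
{
  for (size_t i = 0; i < text.size (); i++)
    if (text[i] == '\r' && (i + 1 >= text.size () || text[i + 1] != '\n'))
      return true;
  return false;
}
```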

In the parser, handle INNER_DOC_COMMENT and OUTER_DOC_COMMENT tokens
where inner (#![]) and outer (#[]) attributes are handled. Add a
method parse_doc_comment which turns the tokens into a "doc"
Attribute with the string as a literal expression.
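To illustrate the desugaring, each doc comment token becomes the
equivalent of a doc attribute. This sketch only renders the surface
syntax and ignores string escaping; the helper is hypothetical, the
real parse_doc_comment builds an AST::Attribute instead of a string:

```cpp
#include <cassert>
#include <string>

// Render the attribute syntax a doc comment token corresponds to:
// inner doc comments become #![doc = "..."], outer ones #[doc = "..."].
std::string
doc_attribute (bool is_inner, const std::string &text)
{
  std::string attr = is_inner ? "#![doc = \"" : "#[doc = \"";
  attr += text;
  attr += "\"]";
  return attr;
}
```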

Add a get_locus method to the Attribute class for better error
reporting.

Tests are added for correctly placed and formatted doc strings, with
and without cr/lf line endings, as well as for incorrectly formatted
(isolated CRs) and badly placed inner doc strings. No tests check
handling of the actual doc attributes yet; that could be tested once
we add support for the #![warn(missing_docs)] attribute.
---
 gcc/rust/ast/rust-ast.h                       |   2 +
 gcc/rust/lex/rust-lex.cc                      | 214 ++++++++++++++++--
 gcc/rust/lex/rust-token.h                     |  25 +-
 gcc/rust/parse/rust-parse-impl.h              |  60 ++++-
 gcc/rust/parse/rust-parse.h                   |   1 +
 gcc/testsuite/rust/compile/bad_inner_doc.rs   |  15 ++
 .../compile/doc_isolated_cr_block_comment.rs  |   3 +
 .../doc_isolated_cr_inner_block_comment.rs    |   5 +
 .../doc_isolated_cr_inner_line_comment.rs     |   5 +
 .../compile/doc_isolated_cr_line_comment.rs   |   3 +
 .../torture/all_doc_comment_line_blocks.rs    |  47 ++++
 .../all_doc_comment_line_blocks_crlf.rs       |  47 ++++
 .../torture/isolated_cr_block_comment.rs      |   2 +
 .../torture/isolated_cr_line_comment.rs       |   2 +
 14 files changed, 401 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/rust/compile/bad_inner_doc.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs

diff --git a/gcc/rust/ast/rust-ast.h b/gcc/rust/ast/rust-ast.h
index 75b08f8aa66..3e3e185b9b5 100644
--- a/gcc/rust/ast/rust-ast.h
+++ b/gcc/rust/ast/rust-ast.h
@@ -455,6 +455,8 @@ public:
   // Returns whether the attribute is considered an "empty" attribute.
   bool is_empty () const { return attr_input == nullptr && path.is_empty (); }
 
+  Location get_locus () const { return locus; }
+
   /* e.g.:
       #![crate_type = "lib"]
       #[test]
diff --git a/gcc/rust/lex/rust-lex.cc b/gcc/rust/lex/rust-lex.cc
index 617dd69a080..0b8a8eae651 100644
--- a/gcc/rust/lex/rust-lex.cc
+++ b/gcc/rust/lex/rust-lex.cc
@@ -265,9 +265,16 @@ Lexer::build_token ()
 	      int next_char = peek_input (n);
 	      if (is_whitespace (next_char))
 		n++;
-	      else if (next_char == '/' && peek_input (n + 1) == '/')
+	      else if ((next_char == '/' && peek_input (n + 1) == '/'
+			&& peek_input (n + 2) != '!'
+			&& peek_input (n + 2) != '/')
+		       || (next_char == '/' && peek_input (n + 1) == '/'
+			   && peek_input (n + 2) == '/'
+			   && peek_input (n + 3) == '/'))
 		{
+		  // two // or four ////
 		  // A single line comment
+		  // (but not an inner or outer doc comment)
 		  n += 2;
 		  next_char = peek_input (n);
 		  while (next_char != '\n' && next_char != EOF)
@@ -278,9 +285,30 @@ Lexer::build_token ()
 		  if (next_char == '\n')
 		    n++;
 		}
-	      else if (next_char == '/' && peek_input (n + 1) == '*')
+	      else if (next_char == '/' && peek_input (n + 1) == '*'
+		       && peek_input (n + 2) == '*'
+		       && peek_input (n + 3) == '/')
 		{
+		  /**/
+		  n += 4;
+		}
+	      else if (next_char == '/' && peek_input (n + 1) == '*'
+		       && peek_input (n + 2) == '*' && peek_input (n + 3) == '*'
+		       && peek_input (n + 4) == '/')
+		{
+		  /***/
+		  n += 5;
+		}
+	      else if ((next_char == '/' && peek_input (n + 1) == '*'
+			&& peek_input (n + 2) != '*'
+			&& peek_input (n + 2) != '!')
+		       || (next_char == '/' && peek_input (n + 1) == '*'
+			   && peek_input (n + 2) == '*'
+			   && peek_input (n + 3) == '*'))
+		{
+		  // one /* or three /***
 		  // Start of a block comment
+		  // (but not an inner or outer doc comment)
 		  n += 2;
 		  int level = 1;
 		  while (level > 0)
@@ -339,6 +367,9 @@ Lexer::build_token ()
 	  // tell line_table that new line starts
 	  line_map->start_line (current_line, max_column_hint);
 	  continue;
+	case '\r': // cr
+	  // Ignore, we expect a newline (lf) soon.
+	  continue;
 	case ' ': // space
 	  current_column++;
 	  continue;
@@ -445,11 +476,14 @@ Lexer::build_token ()
 
 	      return Token::make (DIV_EQ, loc);
 	    }
-	  else if (peek_input () == '/')
+	  else if ((peek_input () == '/' && peek_input (1) != '!'
+		    && peek_input (1) != '/')
+		   || (peek_input () == '/' && peek_input (1) == '/'
+		       && peek_input (2) == '/'))
 	    {
-	      // TODO: single-line doc comments
-
+	      // two // or four ////
 	      // single line comment
+	      // (but not an inner or outer doc comment)
 	      skip_input ();
 	      current_column += 2;
 
@@ -461,23 +495,85 @@ Lexer::build_token ()
 		  current_char = peek_input ();
 		}
 	      continue;
-	      break;
 	    }
-	  else if (peek_input () == '*')
+	  else if (peek_input () == '/'
+		   && (peek_input (1) == '!' || peek_input (1) == '/'))
 	    {
+	      /* single line doc comment, inner or outer.  */
+	      bool is_inner = peek_input (1) == '!';
+	      skip_input (1);
+	      current_column += 3;
+
+	      std::string str;
+	      str.reserve (32);
+	      current_char = peek_input ();
+	      while (current_char != '\n')
+		{
+		  skip_input ();
+		  if (current_char == '\r')
+		    {
+		      char next_char = peek_input ();
+		      if (next_char == '\n')
+			{
+			  current_char = '\n';
+			  break;
+			}
+		      rust_error_at (
+			loc, "Isolated CR %<\\r%> not allowed in doc comment");
+		      current_char = next_char;
+		      continue;
+		    }
+		  if (current_char == EOF)
+		    {
+		      rust_error_at (
+			loc, "unexpected EOF while looking for end of comment");
+		      break;
+		    }
+		  str += current_char;
+		  current_char = peek_input ();
+		}
+	      skip_input ();
+	      current_line++;
+	      current_column = 1;
+	      // tell line_table that new line starts
+	      line_map->start_line (current_line, max_column_hint);
+
+	      str.shrink_to_fit ();
+	      if (is_inner)
+		return Token::make_inner_doc_comment (loc, std::move (str));
+	      else
+		return Token::make_outer_doc_comment (loc, std::move (str));
+	    }
+	  else if (peek_input () == '*' && peek_input (1) == '*'
+		   && peek_input (2) == '/')
+	    {
+	      /**/
+	      skip_input (2);
+	      current_column += 4;
+	      continue;
+	    }
+	  else if (peek_input () == '*' && peek_input (1) == '*'
+		   && peek_input (2) == '*' && peek_input (3) == '/')
+	    {
+	      /***/
+	      skip_input (3);
+	      current_column += 5;
+	      continue;
+	    }
+	  else if ((peek_input () == '*' && peek_input (1) != '!'
+		    && peek_input (1) != '*')
+		   || (peek_input () == '*' && peek_input (1) == '*'
+		       && peek_input (2) == '*'))
+	    {
+	      // one /* or three /***
 	      // block comment
+	      // (but not an inner or outer doc comment)
 	      skip_input ();
 	      current_column += 2;
 
-	      // TODO: block doc comments
-
-	      current_char = peek_input ();
-
 	      int level = 1;
 	      while (level > 0)
 		{
-		  skip_input ();
-		  current_column++; // for error-handling
 		  current_char = peek_input ();
 
 		  if (current_char == EOF)
@@ -496,6 +592,7 @@ Lexer::build_token ()
 		      current_column += 2;
 
 		      level += 1;
+		      continue;
 		    }
 
 		  // ignore until */ is found
@@ -505,16 +602,101 @@ Lexer::build_token ()
 		      skip_input (1);
 
 		      current_column += 2;
-		      // should only break inner loop here - seems to do so
-		      // break;
 
 		      level -= 1;
+		      continue;
 		    }
+
+		  if (current_char == '\n')
+		    {
+		      skip_input ();
+		      current_line++;
+		      current_column = 1;
+		      // tell line_table that new line starts
+		      line_map->start_line (current_line, max_column_hint);
+		      continue;
+		    }
+
+		  skip_input ();
+		  current_column++;
 		}
 
 	      // refresh new token
 	      continue;
-	      break;
+	    }
+	  else if (peek_input () == '*'
+		   && (peek_input (1) == '!' || peek_input (1) == '*'))
+	    {
+	      // block doc comment, inner /*! or outer /**
+	      bool is_inner = peek_input (1) == '!';
+	      skip_input (1);
+	      current_column += 3;
+
+	      std::string str;
+	      str.reserve (96);
+
+	      int level = 1;
+	      while (level > 0)
+		{
+		  current_char = peek_input ();
+
+		  if (current_char == EOF)
+		    {
+		      rust_error_at (
+			loc, "unexpected EOF while looking for end of comment");
+		      break;
+		    }
+
+		  // if /* found
+		  if (current_char == '/' && peek_input (1) == '*')
+		    {
+		      // skip /* characters
+		      skip_input (1);
+		      current_column += 2;
+
+		      level += 1;
+		      str += "/*";
+		      continue;
+		    }
+
+		  // ignore until */ is found
+		  if (current_char == '*' && peek_input (1) == '/')
+		    {
+		      // skip */ characters
+		      skip_input (1);
+		      current_column += 2;
+
+		      level -= 1;
+		      if (level > 0)
+			str += "*/";
+		      continue;
+		    }
+
+		  if (current_char == '\r' && peek_input (1) != '\n')
+		    rust_error_at (
+		      loc, "Isolated CR %<\\r%> not allowed in doc comment");
+
+		  if (current_char == '\n')
+		    {
+		      skip_input ();
+		      current_line++;
+		      current_column = 1;
+		      // tell line_table that new line starts
+		      line_map->start_line (current_line, max_column_hint);
+		      str += '\n';
+		      continue;
+		    }
+
+		  str += current_char;
+		  skip_input ();
+		  current_column++;
+		}
+
+	      str.shrink_to_fit ();
+	      if (is_inner)
+		return Token::make_inner_doc_comment (loc, std::move (str));
+	      else
+		return Token::make_outer_doc_comment (loc, std::move (str));
 	    }
 	  else
 	    {
diff --git a/gcc/rust/lex/rust-token.h b/gcc/rust/lex/rust-token.h
index 771910119b7..1c397c839fd 100644
--- a/gcc/rust/lex/rust-token.h
+++ b/gcc/rust/lex/rust-token.h
@@ -151,15 +151,10 @@ enum PrimitiveCoreType
   RS_TOKEN (RIGHT_SQUARE, "]")                                                 \
   /* Macros */                                                                 \
   RS_TOKEN (DOLLAR_SIGN, "$")                                                  \
-  /* Comments */                                                               \
-  RS_TOKEN (LINE_COMMENT, "//")                                                \
-  RS_TOKEN (INNER_LINE_DOC, "//!")                                             \
-  RS_TOKEN (OUTER_LINE_DOC, "///")                                             \
-  RS_TOKEN (BLOCK_COMMENT_START, "/*")                                         \
-  RS_TOKEN (BLOCK_COMMENT_END, "*/")                                           \
-  RS_TOKEN (INNER_BLOCK_DOC_START, "/*!")                                      \
-  RS_TOKEN (OUTER_BLOCK_DOC_START,                                             \
-	    "/**") /* have "weak" union and 'static keywords? */               \
+  /* Doc Comments */                                                           \
+  RS_TOKEN (INNER_DOC_COMMENT, "#![doc]")                                      \
+  RS_TOKEN (OUTER_DOC_COMMENT, "#[doc]")                                       \
+  /* have "weak" union and 'static keywords? */                                \
                                                                                \
   RS_TOKEN_KEYWORD (ABSTRACT, "abstract") /* unused */                         \
   RS_TOKEN_KEYWORD (AS, "as")                                                  \
@@ -368,6 +363,18 @@ public:
     return TokenPtr (new Token (BYTE_STRING_LITERAL, locus, std::move (str)));
   }
 
+  // Makes and returns a new TokenPtr of type INNER_DOC_COMMENT.
+  static TokenPtr make_inner_doc_comment (Location locus, std::string &&str)
+  {
+    return TokenPtr (new Token (INNER_DOC_COMMENT, locus, std::move (str)));
+  }
+
+  // Makes and returns a new TokenPtr of type OUTER_DOC_COMMENT.
+  static TokenPtr make_outer_doc_comment (Location locus, std::string &&str)
+  {
+    return TokenPtr (new Token (OUTER_DOC_COMMENT, locus, std::move (str)));
+  }
+
   // Makes and returns a new TokenPtr of type LIFETIME.
   static TokenPtr make_lifetime (Location locus, std::string &&str)
   {
diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
index a8597fa401e..eedc76db43e 100644
--- a/gcc/rust/parse/rust-parse-impl.h
+++ b/gcc/rust/parse/rust-parse-impl.h
@@ -434,8 +434,9 @@ Parser<ManagedTokenSource>::parse_inner_attributes ()
   AST::AttrVec inner_attributes;
 
   // only try to parse it if it starts with "#!" not only "#"
-  while (lexer.peek_token ()->get_id () == HASH
-	 && lexer.peek_token (1)->get_id () == EXCLAM)
+  while ((lexer.peek_token ()->get_id () == HASH
+	  && lexer.peek_token (1)->get_id () == EXCLAM)
+	 || lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
     {
       AST::Attribute inner_attr = parse_inner_attribute ();
 
@@ -457,11 +458,33 @@ Parser<ManagedTokenSource>::parse_inner_attributes ()
   return inner_attributes;
 }
 
+// Parse an inner or outer doc comment into a doc attribute
+template <typename ManagedTokenSource>
+AST::Attribute
+Parser<ManagedTokenSource>::parse_doc_comment ()
+{
+  const_TokenPtr token = lexer.peek_token ();
+  Location locus = token->get_locus ();
+  AST::SimplePathSegment segment ("doc", locus);
+  std::vector<AST::SimplePathSegment> segments;
+  segments.push_back (std::move (segment));
+  AST::SimplePath attr_path (std::move (segments), false, locus);
+  AST::LiteralExpr lit_expr (token->get_str (), AST::Literal::STRING,
+			     PrimitiveCoreType::CORETYPE_STR, {}, locus);
+  std::unique_ptr<AST::AttrInput> attr_input (
+    new AST::AttrInputLiteral (std::move (lit_expr)));
+  lexer.skip_token ();
+  return AST::Attribute (std::move (attr_path), std::move (attr_input), locus);
+}
+
 // Parse a single inner attribute.
 template <typename ManagedTokenSource>
 AST::Attribute
 Parser<ManagedTokenSource>::parse_inner_attribute ()
 {
+  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
+    return parse_doc_comment ();
+
   if (lexer.peek_token ()->get_id () != HASH)
     {
       Error error (lexer.peek_token ()->get_locus (),
@@ -1019,7 +1042,15 @@ Parser<ManagedTokenSource>::parse_item (bool called_from_statement)
   switch (t->get_id ())
     {
     case END_OF_FILE:
-      // not necessarily an error
+      // not necessarily an error, unless we just read outer
+      // attributes which need to be attached
+      if (!outer_attrs.empty ())
+	{
+	  Rust::AST::Attribute attr = outer_attrs.back ();
+	  Error error (attr.get_locus (),
+		       "expected item after outer attribute or doc comment");
+	  add_error (std::move (error));
+	}
       return nullptr;
     case PUB:
     case MOD:
@@ -1091,7 +1122,11 @@ Parser<ManagedTokenSource>::parse_outer_attributes ()
 {
   AST::AttrVec outer_attributes;
 
-  while (lexer.peek_token ()->get_id () == HASH)
+  while (lexer.peek_token ()->get_id ()
+	   == HASH /* Can also be #!, which catches errors.  */
+	 || lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT
+	 || lexer.peek_token ()->get_id ()
+	      == INNER_DOC_COMMENT) /* For error handling.  */
     {
       AST::Attribute outer_attr = parse_outer_attribute ();
 
@@ -1121,6 +1156,20 @@ template <typename ManagedTokenSource>
 AST::Attribute
 Parser<ManagedTokenSource>::parse_outer_attribute ()
 {
+  if (lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT)
+    return parse_doc_comment ();
+
+  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
+    {
+      Error error (
+	lexer.peek_token ()->get_locus (),
+	"inner doc (%<//!%> or %</*!%>) only allowed at start of item "
+	"and before any outer attribute or doc (%<#[%>, %<///%> or %</**%>)");
+      add_error (std::move (error));
+      lexer.skip_token ();
+      return AST::Attribute::create_empty ();
+    }
+
   /* OuterAttribute -> '#' '[' Attr ']' */
 
   if (lexer.peek_token ()->get_id () != HASH)
@@ -1134,12 +1183,13 @@ Parser<ManagedTokenSource>::parse_outer_attribute ()
       if (id == EXCLAM)
 	{
 	  // this is inner attribute syntax, so throw error
+	  // inner attributes were either already parsed or not allowed here.
 	  Error error (
 	    lexer.peek_token ()->get_locus (),
 	    "token %<!%> found, indicating inner attribute definition. Inner "
 	    "attributes are not possible at this location");
 	  add_error (std::move (error));
-	} // TODO: are there any cases where this wouldn't be an error?
+	}
       return AST::Attribute::create_empty ();
     }
 
diff --git a/gcc/rust/parse/rust-parse.h b/gcc/rust/parse/rust-parse.h
index bde2613f03d..1cd85eae8c2 100644
--- a/gcc/rust/parse/rust-parse.h
+++ b/gcc/rust/parse/rust-parse.h
@@ -107,6 +107,7 @@ private:
   AST::Attribute parse_outer_attribute ();
   AST::Attribute parse_attribute_body ();
   std::unique_ptr<AST::AttrInput> parse_attr_input ();
+  AST::Attribute parse_doc_comment ();
 
   // Path-related
   AST::SimplePath parse_simple_path ();
diff --git a/gcc/testsuite/rust/compile/bad_inner_doc.rs b/gcc/testsuite/rust/compile/bad_inner_doc.rs
new file mode 100644
index 00000000000..cfd166ce3ec
--- /dev/null
+++ b/gcc/testsuite/rust/compile/bad_inner_doc.rs
@@ -0,0 +1,15 @@
+pub fn main ()
+{
+  //! inner doc allowed
+  let _x = 42;
+  // { dg-error "inner doc" "" { target *-*-* } .+1 }
+  //! inner doc disallowed
+  mod module
+  {
+    /*! inner doc allowed */
+    /// outer doc allowed
+    // { dg-error "inner doc" "" { target *-*-* } .+1 }
+    /*! but inner doc not here */
+    mod x { }
+  }
+}
diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
new file mode 100644
index 00000000000..0ada77f69cf
--- /dev/null
+++ b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
@@ -0,0 +1,3 @@
+// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
+/** doc cr\r comment */
+pub fn main () { }
diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
new file mode 100644
index 00000000000..7db35341bee
--- /dev/null
+++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
@@ -0,0 +1,5 @@
+pub fn main ()
+{
+// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
+  /*! doc cr\r comment */
+}
diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
new file mode 100644
index 00000000000..d75da75e218
--- /dev/null
+++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
@@ -0,0 +1,5 @@
+pub fn main ()
+{
+// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
+  //! doc cr\r comment
+}
diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
new file mode 100644
index 00000000000..7b6ef989c30
--- /dev/null
+++ b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
@@ -0,0 +1,3 @@
+// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
+/// doc cr\r comment
+pub fn main () { }
diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
new file mode 100644
index 00000000000..ab38ac69610
--- /dev/null
+++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
@@ -0,0 +1,47 @@
+// comment line not a doc
+/* comment block not a doc                   */
+
+//! inner line comment for most outer crate
+/*! inner block comment for most outer crate */
+
+// comment line not a doc
+/* comment block not a doc                   */
+
+/// outer doc line for module
+/** outer doc block for module               */
+pub mod module
+{
+  //!  inner line doc
+  //!! inner line doc!
+  /*!  inner block doc  */
+  /*!! inner block doc! */
+
+  //   line comment
+  ///  outer line doc
+  //// line comment
+
+  /*   block comment   */
+  /**  outer block doc */
+  /*** block comment   */
+
+  mod block_doc_comments
+  {
+    /*   /* */  /** */  /*! */  */
+    /*!  /* */  /** */  /*! */  */
+    /**  /* */  /** */  /*! */  */
+    mod item { }
+  }
+
+  pub mod empty
+  {
+    //!
+    /*!*/
+    //
+
+    ///
+    mod doc { }
+    /**/
+    /***/
+  }
+}
+pub fn main () { }
diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
new file mode 100644
index 00000000000..3ea2cd01c8c
--- /dev/null
+++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
@@ -0,0 +1,47 @@
+// comment line not a doc
+/* comment block not a doc                   */
+
+//! inner line comment for most outer crate
+/*! inner block comment for most outer crate */
+
+// comment line not a doc
+/* comment block not a doc                   */
+
+/// outer doc line for module
+/** outer doc block for module               */
+pub mod module
+{
+  //!  inner line doc
+  //!! inner line doc!
+  /*!  inner block doc  */
+  /*!! inner block doc! */
+
+  //   line comment
+  ///  outer line doc
+  //// line comment
+
+  /*   block comment   */
+  /**  outer block doc */
+  /*** block comment   */
+
+  mod block_doc_comments
+  {
+    /*   /* */  /** */  /*! */  */
+    /*!  /* */  /** */  /*! */  */
+    /**  /* */  /** */  /*! */  */
+    mod item { }
+  }
+
+  pub mod empty
+  {
+    //!
+    /*!*/
+    //
+
+    ///
+    mod doc { }
+    /**/
+    /***/
+  }
+}
+pub fn main () { }
diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
new file mode 100644
index 00000000000..9a1e090f330
--- /dev/null
+++ b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
@@ -0,0 +1,2 @@
+/* comment cr\r is allowed */
+pub fn main () { }
diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
new file mode 100644
index 00000000000..4e921a225c2
--- /dev/null
+++ b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
@@ -0,0 +1,2 @@
+// comment cr\r is allowed
+pub fn main () { }
-- 
2.32.0



* Re: [PATCH] Handle doc comment strings in lexer and parser
  2021-07-11 20:10 [PATCH] Handle doc comment strings in lexer and parser Mark Wielaard
@ 2021-07-12  8:09 ` Philip Herron
  2021-07-12  8:32   ` Mark Wielaard
  0 siblings, 1 reply; 10+ messages in thread
From: Philip Herron @ 2021-07-12  8:09 UTC (permalink / raw)
  To: gcc-rust


[-- Attachment #1.1: Type: text/plain, Size: 26382 bytes --]

On 11/07/2021 21:10, Mark Wielaard wrote:
> Remove (unused) comment related tokens and replace them with
> INNER_DOC_COMMENT and OUTER_DOC_COMMENT tokens, which keep the comment
> text as a string. These can be constructed with the new
> make_inner_doc_comment and make_outer_doc_comment methods.
>
> Make sure to not confuse doc strings with normal comments in the lexer
> when detecting shebang lines. Both single line //! and /*! */ blocks
> are turned into INNER_DOC_COMMENT tokens. And both single line /// and
> /** */ blocks are turned into OUTER_DOC_COMMENT tokens.
>
> Also fixes some issues with cr/lf line endings and keeping the line
> map correct when seeing \n in a comment.
>
> In the parser handle INNER_DOC_COMMENT and OUTER_DOC_COMMENTS where
> inner (#[]) and outer (#![]) attributes are handled. Add a method
> parse_doc_comment which turns the tokens into an "doc" Attribute with
> the string as literal expression.
>
> Add get_locus method to Attribute class for better error reporting.
>
> Tests are added for correctly placed and formatted doc strings, with
> or without cr/lf line endings. Incorrect formatted (isolated CRs) doc
> strings and badly placed inner doc strings. No tests add handling of
> the actual doc attributes yet. These could be tested once we add
> support for the #![warn(missing_docs)] attribute.
> [...]
> +		  skip_input ();
> +		  current_column++;
>  		}
>  
>  	      // refresh new token
>  	      continue;
> -	      break;
> +	    }
> +	  else if (peek_input () == '*'
> +		   && (peek_input (1) == '!' || peek_input (1) == '*'))
> +	    {
> +	      // block doc comment, inner /*! or outer /**
> +	      bool is_inner = peek_input (1) == '!';
> +	      skip_input (1);
> +	      current_column += 3;
> +
> +	      std::string str;
> +	      str.reserve (96);
> +
> +	      int level = 1;
> +	      while (level > 0)
> +		{
> +		  current_char = peek_input ();
> +
> +		  if (current_char == EOF)
> +		    {
> +		      rust_error_at (
> +			loc, "unexpected EOF while looking for end of comment");
> +		      break;
> +		    }
> +
> +		  // if /* found
> +		  if (current_char == '/' && peek_input (1) == '*')
> +		    {
> +		      // skip /* characters
> +		      skip_input (1);
> +		      current_column += 2;
> +
> +		      level += 1;
> +		      str += "/*";
> +		      continue;
> +		    }
> +
> +		  // ignore until */ is found
> +		  if (current_char == '*' && peek_input (1) == '/')
> +		    {
> +		      // skip */ characters
> +		      skip_input (1);
> +		      current_column += 2;
> +
> +		      level -= 1;
> +		      if (level > 0)
> +			str += "*/";
> +		      continue;
> +		    }
> +
> +		  if (current_char == '\r' && peek_input (1) != '\n')
> +		    rust_error_at (
> +		      loc, "Isolated CR %<\\r%> not allowed in doc comment");
> +
> +		  if (current_char == '\n')
> +		    {
> +		      skip_input ();
> +		      current_line++;
> +		      current_column = 1;
> +		      // tell line_table that new line starts
> +		      line_map->start_line (current_line, max_column_hint);
> +		      str += '\n';
> +		      continue;
> +		    }
> +
> +		  str += current_char;
> +		  skip_input ();
> +		  current_column++;
> +		}
> +
> +	      str.shrink_to_fit ();
> +	      if (is_inner)
> +		return Token::make_inner_doc_comment (loc, std::move (str));
> +	      else
> +		return Token::make_outer_doc_comment (loc, std::move (str));
>  	    }
>  	  else
>  	    {
> diff --git a/gcc/rust/lex/rust-token.h b/gcc/rust/lex/rust-token.h
> index 771910119b7..1c397c839fd 100644
> --- a/gcc/rust/lex/rust-token.h
> +++ b/gcc/rust/lex/rust-token.h
> @@ -151,15 +151,10 @@ enum PrimitiveCoreType
>    RS_TOKEN (RIGHT_SQUARE, "]")                                                 \
>    /* Macros */                                                                 \
>    RS_TOKEN (DOLLAR_SIGN, "$")                                                  \
> -  /* Comments */                                                               \
> -  RS_TOKEN (LINE_COMMENT, "//")                                                \
> -  RS_TOKEN (INNER_LINE_DOC, "//!")                                             \
> -  RS_TOKEN (OUTER_LINE_DOC, "///")                                             \
> -  RS_TOKEN (BLOCK_COMMENT_START, "/*")                                         \
> -  RS_TOKEN (BLOCK_COMMENT_END, "*/")                                           \
> -  RS_TOKEN (INNER_BLOCK_DOC_START, "/*!")                                      \
> -  RS_TOKEN (OUTER_BLOCK_DOC_START,                                             \
> -	    "/**") /* have "weak" union and 'static keywords? */               \
> +  /* Doc Comments */                                                           \
> +  RS_TOKEN (INNER_DOC_COMMENT, "#![doc]")                                      \
> +  RS_TOKEN (OUTER_DOC_COMMENT, "#[doc]")                                       \
> +  /* have "weak" union and 'static keywords? */                                \
>                                                                                 \
>    RS_TOKEN_KEYWORD (ABSTRACT, "abstract") /* unused */                         \
>    RS_TOKEN_KEYWORD (AS, "as")                                                  \
> @@ -368,6 +363,18 @@ public:
>      return TokenPtr (new Token (BYTE_STRING_LITERAL, locus, std::move (str)));
>    }
>  
> +  // Makes and returns a new TokenPtr of type INNER_DOC_COMMENT.
> +  static TokenPtr make_inner_doc_comment (Location locus, std::string &&str)
> +  {
> +    return TokenPtr (new Token (INNER_DOC_COMMENT, locus, std::move (str)));
> +  }
> +
> +  // Makes and returns a new TokenPtr of type OUTER_DOC_COMMENT.
> +  static TokenPtr make_outer_doc_comment (Location locus, std::string &&str)
> +  {
> +    return TokenPtr (new Token (OUTER_DOC_COMMENT, locus, std::move (str)));
> +  }
> +
>    // Makes and returns a new TokenPtr of type LIFETIME.
>    static TokenPtr make_lifetime (Location locus, std::string &&str)
>    {
> diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
> index a8597fa401e..eedc76db43e 100644
> --- a/gcc/rust/parse/rust-parse-impl.h
> +++ b/gcc/rust/parse/rust-parse-impl.h
> @@ -434,8 +434,9 @@ Parser<ManagedTokenSource>::parse_inner_attributes ()
>    AST::AttrVec inner_attributes;
>  
>    // only try to parse it if it starts with "#!" not only "#"
> -  while (lexer.peek_token ()->get_id () == HASH
> -	 && lexer.peek_token (1)->get_id () == EXCLAM)
> +  while ((lexer.peek_token ()->get_id () == HASH
> +	  && lexer.peek_token (1)->get_id () == EXCLAM)
> +	 || lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
>      {
>        AST::Attribute inner_attr = parse_inner_attribute ();
>  
> @@ -457,11 +458,33 @@ Parser<ManagedTokenSource>::parse_inner_attributes ()
>    return inner_attributes;
>  }
>  
> +// Parse a inner or outer doc comment into an doc attribute
> +template <typename ManagedTokenSource>
> +AST::Attribute
> +Parser<ManagedTokenSource>::parse_doc_comment ()
> +{
> +  const_TokenPtr token = lexer.peek_token ();
> +  Location locus = token->get_locus ();
> +  AST::SimplePathSegment segment ("doc", locus);
> +  std::vector<AST::SimplePathSegment> segments;
> +  segments.push_back (std::move (segment));
> +  AST::SimplePath attr_path (std::move (segments), false, locus);
> +  AST::LiteralExpr lit_expr (token->get_str (), AST::Literal::STRING,
> +			     PrimitiveCoreType::CORETYPE_STR, {}, locus);
> +  std::unique_ptr<AST::AttrInput> attr_input (
> +    new AST::AttrInputLiteral (std::move (lit_expr)));
> +  lexer.skip_token ();
> +  return AST::Attribute (std::move (attr_path), std::move (attr_input), locus);
> +}
> +
>  // Parse a single inner attribute.
>  template <typename ManagedTokenSource>
>  AST::Attribute
>  Parser<ManagedTokenSource>::parse_inner_attribute ()
>  {
> +  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
> +    return parse_doc_comment ();
> +
>    if (lexer.peek_token ()->get_id () != HASH)
>      {
>        Error error (lexer.peek_token ()->get_locus (),
> @@ -1019,7 +1042,15 @@ Parser<ManagedTokenSource>::parse_item (bool called_from_statement)
>    switch (t->get_id ())
>      {
>      case END_OF_FILE:
> -      // not necessarily an error
> +      // not necessarily an error, unless we just read outer
> +      // attributes which needs to be attached
> +      if (!outer_attrs.empty ())
> +	{
> +	  Rust::AST::Attribute attr = outer_attrs.back ();
> +	  Error error (attr.get_locus (),
> +		       "expected item after outer attribute or doc comment");
> +	  add_error (std::move (error));
> +	}
>        return nullptr;
>      case PUB:
>      case MOD:
> @@ -1091,7 +1122,11 @@ Parser<ManagedTokenSource>::parse_outer_attributes ()
>  {
>    AST::AttrVec outer_attributes;
>  
> -  while (lexer.peek_token ()->get_id () == HASH)
> +  while (lexer.peek_token ()->get_id ()
> +	   == HASH /* Can also be #!, which catches errors.  */
> +	 || lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT
> +	 || lexer.peek_token ()->get_id ()
> +	      == INNER_DOC_COMMENT) /* For error handling.  */
>      {
>        AST::Attribute outer_attr = parse_outer_attribute ();
>  
> @@ -1121,6 +1156,20 @@ template <typename ManagedTokenSource>
>  AST::Attribute
>  Parser<ManagedTokenSource>::parse_outer_attribute ()
>  {
> +  if (lexer.peek_token ()->get_id () == OUTER_DOC_COMMENT)
> +    return parse_doc_comment ();
> +
> +  if (lexer.peek_token ()->get_id () == INNER_DOC_COMMENT)
> +    {
> +      Error error (
> +	lexer.peek_token ()->get_locus (),
> +	"inner doc (%<//!%> or %</*!%>) only allowed at start of item "
> +	"and before any outer attribute or doc (%<#[%>, %<///%> or %</**%>)");
> +      add_error (std::move (error));
> +      lexer.skip_token ();
> +      return AST::Attribute::create_empty ();
> +    }
> +
>    /* OuterAttribute -> '#' '[' Attr ']' */
>  
>    if (lexer.peek_token ()->get_id () != HASH)
> @@ -1134,12 +1183,13 @@ Parser<ManagedTokenSource>::parse_outer_attribute ()
>        if (id == EXCLAM)
>  	{
>  	  // this is inner attribute syntax, so throw error
> +	  // inner attributes were either already parsed or not allowed here.
>  	  Error error (
>  	    lexer.peek_token ()->get_locus (),
>  	    "token %<!%> found, indicating inner attribute definition. Inner "
>  	    "attributes are not possible at this location");
>  	  add_error (std::move (error));
> -	} // TODO: are there any cases where this wouldn't be an error?
> +	}
>        return AST::Attribute::create_empty ();
>      }
>  
> diff --git a/gcc/rust/parse/rust-parse.h b/gcc/rust/parse/rust-parse.h
> index bde2613f03d..1cd85eae8c2 100644
> --- a/gcc/rust/parse/rust-parse.h
> +++ b/gcc/rust/parse/rust-parse.h
> @@ -107,6 +107,7 @@ private:
>    AST::Attribute parse_outer_attribute ();
>    AST::Attribute parse_attribute_body ();
>    std::unique_ptr<AST::AttrInput> parse_attr_input ();
> +  AST::Attribute parse_doc_comment ();
>  
>    // Path-related
>    AST::SimplePath parse_simple_path ();
> diff --git a/gcc/testsuite/rust/compile/bad_inner_doc.rs b/gcc/testsuite/rust/compile/bad_inner_doc.rs
> new file mode 100644
> index 00000000000..cfd166ce3ec
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/bad_inner_doc.rs
> @@ -0,0 +1,15 @@
> +pub fn main ()
> +{
> +  //! inner doc allowed
> +  let _x = 42;
> +  // { dg-error "inner doc" "" { target *-*-* } .+1 }
> +  //! inner doc disallowed
> +  mod module
> +  {
> +    /*! inner doc allowed */
> +    /// outer doc allowed
> +    // { dg-error "inner doc" "" { target *-*-* } .+1 }
> +    /*! but inner doc not here */
> +    mod x { }
> +  }
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
> new file mode 100644
> index 00000000000..0ada77f69cf
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
> @@ -0,0 +1,3 @@
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +/** doc cr
>  comment */
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
> new file mode 100644
> index 00000000000..7db35341bee
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
> @@ -0,0 +1,5 @@
> +pub fn main ()
> +{
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +  /*! doc cr
>  comment */
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
> new file mode 100644
> index 00000000000..d75da75e218
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
> @@ -0,0 +1,5 @@
> +pub fn main ()
> +{
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +  //! doc cr
>  comment
> +}
> diff --git a/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
> new file mode 100644
> index 00000000000..7b6ef989c30
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
> @@ -0,0 +1,3 @@
> +// { dg-error "Isolated CR" "" { target *-*-* } .+1 }
> +/// doc cr
>  comment
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
> new file mode 100644
> index 00000000000..ab38ac69610
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
> @@ -0,0 +1,47 @@
> +// comment line not a doc
> +/* comment block not a doc                   */
> +
> +//! inner line comment for most outer crate
> +/*! inner block comment for most outer crate */
> +
> +// comment line not a doc
> +/* comment block not a doc                   */
> +
> +/// outer doc line for module
> +/** outer doc block for module               */
> +pub mod module
> +{
> +  //!  inner line doc
> +  //!! inner line doc!
> +  /*!  inner block doc  */
> +  /*!! inner block doc! */
> +
> +  //   line comment
> +  ///  outer line doc
> +  //// line comment
> +
> +  /*   block comment   */
> +  /**  outer block doc */
> +  /*** block comment   */
> +
> +  mod block_doc_comments
> +  {
> +    /*   /* */  /** */  /*! */  */
> +    /*!  /* */  /** */  /*! */  */
> +    /**  /* */  /** */  /*! */  */
> +    mod item { }
> +  }
> +
> +  pub mod empty
> +  {
> +    //!
> +    /*!*/
> +    //
> +
> +    ///
> +    mod doc { }
> +    /**/
> +    /***/
> +  }
> +}
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
> new file mode 100644
> index 00000000000..3ea2cd01c8c
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
> @@ -0,0 +1,47 @@
> +// comment line not a doc
> +/* comment block not a doc                   */
> +
> +//! inner line comment for most outer crate
> +/*! inner block comment for most outer crate */
> +
> +// comment line not a doc
> +/* comment block not a doc                   */
> +
> +/// outer doc line for module
> +/** outer doc block for module               */
> +pub mod module
> +{
> +  //!  inner line doc
> +  //!! inner line doc!
> +  /*!  inner block doc  */
> +  /*!! inner block doc! */
> +
> +  //   line comment
> +  ///  outer line doc
> +  //// line comment
> +
> +  /*   block comment   */
> +  /**  outer block doc */
> +  /*** block comment   */
> +
> +  mod block_doc_comments
> +  {
> +    /*   /* */  /** */  /*! */  */
> +    /*!  /* */  /** */  /*! */  */
> +    /**  /* */  /** */  /*! */  */
> +    mod item { }
> +  }
> +
> +  pub mod empty
> +  {
> +    //!
> +    /*!*/
> +    //
> +
> +    ///
> +    mod doc { }
> +    /**/
> +    /***/
> +  }
> +}
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
> new file mode 100644
> index 00000000000..9a1e090f330
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
> @@ -0,0 +1,2 @@
> +/* comment cr
>  is allowed */
> +pub fn main () { }
> diff --git a/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
> new file mode 100644
> index 00000000000..4e921a225c2
> --- /dev/null
> +++ b/gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
> @@ -0,0 +1,2 @@
> +// comment cr
>  is allowed
> +pub fn main () { }


Hi Mark,

This patch looks good to me. When I tried to apply it to merge it I got
the following:

```
$ git am  '[PATCH] Handle doc comment strings in lexer and parser.eml'
Applying: Handle doc comment strings in lexer and parser
error: corrupt patch at line 531
Patch failed at 0001 Handle doc comment strings in lexer and parser
hint: Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
```

Not sure if I have done something wrong, have you any pointers?

Thanks

--Phil




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Handle doc comment strings in lexer and parser
  2021-07-12  8:09 ` Philip Herron
@ 2021-07-12  8:32   ` Mark Wielaard
  2021-07-12 10:06     ` Philip Herron
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2021-07-12  8:32 UTC (permalink / raw)
  To: Philip Herron; +Cc: gcc-rust

Hi Philip,

On Mon, Jul 12, 2021 at 09:09:09AM +0100, Philip Herron wrote:
> This patch looks good to me. When I tried to apply it to merge it I got
> the following:
> 
> ```
> $ git am  '[PATCH] Handle doc comment strings in lexer and parser.eml'
> Applying: Handle doc comment strings in lexer and parser
> error: corrupt patch at line 531
> Patch failed at 0001 Handle doc comment strings in lexer and parser
> hint: Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> ```
> 
> Not sure if I have done something wrong, have you any pointers?

Looks like that is one of the IsolatedCR tests (a bare \r that is not
followed by \n in a doc comment string). I assume some mailer ate it
and/or added a \n somewhere to "correct" it.
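For reference, the rule those tests exercise can be sketched as a standalone Rust check (this is only an illustration of the rule, not the gccrs C++ lexer code): inside a doc comment a `\r` is only valid when immediately followed by `\n`.

```rust
// Sketch of the isolated-CR rule: a '\r' byte that is not immediately
// followed by '\n' is rejected in doc comments.
fn has_isolated_cr(text: &str) -> bool {
    let bytes = text.as_bytes();
    bytes
        .iter()
        .enumerate()
        .any(|(i, &b)| b == b'\r' && bytes.get(i + 1) != Some(&b'\n'))
}

fn main() {
    // CRLF line endings are accepted
    assert!(!has_isolated_cr("/// doc\r\ncomment\r\n"));
    // a bare CR mid-comment is the error case the tests cover
    assert!(has_isolated_cr("/// doc cr\rcomment\n"));
    println!("ok");
}
```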

Would you be able to pull directly from my git repo?

The following changes since commit 4560f469ee33536cec6af0f8e5816ff97de60de0:

  Merge #551 (2021-07-10 21:02:06 +0000)

are available in the Git repository at:

  https://code.wildebeest.org/git/user/mjw/gccrs doc-comments

for you to fetch changes up to e1e14958a90397a1ed6ab7236dc5a6f1c2f22505:

  Handle doc comment strings in lexer and parser (2021-07-11 21:09:21 +0200)

----------------------------------------------------------------
Mark Wielaard (1):
      Handle doc comment strings in lexer and parser

 gcc/rust/ast/rust-ast.h                            |   2 +
 gcc/rust/lex/rust-lex.cc                           | 214 +++++++++++++++++++--
 gcc/rust/lex/rust-token.h                          |  25 ++-
 gcc/rust/parse/rust-parse-impl.h                   |  60 +++++-
 gcc/rust/parse/rust-parse.h                        |   1 +
 gcc/testsuite/rust/compile/bad_inner_doc.rs        |  15 ++
 .../rust/compile/doc_isolated_cr_block_comment.rs  |   3 +
 .../compile/doc_isolated_cr_inner_block_comment.rs |   5 +
 .../compile/doc_isolated_cr_inner_line_comment.rs  |   5 +
 .../rust/compile/doc_isolated_cr_line_comment.rs   |   3 +
 .../compile/torture/all_doc_comment_line_blocks.rs |  47 +++++
 .../torture/all_doc_comment_line_blocks_crlf.rs    |  47 +++++
 .../compile/torture/isolated_cr_block_comment.rs   |   2 +
 .../compile/torture/isolated_cr_line_comment.rs    |   2 +
 14 files changed, 401 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/rust/compile/bad_inner_doc.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
 create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs



* Re: [PATCH] Handle doc comment strings in lexer and parser
  2021-07-12  8:32   ` Mark Wielaard
@ 2021-07-12 10:06     ` Philip Herron
  2021-07-12 22:44       ` New contributor tasks Mark Wielaard
  0 siblings, 1 reply; 10+ messages in thread
From: Philip Herron @ 2021-07-12 10:06 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: gcc-rust



On 12/07/2021 09:32, Mark Wielaard wrote:
> Hi Philip,
>
> On Mon, Jul 12, 2021 at 09:09:09AM +0100, Philip Herron wrote:
>> This patch looks good to me. When I tried to apply it to merge it I got
>> the following:
>>
>> ```
>> $ git am  '[PATCH] Handle doc comment strings in lexer and parser.eml'
>> Applying: Handle doc comment strings in lexer and parser
>> error: corrupt patch at line 531
>> Patch failed at 0001 Handle doc comment strings in lexer and parser
>> hint: Use 'git am --show-current-patch' to see the failed patch
>> When you have resolved this problem, run "git am --continue".
>> If you prefer to skip this patch, run "git am --skip" instead.
>> To restore the original branch and stop patching, run "git am --abort".
>> ```
>>
>> Not sure if I have done something wrong, have you any pointers?
> Looks like that is one of the IsolatedCR tests (a bare \r not at the
> end of a line followed by \n in a doc comment string). I assume some
> mailer ate it and/or added a \n somehwere to "correct" it.
>
> Would you be able to pull directly from my git repo?
>
> The following changes since commit 4560f469ee33536cec6af0f8e5816ff97de60de0:
>
>   Merge #551 (2021-07-10 21:02:06 +0000)
>
> are available in the Git repository at:
>
>   https://code.wildebeest.org/git/user/mjw/gccrs doc-comments
>
> for you to fetch changes up to e1e14958a90397a1ed6ab7236dc5a6f1c2f22505:
>
>   Handle doc comment strings in lexer and parser (2021-07-11 21:09:21 +0200)
>
> ----------------------------------------------------------------
> Mark Wielaard (1):
>       Handle doc comment strings in lexer and parser
>
>  gcc/rust/ast/rust-ast.h                            |   2 +
>  gcc/rust/lex/rust-lex.cc                           | 214 +++++++++++++++++++--
>  gcc/rust/lex/rust-token.h                          |  25 ++-
>  gcc/rust/parse/rust-parse-impl.h                   |  60 +++++-
>  gcc/rust/parse/rust-parse.h                        |   1 +
>  gcc/testsuite/rust/compile/bad_inner_doc.rs        |  15 ++
>  .../rust/compile/doc_isolated_cr_block_comment.rs  |   3 +
>  .../compile/doc_isolated_cr_inner_block_comment.rs |   5 +
>  .../compile/doc_isolated_cr_inner_line_comment.rs  |   5 +
>  .../rust/compile/doc_isolated_cr_line_comment.rs   |   3 +
>  .../compile/torture/all_doc_comment_line_blocks.rs |  47 +++++
>  .../torture/all_doc_comment_line_blocks_crlf.rs    |  47 +++++
>  .../compile/torture/isolated_cr_block_comment.rs   |   2 +
>  .../compile/torture/isolated_cr_line_comment.rs    |   2 +
>  14 files changed, 401 insertions(+), 30 deletions(-)
>  create mode 100644 gcc/testsuite/rust/compile/bad_inner_doc.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_inner_line_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/doc_isolated_cr_line_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/all_doc_comment_line_blocks_crlf.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_block_comment.rs
>  create mode 100644 gcc/testsuite/rust/compile/torture/isolated_cr_line_comment.rs
>
Thanks Mark, that worked; it's up now:
https://github.com/Rust-GCC/gccrs/pull/561

Great work once again. I am aiming to spend some time towards the end of
the week adding more tickets and info for new contributors to get
involved, and I will post the interesting ones to the mailing list as
well. I think it should be interesting to contributors of all levels.
The main one that sticks out in my mind is the AST and HIR dumps, which
are a bit of a mess at the moment.

Thanks

--Phil





* New contributor tasks
  2021-07-12 10:06     ` Philip Herron
@ 2021-07-12 22:44       ` Mark Wielaard
  2021-07-13 13:16         ` Philip Herron
  2021-07-13 13:30         ` Thomas Schwinge
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Wielaard @ 2021-07-12 22:44 UTC (permalink / raw)
  To: Philip Herron; +Cc: gcc-rust

On Mon, Jul 12, 2021 at 11:06:01AM +0100, Philip Herron wrote:
> Great work once again. I am aiming to spend some time towards the end of
> the week to add more tickets and info for new contributors to get
> involved, which I will post the interesting ones onto the mailing list
> as well. I think it should be interesting to contributors of all levels.
> The main one that sticks out in my mind is the AST, HIR dumps which are
> a bit of a mess at the moment.

The AST dump (--rust-dump-parse) was actually useful for checking the
doc comment strings, but it could certainly be improved. Ideally it
would be structured in a way that can easily be used in tests.

Some (random) notes I made on issues that might be nice to explain
and/or work on.

- Full unicode/utf8 support in the lexer. Currently the lexer only
  explicitly interprets the input as UTF8 for string parsing. It
  should really treat all input as UTF-8. gnulib has some handy
  modules we could use to read/convert from/to utf8 (unistr/u8-to-u32,
  unistr/u32-to-u8) and test various unicode properties
  (unictype/property-white-space, unictype/property-xid-continue,
  unictype/property-xid-start). I don't know if we can import those or
  if gcc already has these kinds of UTF-8/unicode support functions for
  other languages?
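  Purely to illustrate the decode step those unistr modules provide
  (u8-to-u32: bytes in, scalar values out), here is the same
  byte-versus-code-point distinction in Rust's standard library; the
  gccrs lexer itself is C++ and would go through gnulib or GCC's own
  routines:

```rust
fn main() {
    let src = "héllo"; // stored as UTF-8 bytes; 'é' needs two of them
    assert_eq!(src.len(), 6); // byte length of the input
    // the u8-to-u32 step: decode the byte stream into scalar values
    let scalars: Vec<u32> = src.chars().map(|c| c as u32).collect();
    assert_eq!(scalars.len(), 5); // but only five code points
    assert_eq!(scalars[1], 0xE9); // U+00E9 LATIN SMALL LETTER E WITH ACUTE
    println!("ok");
}
```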

- Error handling using rich locations in the lexer and parser.  It
  seems some support is already there, but it isn't totally clear to
  me what is already in place and what could/should be added. e.g. how
  to add notes to an Error.

- I noticed some expressions didn't parse because of what looks to me
  operator precedence issues. e.g the following:

  const S: usize = 64;

  pub fn main ()
  {
    let a:u8 = 1;
    let b:u8 = 2;
    let _c = S * a as usize + b as usize;
  }

  $ gcc/gccrs -Bgcc as.rs

  as.rs:7:27: error: type param bounds (in TraitObjectType) are not allowed as TypeNoBounds
    7 |   let _c = S * a as usize + b as usize;
      |                           ^

  How does one fix such operator precedence issues in the parser?
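  For comparison, rustc accepts this program: `as` binds tighter than
  the binary operators, so the expression evaluates as
  `S * (a as usize) + (b as usize)`:

```rust
const S: usize = 64;

fn main() {
    let a: u8 = 1;
    let b: u8 = 2;
    // `as` binds tighter than `*` and `+`, so this is
    // S * (a as usize) + (b as usize) = 64 * 1 + 2
    let c = S * a as usize + b as usize;
    assert_eq!(c, 66);
    println!("{}", c);
}
```

  The gccrs error suggests the type parser keeps consuming after
  `usize` and treats `+ b as usize` as trait bounds, rather than
  handing `+` back to the expression parser.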

- Related, TypeCastExprs like the one above aren't lowered from AST to
  HIR. I believe I know how to do it, but a small description of the
  visitor pattern used and of which files such lowering lives in would
  be helpful.

- And of course, how to lower HIR to GENERIC?  For TypeCastExpr you
  said on irc we need traits first, but the semantics for primitive
  types is actually spelled out in The Reference. Can we already
  handle them for primitive types (like in the above example having an
  u8 as usize)?
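  The Reference's rules for primitive numeric casts can be checked
  directly against rustc; a few representative cases:

```rust
// Numeric cast behaviour for primitive types as spelled out in The
// Reference: truncation when narrowing, zero- or sign-extension when
// widening, depending on the source type's signedness.
fn main() {
    assert_eq!(300u16 as u8, 44);    // narrowing truncates: 300 mod 256
    assert_eq!(-1i8 as u8, 255);     // same-width: bit pattern reinterpreted
    assert_eq!(-1i8 as i32, -1);     // widening signed: sign-extension
    assert_eq!(200u8 as usize, 200); // widening unsigned: zero-extension
    println!("ok");
}
```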

- rust-macro-expand tries to handle both macros and attributes; is
  this by design? Should we handle different passes for different
  (inert or not) attributes that run before or after macro expansion?

Cheers,

Mark



* Re: New contributor tasks
  2021-07-12 22:44       ` New contributor tasks Mark Wielaard
@ 2021-07-13 13:16         ` Philip Herron
       [not found]           ` <CADYxmzTdEH2pHba1+1nq5AXEQAyb6UhT8xvRKdWB7bu41ex1UA@mail.gmail.com>
  2021-07-13 13:30         ` Thomas Schwinge
  1 sibling, 1 reply; 10+ messages in thread
From: Philip Herron @ 2021-07-13 13:16 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: gcc-rust, simplytheother



On 12/07/2021 23:44, Mark Wielaard wrote:
> On Mon, Jul 12, 2021 at 11:06:01AM +0100, Philip Herron wrote:
>> Great work once again. I am aiming to spend some time towards the end of
>> the week to add more tickets and info for new contributors to get
>> involved, which I will post the interesting ones onto the mailing list
>> as well. I think it should be interesting to contributors of all levels.
>> The main one that sticks out in my mind is the AST, HIR dumps which are
>> a bit of a mess at the moment.
> The AST dump (--rust-dump-parse) was actually useful for checking the
> comment doc strings, but it could certainly be improved. Ideally it
> would be structured in a way that can easily be used in tests.
I think a really good project would be to update our HIR dump. It should
really use an S-expression format so we can emit the
Analysis::NodeMapping information in a way that looks good; at the
moment it's a mess.
> Some (random) notes I made on issues that might be nice to explain
> and/or work on.
>
> - Full unicode/utf8 support in the lexer. Currently the lexer only
>   explicitly interprets the input as UTF8 for string parseing. It
>   should really treat all input as UTF-8. gnulib has some handy
>   modules we could use to read/convert from/to utf8 (unistr/u8-to-u32,
>   unistr/u32-to-u8) and test various unicode properties
>   (unictype/property-white-space, unictype/property-xid-continue,
>   unictype/property-xid-start). I don't know if we can import those or
>   if gcc already has these kind of UTF-8/unicode support functions for
>   other languages?
GCCGO supports UTF-8 identifiers, but I think it has its own
implementation to do this. I think pulling in gnulib sounds like a good
idea; I assume we should ask about this on the GCC mailing list, but I
would prefer to reuse a library for UTF-8 support. The piece about
creating the strings in GENERIC will need to be updated as part of that
work.
> - Error handling using rich locations in the lexer and parser.  It
>   seems some support is already there, but it isn't totally clear to
>   me what is already in place and what could/should be added. e.g. how
>   to add notes to an Error.
I've made a wrapper over RichLocation, but I had some crashes when I
added methods for annotations. Overall my understanding is that the
Location we have at the moment is a single character location in the
source code, but rustc uses Spans, which might be an abstraction we
could think about implementing instead of the Location wrapper we are
reusing from GCCGO.
> - I noticed some expressions didn't parse because of what looks to me
>   operator precedence issues. e.g the following:
>
>   const S: usize = 64;
>
>   pub fn main ()
>   {
>     let a:u8 = 1;
>     let b:u8 = 2;
>     let _c = S * a as usize + b as usize;
>   }
>
>   $ gcc/gccrs -Bgcc as.rs
>
>   as.rs:7:27: error: type param bounds (in TraitObjectType) are not allowed as TypeNoBounds
>     7 |   let _c = S * a as usize + b as usize;
>       |                           ^
>
>   How does one fix such operator precedence issues in the parser?

Off the top of my head, it looks as though parse_type_cast_expr has a
FIXME for this precedence issue. The Pratt parser uses the notion of
binding powers to handle this, and I think it needs to follow a
similar style to the ::parse_expr piece.

> - Related, TypeCastExpr as the above aren't lowered from AST to HIR.
>   I believe I know how to do it, but a small description of the visitor
>   pattern used and in which files one does such lowering would be helpful.
The AST->HIR lowering does need some documentation, since it must go
through name resolution first, but there is no documentation on how any
of this works yet. I will put this on my todo list; it has come up a
few times, and the naming of some of the classes, like
ResolveItemToplevel vs ResolveItem, is confusing. Some of this will get
cleaned up as part of traits, such as the forward-declared-items-within-a-block bug:

Basically, the idea is that we always perform a toplevel scan for all
items and create long canonical names in the topmost scope, such that
we can resolve their names at any point without requiring prototypes or
lookahead. This means we have one pass to look for the names, and then
a pass to resolve each structure's fields, each function's parameters,
return types, and blocks of code. So if a block calls a function
declared further ahead, we can still resolve it to its NodeId. It is
when we ResolveItem that we push new contexts onto the stack to get
lexical scoping for names. It's worth noting that Rust also supports
shadowing of variables within a block, so these do not cause a
duplicate-name error; they simply add a new declaration to that context
(what rustc calls Ribs), such that further resolution will reference
this new declaration and the previous one is shadowed correctly.

> - And of course, how to lower HIR to GENERIC?  For TypeCastExpr you
>   said on irc we need traits first, but the semantics for primitive
>   types is actually spelled out in The Reference. Can we already
>   handle them for primitive types (like in the above example having an
>   u8 as usize)?
Lowering HIR to GENERIC documentation is on my todo list as well, though
there are a bunch of cleanups I have in progress which should also help
here.
> - rust-macro-expand tries to handle both macros and attributes, is
>   this by design?  Should we handle different passes for different
>   (inert or not) attributes that run before or after macro expansion?
As for macro and cfg expansion, Joel has some stuff already in place,
but I do think they need to be separated into distinct passes, which
would be a good first step with the expand folder.
>
> Cheers,
>
> Mark
>
Great summary mail; I think this sums up a lot of the common issues. Note
I added in Joel, who wrote the parser; he might be able to provide more insight.

I added some comments inline to each point. I think I can take away from
this that we are missing some useful pieces of architecture
documentation, which is becoming important. I think it will be easier for
me to get this done in a few weeks, as there are changes in the areas
referenced which will affect the documentation.

Overall I do really like the visitor pattern for this work, since it
isolates the code for each AST or HIR node, but it does make it more
difficult to follow the flow of the pipeline.

Sorry this does not contain all of the answers yet but I will work on
them. Thanks

--Phil

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 665 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: New contributor tasks
  2021-07-12 22:44       ` New contributor tasks Mark Wielaard
  2021-07-13 13:16         ` Philip Herron
@ 2021-07-13 13:30         ` Thomas Schwinge
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Schwinge @ 2021-07-13 13:30 UTC (permalink / raw)
  To: Mark Wielaard, Philip Herron, gcc-rust

Hi!

On 2021-07-13T00:44:13+0200, Mark Wielaard <mark@klomp.org> wrote:
> On Mon, Jul 12, 2021 at 11:06:01AM +0100, Philip Herron wrote:
>> The main one that sticks out in my mind is the AST, HIR dumps which are
>> a bit of a mess at the moment.
>
> The AST dump (--rust-dump-parse) was actually useful for checking the
> comment doc strings, but it could certainly be improved. Ideally it
> would be structured in a way that can easily be used in tests.

Right.  Already a while ago, I had run into the same (for a lexer-level
thing), and have early-stages WIP changes to implement dumps for the
several GCC/Rust front end stages using the (more or less) standard
'-fdump-lang-[...]' flag.  These dump files may then be scanned using the
usual GCC/DejaGnu testsuite idioms.  I plan to complete that work at some
later point in time, hopefully not too far out.  (Mark, I then actually
had planned to add some testcases for your recent lexer changes.)

(My work there is independent of/orthogonal to the S-expression dump
format discussed elsewhere.)


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Fwd: New contributor tasks
       [not found]           ` <CADYxmzTdEH2pHba1+1nq5AXEQAyb6UhT8xvRKdWB7bu41ex1UA@mail.gmail.com>
@ 2021-07-17 14:25             ` The Other
  2021-07-17 21:23               ` Mark Wielaard
  2021-07-18 20:45               ` Mark Wielaard
  0 siblings, 2 replies; 10+ messages in thread
From: The Other @ 2021-07-17 14:25 UTC (permalink / raw)
  To: mark; +Cc: gcc-rust

[-- Attachment #1: Type: text/plain, Size: 12175 bytes --]

Sorry, pressed the wrong button. I meant to "reply all".

---------- Forwarded message ---------
From: The Other <simplytheother@gmail.com>
Date: Sat, Jul 17, 2021 at 10:20 PM
Subject: Re: New contributor tasks
To: Philip Herron <philip.herron@embecosm.com>


> The AST dump (--rust-dump-parse) was actually useful for checking the
> comment doc strings, but it could certainly be improved. Ideally it
> would be structured in a way that can easily be used in tests.

Yes, I agree. It has its mismatched style because I originally intended it
to be basically "to_string" in the most literal sense possible, but then
realised this would be infeasible for some of the more complicated parts.
Theoretically, I would personally like it to be in a format similar to
clang's AST dump.

> - Full unicode/utf8 support in the lexer. Currently the lexer only
>   explicitly interprets the input as UTF8 for string parsing. It
>   should really treat all input as UTF-8. gnulib has some handy
>   modules we could use to read/convert from/to utf8 (unistr/u8-to-u32,
>   unistr/u32-to-u8) and test various unicode properties
>   (unictype/property-white-space, unictype/property-xid-continue,
>   unictype/property-xid-start). I don't know if we can import those or
>   if gcc already has these kind of UTF-8/unicode support functions for
>   other languages?

At the time of writing the lexer, I was under the impression that Rust only
supported UTF-8 in strings. The Rust Reference seems to have changed now to
show that it supports UTF-8 in identifiers as well. I believe that the C++
frontend, at least, has its own specific hardcoded UTF-8 handling for
identifiers and strings (rather than using a library).

There could be issues with lookahead of several bytes (which the lexer uses
liberally) if using UTF-8 in strings, depending on the exact implementation
of whatever library you use (or function you write).

>> - Error handling using rich locations in the lexer and parser.  It
>>   seems some support is already there, but it isn't totally clear to
>>   me what is already in place and what could/should be added. e.g. how
>>   to add notes to an Error.
> I've made a wrapper over RichLocation i had some crashes when i added
> methods for annotations. Overall my understanding is that a Location
> that we have at the moment is a single character location in the source
> code but Rustc uses Spans which might be an abstraction we could think
> about implementing instead of the Location wrapper we are reusing for
> GCCGO.

The Error class may need to be redesigned. It was a quick fix I made to
allow parse errors to be ignored (since macro expansion would cause parse
errors with non-matching macro matchers). Instead of having the
"emit_error" and "emit_fatal_error" methods, it may be better to instead
store a "kind" of error upon construction, and then just have an "emit"
method that will emit the type of error as specified.
Similarly, Error may have to be rewritten to use RichLocation instead of
Location or something.

>> - I noticed some expressions didn't parse because of what looks to me
>>   operator precedence issues. e.g the following:
>>
>>   const S: usize = 64;
>>
>>   pub fn main ()
>>   {
>>     let a:u8 = 1;
>>     let b:u8 = 2;
>>     let _c = S * a as usize + b as usize;
>>   }
>>
>>   $ gcc/gccrs -Bgcc as.rs
>>
>>   as.rs:7:27: error: type param bounds (in TraitObjectType) are not
allowed as TypeNoBounds
>>     7 |   let _c = S * a as usize + b as usize;
>>       |                           ^
>>
>>   How does one fix such operator precedence issues in the parser?

> Off the top of my head it looks as though the parse_type_cast_expr has a
> FIXME for the precedence issue for it. The Pratt parser uses the notion
> of binding powers to handle this and i think it needs to follow in a
> similar style to the ::parse_expr piece.

Yes, this is probably a precedence issue. The actual issue is that while
expressions have precedence, types (such as "usize", which is what is being
parsed) do not, and greedily parse tokens like "+". Additionally, the
interactions of types and expressions, and the precedence between them,
are something that I have no idea how to approach.
I believe that this specific issue could be fixed by modifying the
parse_type_no_bounds method - if instead of erroring when finding a plus,
it simply returned (treating it like an expression would treat a semicolon,
basically), then this would have the desired functionality. I don't believe
that parse_type_no_bounds (TypeNoBounds do not have '+' in them) would ever
be called in an instance where a Type (that allows bounds) is allowable, so
this change should hopefully not cause any correct programs to parse
incorrectly.

>> - rust-macro-expand tries to handle both macros and attributes, is
>>  this by design?  Should we handle different passes for different
>>  (inert or not) attributes that run before or after macro expansion?
> As for macro and cfg expansion, Joel has some stuff already in place,
> but I do think they need to be separated into distinct passes, which
> would be a good first step with the expand folder.

That is a good question. Technically, rust-macro-expand only handles cfg
expansion at the moment. You can read and discuss more about that here:
https://github.com/Rust-GCC/gccrs/issues/563

Thanks,
Joel



[-- Attachment #2: Type: text/html, Size: 14768 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fwd: New contributor tasks
  2021-07-17 14:25             ` Fwd: " The Other
@ 2021-07-17 21:23               ` Mark Wielaard
  2021-07-18 20:45               ` Mark Wielaard
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Wielaard @ 2021-07-17 21:23 UTC (permalink / raw)
  To: The Other; +Cc: gcc-rust

Hi Joel,

On Sat, Jul 17, 2021 at 10:25:48PM +0800, The Other wrote:
> > - Full unicode/utf8 support in the lexer. Currently the lexer only
> >   explicitly interprets the input as UTF8 for string parsing. It
> >   should really treat all input as UTF-8. gnulib has some handy
> >   modules we could use to read/convert from/to utf8 (unistr/u8-to-u32,
> >   unistr/u32-to-u8) and test various unicode properties
> >   (unictype/property-white-space, unictype/property-xid-continue,
> >   unictype/property-xid-start). I don't know if we can import those or
> >   if gcc already has these kind of UTF-8/unicode support functions for
> >   other languages?
> 
> At the time of writing the lexer, I was under the impression that Rust only
> supported UTF-8 in strings. The Rust Reference seems to have changed now to
> show that it supports UTF-8 in identifiers as well. I believe that the C++
> frontend, at least, has its own specific hardcoded UTF-8 handling for
> identifiers and strings (rather than using a library).
> 
> There could be issues with lookahead of several bytes (which the lexer uses
> liberally) if using UTF-8 in strings, depending on the exact implementation
> of whatever library you use (or function you write).

The whole source file should be valid UTF-8. You can use it in
comments too. And any invalid UTF-8 encoding means the file isn't a
valid Rust source file. So the simplest is to make the lexer handle
UTF-8 and handle one codepoint (UCS4/32bits) at a time. Lookahead then
also simply works per codepoint. We would still store strings as
UTF-8. gnulib contains various helpers to convert to/from utf-8/ucs4
and to test various unicode properties of codepoints. I'll ask on the
gcc mailinglist whether to use the C++ frontend support or import the
gnulib helpers.

> >> - rust-macro-expand tries to handle both macros and attributes, is
> >>  this by design?  Should we handle different passes for different
> >>  (inert or not) attributes that run before or after macro expansion?
> > As for macro and cfg expansion Joel some stuff already in place but i do
> > think they need to be separated into distinct passes which would be a
> > good first start with the expand folder.
> 
> That is a good question. Technically, rust-macro-expand only handles cfg
> expansion at the moment. You can read and discuss more about that here:
> https://github.com/Rust-GCC/gccrs/issues/563

I have to think about whether it makes sense to handle the cfg
attribute and the cfg! macro rules in the same pass/expansion. The
cfg! macro seems so simple it could be handled immediately by the
parser, since it only relies on the compiler/host attributes and simply
generates a true or false token.

In general it seems attribute expansion cannot simply be done by one
AttributeVisitor pass, because the effect can be at different stages of
parsing (and they can even affect what the lexer accepts,
e.g. whether identifiers as unicode strings are accepted). For example,
the various lint attributes can warn/error/etc. when lowering the final
AST (CamelCaseStructs for example), after type checking, or after
liveness analysis. So maybe we need to design a pass for each
different attribute and not try to combine them (except maybe to
recognize and validate the attribute syntax).

Cheers,

Mark

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fwd: New contributor tasks
  2021-07-17 14:25             ` Fwd: " The Other
  2021-07-17 21:23               ` Mark Wielaard
@ 2021-07-18 20:45               ` Mark Wielaard
  1 sibling, 0 replies; 10+ messages in thread
From: Mark Wielaard @ 2021-07-18 20:45 UTC (permalink / raw)
  To: The Other; +Cc: gcc-rust

[-- Attachment #1: Type: text/plain, Size: 3794 bytes --]

Hi Joel,

On Sat, Jul 17, 2021 at 10:25:48PM +0800, The Other wrote:
> >> - I noticed some expressions didn't parse because of what looks to me
> >>   operator precedence issues. e.g the following:
> >>
> >>   const S: usize = 64;
> >>
> >>   pub fn main ()
> >>   {
> >>     let a:u8 = 1;
> >>     let b:u8 = 2;
> >>     let _c = S * a as usize + b as usize;
> >>   }
> >>
> >>   $ gcc/gccrs -Bgcc as.rs
> >>
> >>   as.rs:7:27: error: type param bounds (in TraitObjectType) are not
> >>   allowed as TypeNoBounds
> >>     7 |   let _c = S * a as usize + b as usize;
> >>       |                           ^
> >>
> >>   How does one fix such operator precedence issues in the parser?
> 
> > Off the top of my head it looks as though the parse_type_cast_expr has a
> > FIXME for the precedence issue for it. The Pratt parser uses the notion
> > of binding powers to handle this and i think it needs to follow in a
> > similar style to the ::parse_expr piece.
> 
> Yes, this is probably a precedence issue. The actual issue is that while
> expressions have precedence, types (such as "usize", which is what is being
> parsed) do not, and greedily parse tokens like "+". Additionally, the
> interactions of types and expressions and precedence between them is
> something that I have no idea how to approach.
> I believe that this specific issue could be fixed by modifying the
> parse_type_no_bounds method - if instead of erroring when finding a plus,
> it simply returned (treating it like an expression would treat a semicolon,
> basically), then this would have the desired functionality. I don't believe
> that parse_type_no_bounds (TypeNoBounds do not have '+' in them) would ever
> be called in an instance where a Type (that allows bounds) is allowable, so
> this change should hopefully not cause any correct programs to parse
> incorrectly.

I think you are correct. The issue is that parse_type_no_bounds tries
to be helpful and greedily looks for a PLUS so it can produce an
error. Simply removing that case makes things parse.

Patch attached and also here:
https://code.wildebeest.org/git/user/mjw/gccrs/commit/?h=as-type

This cannot be fully tested yet, because as Cast Expressions aren't
lowered from AST to HIR yet. I didn't get very far trying to lower the
CastExpr to HIR. This is what I came up with. But I didn't know how to
handle the type path yet.

diff --git a/gcc/rust/hir/rust-ast-lower-expr.h b/gcc/rust/hir/rust-ast-lower-expr.h
index 19ce8c2cf1f..96f6073cd86 100644
--- a/gcc/rust/hir/rust-ast-lower-expr.h
+++ b/gcc/rust/hir/rust-ast-lower-expr.h
@@ -405,6 +405,24 @@ public:
                               expr.get_locus ());
   }
 
+  void visit (AST::TypeCastExpr &expr) override
+  {
+    HIR::Expr *expr_to_cast_to
+      = ASTLoweringExpr::translate (expr.get_casted_expr ().get ());
+
+    HIR::TypeNoBounds *type_to_cast_to
+      = nullptr; /* ... (expr._get_type_to_cast_to ().get ())); */
+
+    auto crate_num = mappings->get_current_crate ();
+    Analysis::NodeMapping mapping (crate_num, expr.get_node_id (),
+                                  mappings->get_next_hir_id (crate_num),
+                                  UNKNOWN_LOCAL_DEFID);
+
+    translated = new HIR::TypeCastExpr (
+      mapping, std::unique_ptr<HIR::Expr> (expr_to_cast_to),
+      std::unique_ptr<HIR::TypeNoBounds> (type_to_cast_to), expr.get_locus ());
+  }
+
   /* Compound assignment expression is compiled away. */
   void visit (AST::CompoundAssignmentExpr &expr) override
   {

It does get us a little bit further into the type checker:

as2.rs:7:12: error: failed to type resolve expression
    7 |   let _c = a as usize + b as usize;
      |            ^
as2.rs:7:25: error: failed to type resolve expression
    7 |   let _c = a as usize + b as usize;


Cheers,

Mark

[-- Attachment #2: 0001-Remove-error-handling-in-parse_type_no_bounds-for-PL.patch --]
[-- Type: text/x-diff, Size: 1344 bytes --]

From 4c92de44cde1bdd8d0fcb8a19adafd529d6c759c Mon Sep 17 00:00:00 2001
From: Mark Wielaard <mark@klomp.org>
Date: Sun, 18 Jul 2021 22:12:20 +0200
Subject: [PATCH] Remove error handling in parse_type_no_bounds for PLUS token

parse_type_no_bounds tries to be helpful and greedily looks for a PLUS
token after having parsed a typepath so it can produce an error. But
that error breaks parsing expressions that contain "as" Cast
Expressions like "a as usize + b as usize". Drop the explicit error on
seeing a PLUS token and just return the type path parsed.
---
 gcc/rust/parse/rust-parse-impl.h | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/gcc/rust/parse/rust-parse-impl.h b/gcc/rust/parse/rust-parse-impl.h
index eedc76db43e..a0607926950 100644
--- a/gcc/rust/parse/rust-parse-impl.h
+++ b/gcc/rust/parse/rust-parse-impl.h
@@ -9996,13 +9996,6 @@ Parser<ManagedTokenSource>::parse_type_no_bounds ()
 				       std::move (tok_tree)),
 		  {}, locus));
 	    }
-	  case PLUS:
-	    // type param bounds - not allowed, here for error message
-	    add_error (Error (t->get_locus (),
-			      "type param bounds (in TraitObjectType) are not "
-			      "allowed as TypeNoBounds"));
-
-	    return nullptr;
 	  default:
 	    // assume that this is a type path and not an error
 	    return std::unique_ptr<AST::TypePath> (
-- 
2.32.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-07-18 20:45 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-11 20:10 [PATCH] Handle doc comment strings in lexer and parser Mark Wielaard
2021-07-12  8:09 ` Philip Herron
2021-07-12  8:32   ` Mark Wielaard
2021-07-12 10:06     ` Philip Herron
2021-07-12 22:44       ` New contributor tasks Mark Wielaard
2021-07-13 13:16         ` Philip Herron
     [not found]           ` <CADYxmzTdEH2pHba1+1nq5AXEQAyb6UhT8xvRKdWB7bu41ex1UA@mail.gmail.com>
2021-07-17 14:25             ` Fwd: " The Other
2021-07-17 21:23               ` Mark Wielaard
2021-07-18 20:45               ` Mark Wielaard
2021-07-13 13:30         ` Thomas Schwinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).