public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Better error messages for merge-conflict markers (v3)
@ 2015-10-30 15:02 David Malcolm
  2015-11-02 22:52 ` Jeff Law
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: David Malcolm @ 2015-10-30 15:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is a rebased version of this patch from back in April:
  v2: https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00944.html
which in turn is a rewrite of this one:
  v1: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01087.html

The idea is to more gracefully handle merger conflict markers
in the source code being compiled.  Specifically, in the C and
C++ frontends, if we're about to emit an error, see if the
source code is at a merger conflict marker, and if so, emit
a more specific message, so that the user gets this:

foo.c:1:1: error: source file contains patch conflict marker
 <<<<<<< HEAD
 ^

rather than this gobbledegook:

foo.c:1:1: error: expected identifier or '(' before '<<' token
 <<<<<<< HEAD
 ^

It's something of a "fit and finish" cosmetic item, but these
things add up.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu;
adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.

OK for trunk?

This implementation works by looking at the token stream, which
is slightly clunky; an alternative way of doing it would be to directly
look at the line the error occurs in and to see if it begins with a
conflict marker.  Would that be preferable?

gcc/c-family/ChangeLog:
	* c-common.h (conflict_marker_get_final_tok_kind): New prototype.
	* c-lex.c (conflict_marker_get_final_tok_kind): New function.

gcc/c/ChangeLog:
	* c-parser.c (struct c_parser): Expand array "tokens_buf" from 2
	to 4.
	(c_parser_peek_nth_token): New function.
	(c_parser_peek_conflict_marker): New function.
	(c_parser_error): Detect patch conflict markers and report them as
	such.

gcc/cp/ChangeLog:
	* parser.c (cp_lexer_peek_conflict_marker): New function.
	(cp_parser_error): Detect patch conflict markers and report them
	as such.

gcc/testsuite/ChangeLog:
	* c-c++-common/patch-conflict-markers-1.c: New testcase.
	* c-c++-common/patch-conflict-markers-2.c: Likewise.
	* c-c++-common/patch-conflict-markers-3.c: Likewise.
	* c-c++-common/patch-conflict-markers-4.c: Likewise.
	* c-c++-common/patch-conflict-markers-5.c: Likewise.
	* c-c++-common/patch-conflict-markers-6.c: Likewise.
	* c-c++-common/patch-conflict-markers-7.c: Likewise.
	* c-c++-common/patch-conflict-markers-8.c: Likewise.
	* c-c++-common/patch-conflict-markers-9.c: Likewise.
	* g++.dg/patch-conflict-markers-1.C: Likewise.
---
 gcc/c-family/c-common.h                            |  4 ++
 gcc/c-family/c-lex.c                               | 26 +++++++++++
 gcc/c/c-parser.c                                   | 53 +++++++++++++++++++++-
 gcc/cp/parser.c                                    | 34 ++++++++++++++
 .../c-c++-common/patch-conflict-markers-1.c        |  9 ++++
 .../c-c++-common/patch-conflict-markers-2.c        |  2 +
 .../c-c++-common/patch-conflict-markers-3.c        | 11 +++++
 .../c-c++-common/patch-conflict-markers-4.c        | 11 +++++
 .../c-c++-common/patch-conflict-markers-5.c        | 11 +++++
 .../c-c++-common/patch-conflict-markers-6.c        | 38 ++++++++++++++++
 .../c-c++-common/patch-conflict-markers-7.c        |  6 +++
 .../c-c++-common/patch-conflict-markers-8.c        |  4 ++
 .../c-c++-common/patch-conflict-markers-9.c        |  8 ++++
 gcc/testsuite/g++.dg/patch-conflict-markers-1.C    | 13 ++++++
 14 files changed, 228 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
 create mode 100644 gcc/testsuite/g++.dg/patch-conflict-markers-1.C

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 4b5cac8..7ca9497 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1090,6 +1090,10 @@ extern void c_genericize (tree);
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
+/* In c-lex.c.  */
+extern enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind);
+
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index c69f4a6..6fc63eb 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1269,3 +1269,29 @@ lex_charconst (const cpp_token *token)
 
   return value;
 }
+
+/* Helper function for c_parser_peek_conflict_marker
+   and cp_lexer_peek_conflict_marker.
+   Given a possible patch conflict marker token of kind TOK1_KIND
+   consisting of a pair of characters, get the token kind for the
+   standalone final character.  */
+
+enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind)
+{
+  switch (tok1_kind)
+    {
+    default: gcc_unreachable ();
+    case CPP_LSHIFT:
+      /* "<<" and '<' */
+      return CPP_LESS;
+
+    case CPP_EQ_EQ:
+      /* "==" and '=' */
+      return CPP_EQ;
+
+    case CPP_RSHIFT:
+      /* ">>" and '>' */
+      return CPP_GREATER;
+    }
+}
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index c8c6a2d..a2d7ef6 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -181,8 +181,8 @@ struct GTY(()) c_parser {
   /* The look-ahead tokens.  */
   c_token * GTY((skip)) tokens;
   /* Buffer for look-ahead tokens.  */
-  c_token tokens_buf[2];
-  /* How many look-ahead tokens are available (0, 1 or 2, or
+  c_token tokens_buf[4];
+  /* How many look-ahead tokens are available (0 - 4, or
      more if parsing from pre-lexed tokens).  */
   unsigned int tokens_avail;
   /* True if a syntax error is being recovered from; false otherwise.
@@ -471,6 +471,20 @@ c_parser_peek_2nd_token (c_parser *parser)
   return &parser->tokens[1];
 }
 
+/* Return a pointer to the Nth token from PARSER, reading it
+   in if necessary.  The N-1th token is already read in.  */
+
+static c_token *
+c_parser_peek_nth_token (c_parser *parser, unsigned int n)
+{
+  if (parser->tokens_avail >= n)
+    return &parser->tokens[n - 1];
+  gcc_assert (parser->tokens_avail == n - 1);
+  c_lex_one_token (parser, &parser->tokens[n - 1]);
+  parser->tokens_avail = n;
+  return &parser->tokens[n - 1];
+}
+
 /* Return true if TOKEN can start a type name,
    false otherwise.  */
 static bool
@@ -808,6 +822,30 @@ c_parser_set_source_position_from_token (c_token *token)
     }
 }
 
+/* Helper function for c_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a patch conflict marker, peek successor tokens to determine
+   if we actually do have a patch conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   as a patch conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT ("<<") and a CPP_LESS ('<').  */
+
+static bool
+c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind)
+{
+  c_token *token2 = c_parser_peek_2nd_token (parser);
+  if (token2->type != tok1_kind)
+    return false;
+  c_token *token3 = c_parser_peek_nth_token (parser, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  c_token *token4 = c_parser_peek_nth_token (parser, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+  return true;
+}
+
 /* Issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream of PARSER.
@@ -829,6 +867,17 @@ c_parser_error (c_parser *parser, const char *gmsgid)
   parser->error = true;
   if (!gmsgid)
     return;
+
+  /* If this is actually a patch conflict marker, report it as such.  */
+  if (token->type == CPP_LSHIFT
+      || token->type == CPP_RSHIFT
+      || token->type == CPP_EQ_EQ)
+    if (c_parser_peek_conflict_marker (parser, token->type))
+      {
+	error_at (token->location, "source file contains patch conflict marker");
+	return;
+      }
+
   /* This diagnostic makes more sense if it is tagged to the line of
      the token we just peeked at.  */
   c_parser_set_source_position_from_token (token);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 7555bf3..6b6bd29 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2632,6 +2632,30 @@ cp_parser_is_keyword (cp_token* token, enum rid keyword)
   return token->keyword == keyword;
 }
 
+/* Helper function for cp_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a patch conflict marker, peek successor tokens to determine
+   if we actually do have a patch conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   as a patch conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT tokens ("<<") and a CPP_LESS token ('<').  */
+
+static bool
+cp_lexer_peek_conflict_marker (cp_lexer *lexer, enum cpp_ttype tok1_kind)
+{
+  cp_token *token2 = cp_lexer_peek_nth_token (lexer, 2);
+  if (token2->type != tok1_kind)
+    return false;
+  cp_token *token3 = cp_lexer_peek_nth_token (lexer, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  cp_token *token4 = cp_lexer_peek_nth_token (lexer, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+  return true;
+}
+
 /* If not parsing tentatively, issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream.  MESSAGE
@@ -2656,6 +2680,16 @@ cp_parser_error (cp_parser* parser, const char* gmsgid)
 	  return;
 	}
 
+      /* If this is actually a patch conflict marker, report it as such.  */
+      if (token->type == CPP_LSHIFT
+          || token->type == CPP_RSHIFT
+          || token->type == CPP_EQ_EQ)
+        if (cp_lexer_peek_conflict_marker (parser->lexer, token->type))
+          {
+            error_at (token->location, "source file contains patch conflict marker");
+            return;
+          }
+
       c_parse_error (gmsgid,
 		     /* Because c_parser_error does not understand
 			CPP_KEYWORD, keywords are treated like
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
new file mode 100644
index 0000000..71e9fa7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
@@ -0,0 +1,9 @@
+int p;
+
+<<<<<<< HEAD /* { dg-error "patch conflict marker" } */
+extern int some_var;
+=======      /* { dg-error "patch conflict marker" } */
+extern short some_var; /* this line would lead to a warning */
+>>>>>>> Some commit message  /* { dg-error "patch conflict marker" } */
+
+int q;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
new file mode 100644
index 0000000..79030ee
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
@@ -0,0 +1,2 @@
+/* This should not be flagged as a patch conflict marker.  */
+const char *msg = "<<<<<<< ";
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
new file mode 100644
index 0000000..be956b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle unterminated conflict markers.  */
+
+int p;
+
+<<<<<<< HEAD  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+<<<<<<< HEAD  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
new file mode 100644
index 0000000..ec3730c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+>>>>>>> Some commit message  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+>>>>>>> Some other commit message  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
new file mode 100644
index 0000000..816a97e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+=======  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+=======  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
new file mode 100644
index 0000000..74ea2d5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
@@ -0,0 +1,38 @@
+/* Branch coverage of patch conflict marker detection:
+   none of these should be reported as patch conflict markers.  */
+
+int a0;
+
+<< HEAD  /* { dg-error "expected" } */
+
+int a1;
+
+<<<< HEAD  /* { dg-error "expected" } */
+
+int a2;
+
+<<<<<< HEAD  /* { dg-error "expected" } */
+
+int b0;
+
+== HEAD  /* { dg-error "expected" } */
+
+int b1;
+
+==== HEAD  /* { dg-error "expected" } */
+
+int b2;
+
+====== HEAD  /* { dg-error "expected" } */
+
+int c0;
+
+>> HEAD  /* { dg-error "expected" } */
+
+int c1;
+
+>>>> HEAD  /* { dg-error "expected" } */
+
+int c2;
+
+>>>>>> HEAD  /* { dg-error "expected" } */
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
new file mode 100644
index 0000000..2b5d4e6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
@@ -0,0 +1,6 @@
+/* It's valid to stringize the "<<<<<<<"; don't
+   report it as a patch conflict marker.  */
+#define str(s) #s
+const char *s = str(
+<<<<<<<
+);
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
new file mode 100644
index 0000000..90d75b0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
@@ -0,0 +1,4 @@
+/* A macro that's never expanded shouldn't be reported as a patch
+   conflict marker.  */
+#define foo \
+<<<<<<<
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
new file mode 100644
index 0000000..5c1e663
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
@@ -0,0 +1,8 @@
+/* It's valid to have
+<<<<<<<
+   inside both
+   comments (as above), and within string literals.  */
+const char *s = "\
+<<<<<<<";
+
+/* The above shouldn't be reported as errors.  */
diff --git a/gcc/testsuite/g++.dg/patch-conflict-markers-1.C b/gcc/testsuite/g++.dg/patch-conflict-markers-1.C
new file mode 100644
index 0000000..ae19193
--- /dev/null
+++ b/gcc/testsuite/g++.dg/patch-conflict-markers-1.C
@@ -0,0 +1,13 @@
+/* Ensure that we don't complain about patch conflict markers on
+   valid template argument lists, valid in C++11 onwards.  */
+// { dg-options "-std=c++11" }
+
+template <typename T>
+struct foo
+{
+  T t;
+};
+
+foo <foo <foo <foo <foo <foo <foo <int
+>>>>>>> f;
+// The above line is valid C++11, and isn't a patch conflict marker
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error messages for merge-conflict markers (v3)
  2015-10-30 15:02 [PATCH] Better error messages for merge-conflict markers (v3) David Malcolm
@ 2015-11-02 22:52 ` Jeff Law
  2015-11-03  4:05   ` Trevor Saunders
  2015-11-04 13:56 ` Bernd Schmidt
  2015-12-09 23:50 ` [PATCH] Better error messages for merge-conflict markers (v3) Martin Sebor
  2 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2015-11-02 22:52 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/30/2015 09:16 AM, David Malcolm wrote:
> This is a rebased version of this patch from back in April:
>    v2: https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00944.html
> which in turn is a rewrite of this one:
>    v1: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01087.html
>
> The idea is to more gracefully handle merger conflict markers
> in the source code being compiled.  Specifically, in the C and
> C++ frontends, if we're about to emit an error, see if the
> source code is at a merger conflict marker, and if so, emit
> a more specific message, so that the user gets this:
>
> foo.c:1:1: error: source file contains patch conflict marker
>   <<<<<<< HEAD
>   ^
>
> rather than this gobbledegook:
>
> foo.c:1:1: error: expected identifier or '(' before '<<' token
>   <<<<<<< HEAD
>   ^
>
> It's something of a "fit and finish" cosmetic item, but these
> things add up.
>
> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu;
> adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
>
> OK for trunk?
>
> This implementation works by looking at the token stream, which
> is slightly clunky; an alternative way of doing it would be to directly
> look at the line the error occurs in and to see if it begins with a
> conflict marker.  Would that be preferable?
>
> gcc/c-family/ChangeLog:
> 	* c-common.h (conflict_marker_get_final_tok_kind): New prototype.
> 	* c-lex.c (conflict_marker_get_final_tok_kind): New function.
>
> gcc/c/ChangeLog:
> 	* c-parser.c (struct c_parser): Expand array "tokens_buf" from 2
> 	to 4.
> 	(c_parser_peek_nth_token): New function.
> 	(c_parser_peek_conflict_marker): New function.
> 	(c_parser_error): Detect patch conflict markers and report them as
> 	such.
>
> gcc/cp/ChangeLog:
> 	* parser.c (cp_lexer_peek_conflict_marker): New function.
> 	(cp_parser_error): Detect patch conflict markers and report them
> 	as such.
I'm still having a hard time seeing this one as all that useful.  It's 
not like it's terribly hard to see the <<<<<< HEAD and realize that 
there's a conflict marker mucking things up.

I'm willing to step aside if other folks thing this is really useful and 
want to review the changes.

jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error messages for merge-conflict markers (v3)
  2015-11-02 22:52 ` Jeff Law
@ 2015-11-03  4:05   ` Trevor Saunders
  0 siblings, 0 replies; 12+ messages in thread
From: Trevor Saunders @ 2015-11-03  4:05 UTC (permalink / raw)
  To: Jeff Law; +Cc: David Malcolm, gcc-patches

On Mon, Nov 02, 2015 at 03:52:13PM -0700, Jeff Law wrote:
> On 10/30/2015 09:16 AM, David Malcolm wrote:
> >This is a rebased version of this patch from back in April:
> >   v2: https://gcc.gnu.org/ml/gcc-patches/2015-04/msg00944.html
> >which in turn is a rewrite of this one:
> >   v1: https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01087.html
> >
> >The idea is to more gracefully handle merger conflict markers
> >in the source code being compiled.  Specifically, in the C and
> >C++ frontends, if we're about to emit an error, see if the
> >source code is at a merger conflict marker, and if so, emit
> >a more specific message, so that the user gets this:
> >
> >foo.c:1:1: error: source file contains patch conflict marker
> >  <<<<<<< HEAD
> >  ^
> >
> >rather than this gobbledegook:
> >
> >foo.c:1:1: error: expected identifier or '(' before '<<' token
> >  <<<<<<< HEAD
> >  ^
> >
> >It's something of a "fit and finish" cosmetic item, but these
> >things add up.
> >
> >Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu;
> >adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
> >
> >OK for trunk?
> >
> >This implementation works by looking at the token stream, which
> >is slightly clunky; an alternative way of doing it would be to directly
> >look at the line the error occurs in and to see if it begins with a
> >conflict marker.  Would that be preferable?
> >
> >gcc/c-family/ChangeLog:
> >	* c-common.h (conflict_marker_get_final_tok_kind): New prototype.
> >	* c-lex.c (conflict_marker_get_final_tok_kind): New function.
> >
> >gcc/c/ChangeLog:
> >	* c-parser.c (struct c_parser): Expand array "tokens_buf" from 2
> >	to 4.
> >	(c_parser_peek_nth_token): New function.
> >	(c_parser_peek_conflict_marker): New function.
> >	(c_parser_error): Detect patch conflict markers and report them as
> >	such.
> >
> >gcc/cp/ChangeLog:
> >	* parser.c (cp_lexer_peek_conflict_marker): New function.
> >	(cp_parser_error): Detect patch conflict markers and report them
> >	as such.
> I'm still having a hard time seeing this one as all that useful.  It's not
> like it's terribly hard to see the <<<<<< HEAD and realize that there's a
> conflict marker mucking things up.

Its probably not that hard, but I do know people who really like this
sort of thing.  I would think it also becomes more useful when you have
fix it hints, or even allow the compiler to try to fix and continue.
Apparently clang does that?  That does actually sound useful if I can
avoid thinking about having left the conflict marker at all and have
tools deal with it for me.  So unless this is a lot of code which I
doubt it seems worth while to me.

Trev

> 
> I'm willing to step aside if other folks thing this is really useful and
> want to review the changes.
> 
> jeff
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error messages for merge-conflict markers (v3)
  2015-10-30 15:02 [PATCH] Better error messages for merge-conflict markers (v3) David Malcolm
  2015-11-02 22:52 ` Jeff Law
@ 2015-11-04 13:56 ` Bernd Schmidt
  2015-12-09 16:39   ` [PATCH] Better error recovery for merge-conflict markers (v4) David Malcolm
  2015-12-15 19:11   ` [PATCH] Better error recovery for merge-conflict markers (v5) David Malcolm
  2015-12-09 23:50 ` [PATCH] Better error messages for merge-conflict markers (v3) Martin Sebor
  2 siblings, 2 replies; 12+ messages in thread
From: Bernd Schmidt @ 2015-11-04 13:56 UTC (permalink / raw)
  To: David Malcolm, gcc-patches, Joseph Myers, Jason Merrill

On 10/30/2015 04:16 PM, David Malcolm wrote:
> The idea is to more gracefully handle merger conflict markers
> in the source code being compiled.  Specifically, in the C and
> C++ frontends, if we're about to emit an error, see if the
> source code is at a merger conflict marker, and if so, emit
> a more specific message, so that the user gets this:
>
> foo.c:1:1: error: source file contains patch conflict marker
>   <<<<<<< HEAD
>   ^
>
> It's something of a "fit and finish" cosmetic item, but these
> things add up.

This seems like fairly low impact but also low cost, so I'm fine with it 
in principle. I wonder whether the length of the marker is the same 
across all versions of patch (and VC tools)?

> +static bool
> +c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind)
> +{
> +  c_token *token2 = c_parser_peek_2nd_token (parser);
> +  if (token2->type != tok1_kind)
> +    return false;
> +  c_token *token3 = c_parser_peek_nth_token (parser, 3);
> +  if (token3->type != tok1_kind)
> +    return false;
> +  c_token *token4 = c_parser_peek_nth_token (parser, 4);
> +  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
> +    return false;
> +  return true;
> +}

Just thinking out loud - I guess it would be too much to hope for to 
share lexers between frontends so that we need only one copy of this?

> +extern short some_var; /* this line would lead to a warning */

Would or does? I don't see anything suppressing it?

There seems to be no testcase verifying what happens if the marker is 
not at the start of the line (IMO it should not be interpreted as a marker).

It would be good to have buy-in from the frontend maintainers (Joseph 
commented on v1 and as far as I can see you've addressed his feedback). 
If you do not hear back from them by the end of the week, I'll approve 
it if the start-of-line thing is sorted.


Bernd

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Better error recovery for merge-conflict markers (v4)
  2015-11-04 13:56 ` Bernd Schmidt
@ 2015-12-09 16:39   ` David Malcolm
  2015-12-09 17:44     ` Bernd Schmidt
  2015-12-15 19:11   ` [PATCH] Better error recovery for merge-conflict markers (v5) David Malcolm
  1 sibling, 1 reply; 12+ messages in thread
From: David Malcolm @ 2015-12-09 16:39 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Joseph Myers, Jason Merrill, David Malcolm

On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote:
> On 10/30/2015 04:16 PM, David Malcolm wrote:
> > The idea is to more gracefully handle merger conflict markers
> > in the source code being compiled.  Specifically, in the C and
> > C++ frontends, if we're about to emit an error, see if the
> > source code is at a merger conflict marker, and if so, emit
> > a more specific message, so that the user gets this:
> >
> > foo.c:1:1: error: source file contains patch conflict marker
> >   <<<<<<< HEAD
> >   ^
> >
> > It's something of a "fit and finish" cosmetic item, but these
> > things add up.
>
> This seems like fairly low impact but also low cost, so I'm fine with it
> in principle. I wonder whether the length of the marker is the same
> across all versions of patch (and VC tools)?

It's hardcoded for GNU patch:
  http://git.savannah.gnu.org/cgit/patch.git/tree/src/merge.c
which hardcodes e.g.:
	  fputs (outstate->after_newline + "\n<<<<<<<\n", fp);

I don't know if it's hardcoded for CVS or Subversion, but both have
documentation showing that format:
 ftp://ftp.gnu.org/old-gnu/Manuals/cvs/html_node/cvs_38.html
 http://svnbook.red-bean.com/en/1.7/svn.tour.cycle.html#svn.tour.cycle.resolve

It's the default of git:
  http://git.kernel.org/cgit/git/git.git/tree/Documentation/merge-config.txt
(config option "merge.conflictStyle")
This git commit (to git) seems to have generalized it to support a
"conflict-marker-size" attribute:
  https://github.com/git/git/commit/8588567c96490b8d236b1bc13f9bcb0dfa118efe

Mercurial uses them; the format appears to be a keyword-argument in:
  https://selenic.com/hg/file/tip/mercurial/simplemerge.py#l91
but it's hardcoded in this regex in filemerge.py:
        if re.search("^(<<<<<<< .*|=======|>>>>>>> .*)$", fcd.data(),

Bazaar uses them; see e.g.:
  http://bazaar.launchpad.net/~bzr-pqm/bzr/bzr.dev/view/head:/bzrlib/tests/test_merge3.py
(I couldn't easily tell if they're configurable)

FWIW, Perforce appears to use a different format;
http://www.perforce.com/perforce/doc.current/manuals/p4guide/chapter.resolve.html
has an example showing:

>>>> ORIGINAL file#n
(text from the original version)
==== THEIR file#m
(text from their file)
==== YOURS file
(text from your file)
<<<<

From what I can tell, Perforce is the outlier here.

> > +static bool
> > +c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind)
> > +{
> > +  c_token *token2 = c_parser_peek_2nd_token (parser);
> > +  if (token2->type != tok1_kind)
> > +    return false;
> > +  c_token *token3 = c_parser_peek_nth_token (parser, 3);
> > +  if (token3->type != tok1_kind)
> > +    return false;
> > +  c_token *token4 = c_parser_peek_nth_token (parser, 4);
> > +  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
> > +    return false;
> > +  return true;
> > +}
>
> Just thinking out loud - I guess it would be too much to hope for to
> share lexers between frontends so that we need only one copy of this?

Probably :(

> > +extern short some_var; /* this line would lead to a warning */
>
> Would or does? I don't see anything suppressing it?

It's skipped in error-handling.

c_parser_declaration_or_fndef has:
1794	      declarator = c_parser_declarator (parser,
1795						specs->typespec_kind != ctsk_none,
1796						C_DTR_NORMAL, &dummy);
1797	      if (declarator == NULL)
1798		{
[...snip...]
1807		  c_parser_skip_to_end_of_block_or_statement (parser);

The call to c_parser_declarator fails, and when issuing:
3465		  c_parser_error (parser, "expected identifier or %<(%>");
we emit the "conflict marker" wording error for the error.

Then at line 1807 we skip, discarding everything up to the ";" in that
decl.

Would a better wording be:

extern short some_var; /* This line would lead to a warning due to the
                          duplicate name, but it is skipped when handling
                          the conflict marker.  */

> There seems to be no testcase verifying what happens if the marker is
> not at the start of the line (IMO it should not be interpreted as a marker).

The v3 patch actually reported them as markers regardless of
location.
The v4 patch now verifies that they are at the start of the line;
I've added test coverage for this (patch-conflict-markers-11.c).

That said, it's not clear they're always at the beginning of a line;
this bazaar bug indicates that CVS (and bazaar) can emit them
mid-line:
  https://bugs.launchpad.net/bzr/+bug/36399

I noticed a visual glitch with the v3 patch now that we have range
information for tokens:  with caret-printing, we get just the first
token within the marker underlined:

 <<<<<<< HEAD
 ^~

which looks strange (especially with the underlined chars colorized).

Hence in the v4 patch I've added a location tweak so that it
underline/colorizes *all* of the marker:

 <<<<<<< HEAD
 ^~~~~~~

Wording-wise, should it be "merge conflict marker", rather
than "patch conflict marker"?

Clang spells it:
"error: version control conflict marker in file"
http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts

Maybe I should simply use that wording?

> It would be good to have buy-in from the frontend maintainers (Joseph
> commented on v1 and as far as I can see you've addressed his feedback).
> If you do not hear back from them by the end of the week, I'll approve
> it if the start-of-line thing is sorted.

(clearly over a week by now; I got bogged down in the C++ FE
expression ranges; sorry).

> Bernd

Rebased on top of r231445 (from yesterday).
Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.

OK for trunk?

gcc/c-family/ChangeLog:
	* c-common.h (conflict_marker_get_final_tok_kind): New prototype.
	* c-lex.c (conflict_marker_get_final_tok_kind): New function.

gcc/c/ChangeLog:
	* c-parser.c (struct c_parser): Expand array "tokens_buf" from 2
	to 4.
	(c_parser_peek_nth_token): New function.
	(c_parser_peek_conflict_marker): New function.
	(c_parser_error): Detect patch conflict markers and report them as
	such.

gcc/cp/ChangeLog:
	* parser.c (cp_lexer_peek_conflict_marker): New function.
	(cp_parser_error): Detect patch conflict markers and report them
	as such.

gcc/testsuite/ChangeLog:
	* c-c++-common/patch-conflict-markers-1.c: New testcase.
	* c-c++-common/patch-conflict-markers-2.c: Likewise.
	* c-c++-common/patch-conflict-markers-3.c: Likewise.
	* c-c++-common/patch-conflict-markers-4.c: Likewise.
	* c-c++-common/patch-conflict-markers-5.c: Likewise.
	* c-c++-common/patch-conflict-markers-6.c: Likewise.
	* c-c++-common/patch-conflict-markers-7.c: Likewise.
	* c-c++-common/patch-conflict-markers-8.c: Likewise.
	* c-c++-common/patch-conflict-markers-9.c: Likewise.
	* c-c++-common/patch-conflict-markers-10.c: Likewise.
	* c-c++-common/patch-conflict-markers-11.c: Likewise.
	* g++.dg/patch-conflict-markers-1.C: Likewise.
---
 gcc/c-family/c-common.h                            |  4 ++
 gcc/c-family/c-lex.c                               | 26 ++++++++
 gcc/c/c-parser.c                                   | 72 +++++++++++++++++++++-
 gcc/cp/parser.c                                    | 53 ++++++++++++++++
 .../c-c++-common/patch-conflict-markers-1.c        |  9 +++
 .../c-c++-common/patch-conflict-markers-10.c       | 23 +++++++
 .../c-c++-common/patch-conflict-markers-11.c       | 14 +++++
 .../c-c++-common/patch-conflict-markers-2.c        |  2 +
 .../c-c++-common/patch-conflict-markers-3.c        | 11 ++++
 .../c-c++-common/patch-conflict-markers-4.c        | 11 ++++
 .../c-c++-common/patch-conflict-markers-5.c        | 11 ++++
 .../c-c++-common/patch-conflict-markers-6.c        | 38 ++++++++++++
 .../c-c++-common/patch-conflict-markers-7.c        |  6 ++
 .../c-c++-common/patch-conflict-markers-8.c        |  4 ++
 .../c-c++-common/patch-conflict-markers-9.c        |  8 +++
 gcc/testsuite/g++.dg/patch-conflict-markers-1.C    | 13 ++++
 16 files changed, 303 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-10.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-11.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
 create mode 100644 gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
 create mode 100644 gcc/testsuite/g++.dg/patch-conflict-markers-1.C

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ef64e6b..2183565 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1089,6 +1089,10 @@ extern void c_genericize (tree);
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
+/* In c-lex.c.  */
+extern enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind);
+
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 9c86ba7..6e0205b 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1263,3 +1263,29 @@ lex_charconst (const cpp_token *token)
 
   return value;
 }
+
+/* Helper function for c_parser_peek_conflict_marker
+   and cp_lexer_peek_conflict_marker.
+   Given a possible patch conflict marker token of kind TOK1_KIND
+   consisting of a pair of characters, get the token kind for the
+   standalone final character.  */
+
+enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind)
+{
+  switch (tok1_kind)
+    {
+    default: gcc_unreachable ();
+    case CPP_LSHIFT:
+      /* "<<" and '<' */
+      return CPP_LESS;
+
+    case CPP_EQ_EQ:
+      /* "==" and '=' */
+      return CPP_EQ;
+
+    case CPP_RSHIFT:
+      /* ">>" and '>' */
+      return CPP_GREATER;
+    }
+}
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 124c30b..87ceeff 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -202,8 +202,8 @@ struct GTY(()) c_parser {
   /* The look-ahead tokens.  */
   c_token * GTY((skip)) tokens;
   /* Buffer for look-ahead tokens.  */
-  c_token tokens_buf[2];
-  /* How many look-ahead tokens are available (0, 1 or 2, or
+  c_token tokens_buf[4];
+  /* How many look-ahead tokens are available (0 - 4, or
      more if parsing from pre-lexed tokens).  */
   unsigned int tokens_avail;
   /* True if a syntax error is being recovered from; false otherwise.
@@ -492,6 +492,20 @@ c_parser_peek_2nd_token (c_parser *parser)
   return &parser->tokens[1];
 }
 
+/* Return a pointer to the Nth token from PARSER, reading it
+   in if necessary.  The N-1th token is already read in.  */
+
+static c_token *
+c_parser_peek_nth_token (c_parser *parser, unsigned int n)
+{
+  if (parser->tokens_avail >= n)
+    return &parser->tokens[n - 1];
+  gcc_assert (parser->tokens_avail == n - 1);
+  c_lex_one_token (parser, &parser->tokens[n - 1]);
+  parser->tokens_avail = n;
+  return &parser->tokens[n - 1];
+}
+
 /* Return true if TOKEN can start a type name,
    false otherwise.  */
 static bool
@@ -829,6 +843,46 @@ c_parser_set_source_position_from_token (c_token *token)
     }
 }
 
+/* Helper function for c_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a patch conflict marker, peek successor tokens to determine
+   if we actually do have a patch conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   at the start of a line as a patch conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT ("<<") and a CPP_LESS ('<').
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+static bool
+c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind,
+			       location_t *out_loc)
+{
+  c_token *token2 = c_parser_peek_2nd_token (parser);
+  if (token2->type != tok1_kind)
+    return false;
+  c_token *token3 = c_parser_peek_nth_token (parser, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  c_token *token4 = c_parser_peek_nth_token (parser, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+
+  /* It must be at the start of the line.  */
+  location_t start_loc = c_parser_peek_token (parser)->location;
+  if (LOCATION_COLUMN (start_loc) != 1)
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc = get_finish (token4->location);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
+
 /* Issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream of PARSER.
@@ -850,6 +904,20 @@ c_parser_error (c_parser *parser, const char *gmsgid)
   parser->error = true;
   if (!gmsgid)
     return;
+
+  /* If this is actually a patch conflict marker, report it as such.  */
+  if (token->type == CPP_LSHIFT
+      || token->type == CPP_RSHIFT
+      || token->type == CPP_EQ_EQ)
+    {
+      location_t loc;
+      if (c_parser_peek_conflict_marker (parser, token->type, &loc))
+	{
+	  error_at (loc, "source file contains patch conflict marker");
+	  return;
+	}
+    }
+
   /* This diagnostic makes more sense if it is tagged to the line of
      the token we just peeked at.  */
   c_parser_set_source_position_from_token (token);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index a420cf1..a904e8d 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2689,6 +2689,46 @@ cp_parser_is_keyword (cp_token* token, enum rid keyword)
   return token->keyword == keyword;
 }
 
+/* Helper function for cp_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a patch conflict marker, peek successor tokens to determine
+   if we actually do have a patch conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   at the start of a line as a patch conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT tokens ("<<") and a CPP_LESS token ('<').
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+static bool
+cp_lexer_peek_conflict_marker (cp_lexer *lexer, enum cpp_ttype tok1_kind,
+			       location_t *out_loc)
+{
+  cp_token *token2 = cp_lexer_peek_nth_token (lexer, 2);
+  if (token2->type != tok1_kind)
+    return false;
+  cp_token *token3 = cp_lexer_peek_nth_token (lexer, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  cp_token *token4 = cp_lexer_peek_nth_token (lexer, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+
+  /* It must be at the start of the line.  */
+  location_t start_loc = cp_lexer_peek_token (lexer)->location;
+  if (LOCATION_COLUMN (start_loc) != 1)
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc = get_finish (token4->location);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
+
 /* If not parsing tentatively, issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream.  MESSAGE
@@ -2713,6 +2753,19 @@ cp_parser_error (cp_parser* parser, const char* gmsgid)
 	  return;
 	}
 
+      /* If this is actually a patch conflict marker, report it as such.  */
+      if (token->type == CPP_LSHIFT
+	  || token->type == CPP_RSHIFT
+	  || token->type == CPP_EQ_EQ)
+	{
+	  location_t loc;
+	  if (cp_lexer_peek_conflict_marker (parser->lexer, token->type, &loc))
+	    {
+	      error_at (loc, "source file contains patch conflict marker");
+	      return;
+	    }
+	}
+
       c_parse_error (gmsgid,
 		     /* Because c_parser_error does not understand
 			CPP_KEYWORD, keywords are treated like
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
new file mode 100644
index 0000000..71e9fa7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-1.c
@@ -0,0 +1,9 @@
+int p;
+
+<<<<<<< HEAD /* { dg-error "patch conflict marker" } */
+extern int some_var;
+=======      /* { dg-error "patch conflict marker" } */
+extern short some_var; /* this line would lead to a warning */
+>>>>>>> Some commit message  /* { dg-error "patch conflict marker" } */
+
+int q;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-10.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-10.c
new file mode 100644
index 0000000..839c0a6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-10.c
@@ -0,0 +1,23 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+<<<<<<< HEAD /* { dg-error "patch conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ <<<<<<< HEAD
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern int some_var;
+
+=======      /* { dg-error "patch conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ =======
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern short some_var; /* this line would lead to a warning */
+
+>>>>>>> Some commit message  /* { dg-error "patch conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ >>>>>>>
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-11.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-11.c
new file mode 100644
index 0000000..8771453
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-11.c
@@ -0,0 +1,14 @@
+/* Verify that we only report conflict markers at the start of lines.  */
+int p;
+
+ <<<<<<< HEAD /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int q;
+
+ =======      /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int r;
+
+ >>>>>>> Some commit message  /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int s;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
new file mode 100644
index 0000000..79030ee
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-2.c
@@ -0,0 +1,2 @@
+/* This should not be flagged as a patch conflict marker.  */
+const char *msg = "<<<<<<< ";
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
new file mode 100644
index 0000000..be956b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-3.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle unterminated conflict markers.  */
+
+int p;
+
+<<<<<<< HEAD  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+<<<<<<< HEAD  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
new file mode 100644
index 0000000..ec3730c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-4.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+>>>>>>> Some commit message  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+>>>>>>> Some other commit message  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
new file mode 100644
index 0000000..816a97e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-5.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+=======  /* { dg-error "patch conflict marker" } */
+
+int q;
+
+=======  /* { dg-error "patch conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
new file mode 100644
index 0000000..74ea2d5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-6.c
@@ -0,0 +1,38 @@
+/* Branch coverage of patch conflict marker detection:
+   none of these should be reported as patch conflict markers.  */
+
+int a0;
+
+<< HEAD  /* { dg-error "expected" } */
+
+int a1;
+
+<<<< HEAD  /* { dg-error "expected" } */
+
+int a2;
+
+<<<<<< HEAD  /* { dg-error "expected" } */
+
+int b0;
+
+== HEAD  /* { dg-error "expected" } */
+
+int b1;
+
+==== HEAD  /* { dg-error "expected" } */
+
+int b2;
+
+====== HEAD  /* { dg-error "expected" } */
+
+int c0;
+
+>> HEAD  /* { dg-error "expected" } */
+
+int c1;
+
+>>>> HEAD  /* { dg-error "expected" } */
+
+int c2;
+
+>>>>>> HEAD  /* { dg-error "expected" } */
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
new file mode 100644
index 0000000..2b5d4e6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-7.c
@@ -0,0 +1,6 @@
+/* It's valid to stringize the "<<<<<<<"; don't
+   report it as a patch conflict marker.  */
+#define str(s) #s
+const char *s = str(
+<<<<<<<
+);
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
new file mode 100644
index 0000000..90d75b0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-8.c
@@ -0,0 +1,4 @@
+/* A macro that's never expanded shouldn't be reported as a patch
+   conflict marker.  */
+#define foo \
+<<<<<<<
diff --git a/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c b/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
new file mode 100644
index 0000000..5c1e663
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/patch-conflict-markers-9.c
@@ -0,0 +1,8 @@
+/* It's valid to have
+<<<<<<<
+   inside both
+   comments (as above), and within string literals.  */
+const char *s = "\
+<<<<<<<";
+
+/* The above shouldn't be reported as errors.  */
diff --git a/gcc/testsuite/g++.dg/patch-conflict-markers-1.C b/gcc/testsuite/g++.dg/patch-conflict-markers-1.C
new file mode 100644
index 0000000..ae19193
--- /dev/null
+++ b/gcc/testsuite/g++.dg/patch-conflict-markers-1.C
@@ -0,0 +1,13 @@
+/* Ensure that we don't complain about patch conflict markers on
+   valid template argument lists, valid in C++11 onwards.  */
+// { dg-options "-std=c++11" }
+
+template <typename T>
+struct foo
+{
+  T t;
+};
+
+foo <foo <foo <foo <foo <foo <foo <int
+>>>>>>> f;
+// The above line is valid C++11, and isn't a patch conflict marker
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error recovery for merge-conflict markers (v4)
  2015-12-09 16:39   ` [PATCH] Better error recovery for merge-conflict markers (v4) David Malcolm
@ 2015-12-09 17:44     ` Bernd Schmidt
  2015-12-09 20:18       ` Jeff Law
  2015-12-16 18:23       ` David Malcolm
  0 siblings, 2 replies; 12+ messages in thread
From: Bernd Schmidt @ 2015-12-09 17:44 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches, Joseph Myers, Jason Merrill

On 12/09/2015 05:58 PM, David Malcolm wrote:
> On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote:
>>
>> This seems like fairly low impact but also low cost, so I'm fine with it
>> in principle. I wonder whether the length of the marker is the same
>> across all versions of patch (and VC tools)?
>
> It's hardcoded for GNU patch:
[...]
 >From what I can tell, Perforce is the outlier here.

Thanks for checking all that.

>> Just thinking out loud - I guess it would be too much to hope for to
>> share lexers between frontends so that we need only one copy of this?
>
> Probably :(

Someone slap sense into me, I just thought of deriving C and C++ parsers 
from a common base class... (no this is not a suggestion for this patch).

> Would a better wording be:
>
> extern short some_var; /* This line would lead to a warning due to the
>                            duplicate name, but it is skipped when handling
>                            the conflict marker.  */

I think so, yes.

> That said, it's not clear they're always at the beginning of a line;
> this bazaar bug indicates that CVS (and bazaar) can emit them
> mid-line:
>    https://bugs.launchpad.net/bzr/+bug/36399

Ok. CVS I think we shouldn't worry about, and it looks like this is one 
particular bug/corner case where the conflict end marker is the last 
thing in the file. I think on the whole it's best to check for beginning 
of the line as you've done.

> Wording-wise, should it be "merge conflict marker", rather
> than "patch conflict marker"?
>
> Clang spells it:
> "error: version control conflict marker in file"
> http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts

Yeah, if another compiler has a similar/identical diagnostic I think we 
should just copy that unless there's a very good reason not to.

> Rebased on top of r231445 (from yesterday).
> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
>
> OK for trunk?

I'm inclined to say yes since it was originally submitted in time and 
it's hard to imagine how the change could be risky (I'll point out right 
away that there are one or two other patches in the queue that were also 
submitted in time which I feel should not be considered for gcc-6 at 
this point due to risk).

Let's wait until the end of the week for objections, commit then.


Bernd

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error recovery for merge-conflict markers (v4)
  2015-12-09 17:44     ` Bernd Schmidt
@ 2015-12-09 20:18       ` Jeff Law
  2015-12-16 18:23       ` David Malcolm
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff Law @ 2015-12-09 20:18 UTC (permalink / raw)
  To: Bernd Schmidt, David Malcolm; +Cc: gcc-patches, Joseph Myers, Jason Merrill

On 12/09/2015 10:44 AM, Bernd Schmidt wrote:
>>> Just thinking out loud - I guess it would be too much to hope for to
>>> share lexers between frontends so that we need only one copy of this?
>>
>> Probably :(
>
> Someone slap sense into me, I just thought of deriving C and C++ parsers
> from a common base class... (no this is not a suggestion for this patch).
It'd be nice.  But I don't even think it's on anyone's TODO list.

>> Wording-wise, should it be "merge conflict marker", rather
>> than "patch conflict marker"?
>>
>> Clang spells it:
>> "error: version control conflict marker in file"
>> http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts
>>
>
> Yeah, if another compiler has a similar/identical diagnostic I think we
> should just copy that unless there's a very good reason not to.
Agreed.

>
>> Rebased on top of r231445 (from yesterday).
>> Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
>> Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
>>
>> OK for trunk?
>
> I'm inclined to say yes since it was originally submitted in time and
> it's hard to imagine how the change could be risky (I'll point out right
> away that there are one or two other patches in the queue that were also
> submitted in time which I feel should not be considered for gcc-6 at
> this point due to risk).
>
> Let's wait until the end of the week for objections, commit then.
As the person who has come the closest to objecting, I'll go on the 
record as "not objecting".

jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error messages for merge-conflict markers (v3)
  2015-10-30 15:02 [PATCH] Better error messages for merge-conflict markers (v3) David Malcolm
  2015-11-02 22:52 ` Jeff Law
  2015-11-04 13:56 ` Bernd Schmidt
@ 2015-12-09 23:50 ` Martin Sebor
  2 siblings, 0 replies; 12+ messages in thread
From: Martin Sebor @ 2015-12-09 23:50 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

> @@ -471,6 +471,20 @@ c_parser_peek_2nd_token (c_parser *parser)
>     return &parser->tokens[1];
>   }
>
> +/* Return a pointer to the Nth token from PARSER, reading it
> +   in if necessary.  The N-1th token is already read in.  */
> +
> +static c_token *
> +c_parser_peek_nth_token (c_parser *parser, unsigned int n)
> +{
> +  if (parser->tokens_avail >= n)
> +    return &parser->tokens[n - 1];
> +  gcc_assert (parser->tokens_avail == n - 1);
> +  c_lex_one_token (parser, &parser->tokens[n - 1]);
> +  parser->tokens_avail = n;
> +  return &parser->tokens[n - 1];
> +}

David, I know little about the code in this area and so I looked
at the patch mostly out of curiosity.  This little function caught
my eye for some reason.  I see it's called only in two places and
in a safe way, but it also looks like it could easily be called
unsafely and cause either the assert to fire (when N is greater
than tokens_avail + 1), or a bad address to be returned (when N
is zero).  It's also a third peek function in the C parser,
making the choice not completely trivial (at least to those not
as familiar with the code).

When compared to the equivalent function in the C++ lexer, that
one is more like I would expect.  I.e., it asserts that N is
positive and doesn't assume any specific prior sequence of peeks.
I can see how someone familiar with the C++ lexer but not so well
with the C parser might inadvertently use the new C function
incorrectly.

May I suggest making the C function equivalent to the C++ one
in terms of its preconditions?

Martin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] Better error recovery for merge-conflict markers (v5)
  2015-11-04 13:56 ` Bernd Schmidt
  2015-12-09 16:39   ` [PATCH] Better error recovery for merge-conflict markers (v4) David Malcolm
@ 2015-12-15 19:11   ` David Malcolm
  2015-12-15 23:52     ` Bernd Schmidt
  1 sibling, 1 reply; 12+ messages in thread
From: David Malcolm @ 2015-12-15 19:11 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Joseph Myers, Jason Merrill, David Malcolm

On Wed, 2015-12-09 at 18:44 +0100, Bernd Schmidt wrote:
> On 12/09/2015 05:58 PM, David Malcolm wrote:
> > On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote:
> >>
> >> This seems like fairly low impact but also low cost, so I'm fine with it
> >> in principle. I wonder whether the length of the marker is the same
> >> across all versions of patch (and VC tools)?
> >
> > It's hardcoded for GNU patch:
> [...]
>  >From what I can tell, Perforce is the outlier here.
>
> Thanks for checking all that.
>
> >> Just thinking out loud - I guess it would be too much to hope for to
> >> share lexers between frontends so that we need only one copy of this?
> >
> > Probably :(
>
> Someone slap sense into me, I just thought of deriving C and C++ parsers
> from a common base class... (no this is not a suggestion for this patch).
>
> > Would a better wording be:
> >
> > extern short some_var; /* This line would lead to a warning due to the
> >                            duplicate name, but it is skipped when handling
> >                            the conflict marker.  */
>
> I think so, yes.
>
> > That said, it's not clear they're always at the beginning of a line;
> > this bazaar bug indicates that CVS (and bazaar) can emit them
> > mid-line:
> >    https://bugs.launchpad.net/bzr/+bug/36399
>
> Ok. CVS I think we shouldn't worry about, and it looks like this is one
> particular bug/corner case where the conflict end marker is the last
> thing in the file. I think on the whole it's best to check for beginning
> of the line as you've done.
>
> > Wording-wise, should it be "merge conflict marker", rather
> > than "patch conflict marker"?
> >
> > Clang spells it:
> > "error: version control conflict marker in file"
> > http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts
>
> Yeah, if another compiler has a similar/identical diagnostic I think we
> should just copy that unless there's a very good reason not to.
>
> > Rebased on top of r231445 (from yesterday).
> > Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> > Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
> >
> > OK for trunk?
>
> I'm inclined to say yes since it was originally submitted in time and
> it's hard to imagine how the change could be risky (I'll point out right
> away that there are one or two other patches in the queue that were also
> submitted in time which I feel should not be considered for gcc-6 at
> this point due to risk).
>
> Let's wait until the end of the week for objections, commit then.

I got thinking about what we'd have to do to support Perforce-style
markers, and began to find my token-matching approach to be a little
clunky (in conjunction with reading Martin's observations on
c_parser_peek_nth_token).

Here's a reimplementation of the patch which takes a much simpler
approach, and avoids the need to touch the C lexer: check that we're
not in a macro expansion and then read in the source line, and
textually compare against the various possible conflict markers.
This adds the requirement that the source file be readable, so it
won't detect conflict markers in a .i file from -save-temps, but
that seems no great loss compared to the simpler, more flexible
implementation.  We're about to emit an error at the line, so
this shouldn't add any extra file access for the default case
of printing the source line after the error.

Is this approach preferable, or should I just go with the
v4 approach?

Other changes:
- Updated wording to match clang's
  "error: version control conflict marker in file"
and replaced "patch conflict marker" with "conflict marker" in the code
and names of test cases.  (I have a version of the v4 patch
with those changes).
- Renamed local "loc" to "marker_loc" to clarify things.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

Sorry to prolong this; thanks for your patience.

gcc/c-family/ChangeLog:
	* c-common.h (conflict_marker_at_location_p): New prototype.
	* c-lex.c (conflict_marker_within_line_p): New function.
	(conflict_marker_at_location_p): New function.

gcc/c/ChangeLog:
	* c-parser.c (c_parser_error): Detect conflict markers and report
	them as such.

gcc/cp/ChangeLog:
	* parser.c (cp_parser_error): Detect conflict markers and report
	them as such.

gcc/testsuite/ChangeLog:
	* c-c++-common/conflict-markers-1.c: New testcase.
	* c-c++-common/conflict-markers-2.c: Likewise.
	* c-c++-common/conflict-markers-3.c: Likewise.
	* c-c++-common/conflict-markers-4.c: Likewise.
	* c-c++-common/conflict-markers-5.c: Likewise.
	* c-c++-common/conflict-markers-6.c: Likewise.
	* c-c++-common/conflict-markers-7.c: Likewise.
	* c-c++-common/conflict-markers-8.c: Likewise.
	* c-c++-common/conflict-markers-9.c: Likewise.
	* c-c++-common/conflict-markers-10.c: Likewise.
	* c-c++-common/conflict-markers-11.c: Likewise.
	* g++.dg/conflict-markers-1.C: Likewise.
---
 gcc/c-family/c-common.h                          |  4 ++
 gcc/c-family/c-lex.c                             | 67 ++++++++++++++++++++++++
 gcc/c/c-parser.c                                 | 14 +++++
 gcc/cp/parser.c                                  | 13 +++++
 gcc/testsuite/c-c++-common/conflict-markers-1.c  |  9 ++++
 gcc/testsuite/c-c++-common/conflict-markers-10.c | 23 ++++++++
 gcc/testsuite/c-c++-common/conflict-markers-11.c | 14 +++++
 gcc/testsuite/c-c++-common/conflict-markers-2.c  |  2 +
 gcc/testsuite/c-c++-common/conflict-markers-3.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-4.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-5.c  | 11 ++++
 gcc/testsuite/c-c++-common/conflict-markers-6.c  | 38 ++++++++++++++
 gcc/testsuite/c-c++-common/conflict-markers-7.c  |  6 +++
 gcc/testsuite/c-c++-common/conflict-markers-8.c  |  4 ++
 gcc/testsuite/c-c++-common/conflict-markers-9.c  |  8 +++
 gcc/testsuite/g++.dg/conflict-markers-1.C        | 13 +++++
 16 files changed, 248 insertions(+)
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-1.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-10.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-11.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-2.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-3.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-4.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-5.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-6.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-7.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-8.c
 create mode 100644 gcc/testsuite/c-c++-common/conflict-markers-9.c
 create mode 100644 gcc/testsuite/g++.dg/conflict-markers-1.C

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ef64e6b..519c3fa 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1089,6 +1089,10 @@ extern void c_genericize (tree);
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
+/* In c-lex.c  */
+extern bool conflict_marker_at_location_p (location_t start_loc,
+					   location_t *out_loc);
+
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 9c86ba7..dd598c6 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -1263,3 +1263,70 @@ lex_charconst (const cpp_token *token)
 
   return value;
 }
+
+/* Determine if LINE begins with a conflict marker.
+   If so, write to *OUT_LEN with the length of the marker.  */
+
+static bool
+conflict_marker_within_line_p (const char *line, int *out_len)
+{
+  if (0 == strncmp (line, "<<<<<<<", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+  if (0 == strncmp (line, "=======", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+  if (0 == strncmp (line, ">>>>>>>", 7))
+    {
+      *out_len = 7;
+      return true;
+    }
+
+  return false;
+}
+
+/* Helper function for c_parser_error and cp_parser_error.
+   Given an error occurring at a "<<", ">>" or "==" token at START_LOC,
+   determine if this is a conflict marker.
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+bool
+conflict_marker_at_location_p (location_t start_loc, location_t *out_loc)
+{
+  /* It must not be in a macro expansion.  */
+  if (linemap_location_from_macro_expansion_p (line_table, start_loc))
+    return false;
+
+  /* It must be at the start of the line.  */
+  expanded_location exploc = expand_location (start_loc);
+  if (exploc.column != 1)
+    return false;
+
+  const char *line = location_get_source_line (exploc.file, exploc.line,
+					       NULL);
+  /* If we can't read the source file (e.g. when handling a .i file)
+     don't attempt to detect conflict markers.  */
+  if (!line)
+    return false;
+
+  /* Does "line" start with a conflict marker?  */
+  int marker_len;
+  if (!conflict_marker_within_line_p (line, &marker_len))
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc
+    = linemap_position_for_loc_and_offset (line_table, start_loc,
+					   marker_len - 1);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 124c30b..5014c8d 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -850,6 +850,20 @@ c_parser_error (c_parser *parser, const char *gmsgid)
   parser->error = true;
   if (!gmsgid)
     return;
+
+  /* If this is actually a conflict marker, report it as such.  */
+  if (token->type == CPP_LSHIFT
+      || token->type == CPP_RSHIFT
+      || token->type == CPP_EQ_EQ)
+    {
+      location_t marker_loc;
+      if (conflict_marker_at_location_p (token->location, &marker_loc))
+	{
+	  error_at (marker_loc, "version control conflict marker in file");
+	  return;
+	}
+    }
+
   /* This diagnostic makes more sense if it is tagged to the line of
      the token we just peeked at.  */
   c_parser_set_source_position_from_token (token);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index a420cf1..8bdaed7 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -2713,6 +2713,19 @@ cp_parser_error (cp_parser* parser, const char* gmsgid)
 	  return;
 	}
 
+      /* If this is actually a conflict marker, report it as such.  */
+      if (token->type == CPP_LSHIFT
+	  || token->type == CPP_RSHIFT
+	  || token->type == CPP_EQ_EQ)
+	{
+	  location_t marker_loc;
+	  if (conflict_marker_at_location_p (token->location, &marker_loc))
+	    {
+	      error_at (marker_loc, "version control conflict marker in file");
+	      return;
+	    }
+	}
+
       c_parse_error (gmsgid,
 		     /* Because c_parser_error does not understand
 			CPP_KEYWORD, keywords are treated like
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-1.c b/gcc/testsuite/c-c++-common/conflict-markers-1.c
new file mode 100644
index 0000000..752f146
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-1.c
@@ -0,0 +1,9 @@
+int p;
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+extern int some_var;
+=======      /* { dg-error "conflict marker" } */
+extern short some_var; /* this line would lead to a warning */
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-10.c b/gcc/testsuite/c-c++-common/conflict-markers-10.c
new file mode 100644
index 0000000..279406f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-10.c
@@ -0,0 +1,23 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ <<<<<<< HEAD
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern int some_var;
+
+=======      /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ =======
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern short some_var; /* this line would lead to a warning */
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ >>>>>>>
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-11.c b/gcc/testsuite/c-c++-common/conflict-markers-11.c
new file mode 100644
index 0000000..8771453
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-11.c
@@ -0,0 +1,14 @@
+/* Verify that we only report conflict markers at the start of lines.  */
+int p;
+
+ <<<<<<< HEAD /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int q;
+
+ =======      /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int r;
+
+ >>>>>>> Some commit message  /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int s;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-2.c b/gcc/testsuite/c-c++-common/conflict-markers-2.c
new file mode 100644
index 0000000..f06d043
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-2.c
@@ -0,0 +1,2 @@
+/* This should not be flagged as a conflict marker.  */
+const char *msg = "<<<<<<< ";
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-3.c b/gcc/testsuite/c-c++-common/conflict-markers-3.c
new file mode 100644
index 0000000..f149ecc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-3.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle unterminated conflict markers.  */
+
+int p;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int q;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-4.c b/gcc/testsuite/c-c++-common/conflict-markers-4.c
new file mode 100644
index 0000000..a3c53ea
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-4.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
+
+>>>>>>> Some other commit message  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-5.c b/gcc/testsuite/c-c++-common/conflict-markers-5.c
new file mode 100644
index 0000000..b55c9c3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-5.c
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+=======  /* { dg-error "conflict marker" } */
+
+int q;
+
+=======  /* { dg-error "conflict marker" } */
+
+int r;
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-6.c b/gcc/testsuite/c-c++-common/conflict-markers-6.c
new file mode 100644
index 0000000..081e289
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-6.c
@@ -0,0 +1,38 @@
+/* Branch coverage of conflict marker detection:
+   none of these should be reported as conflict markers.  */
+
+int a0;
+
+<< HEAD  /* { dg-error "expected" } */
+
+int a1;
+
+<<<< HEAD  /* { dg-error "expected" } */
+
+int a2;
+
+<<<<<< HEAD  /* { dg-error "expected" } */
+
+int b0;
+
+== HEAD  /* { dg-error "expected" } */
+
+int b1;
+
+==== HEAD  /* { dg-error "expected" } */
+
+int b2;
+
+====== HEAD  /* { dg-error "expected" } */
+
+int c0;
+
+>> HEAD  /* { dg-error "expected" } */
+
+int c1;
+
+>>>> HEAD  /* { dg-error "expected" } */
+
+int c2;
+
+>>>>>> HEAD  /* { dg-error "expected" } */
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-7.c b/gcc/testsuite/c-c++-common/conflict-markers-7.c
new file mode 100644
index 0000000..e68f84d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-7.c
@@ -0,0 +1,6 @@
+/* It's valid to stringize the "<<<<<<<"; don't
+   report it as a conflict marker.  */
+#define str(s) #s
+const char *s = str(
+<<<<<<<
+);
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-8.c b/gcc/testsuite/c-c++-common/conflict-markers-8.c
new file mode 100644
index 0000000..be2e121
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-8.c
@@ -0,0 +1,4 @@
+/* A macro that's never expanded shouldn't be reported as a
+   conflict marker.  */
+#define foo \
+<<<<<<<
diff --git a/gcc/testsuite/c-c++-common/conflict-markers-9.c b/gcc/testsuite/c-c++-common/conflict-markers-9.c
new file mode 100644
index 0000000..5c1e663
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/conflict-markers-9.c
@@ -0,0 +1,8 @@
+/* It's valid to have
+<<<<<<<
+   inside both
+   comments (as above), and within string literals.  */
+const char *s = "\
+<<<<<<<";
+
+/* The above shouldn't be reported as errors.  */
diff --git a/gcc/testsuite/g++.dg/conflict-markers-1.C b/gcc/testsuite/g++.dg/conflict-markers-1.C
new file mode 100644
index 0000000..ae19193
--- /dev/null
+++ b/gcc/testsuite/g++.dg/conflict-markers-1.C
@@ -0,0 +1,13 @@
+/* Ensure that we don't complain about patch conflict markers on
+   valid template argument lists, valid in C++11 onwards.  */
+// { dg-options "-std=c++11" }
+
+template <typename T>
+struct foo
+{
+  T t;
+};
+
+foo <foo <foo <foo <foo <foo <foo <int
+>>>>>>> f;
+// The above line is valid C++11, and isn't a patch conflict marker
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error recovery for merge-conflict markers (v5)
  2015-12-15 19:11   ` [PATCH] Better error recovery for merge-conflict markers (v5) David Malcolm
@ 2015-12-15 23:52     ` Bernd Schmidt
  2015-12-16 18:33       ` David Malcolm
  0 siblings, 1 reply; 12+ messages in thread
From: Bernd Schmidt @ 2015-12-15 23:52 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches, Joseph Myers, Jason Merrill

On 12/15/2015 08:30 PM, David Malcolm wrote:

> I got thinking about what we'd have to do to support Perforce-style
> markers, and began to find my token-matching approach to be a little
> clunky (in conjunction with reading Martin's observations on
> c_parser_peek_nth_token).
>
> Here's a reimplementation of the patch which takes a much simpler
> approach, and avoids the need to touch the C lexer: check that we're
> not in a macro expansion and then read in the source line, and
> textually compare against the various possible conflict markers.
> This adds the requirement that the source file be readable, so it
> won't detect conflict markers in a .i file from -save-temps,

How come? Is source file defined as the one before preprocessing?

And I do think this is an unfortunate limitation (given that we often 
load .i files into cc1 for debugging and we'd ideally like that to be 
consistent with normal compilation as much as possible). I'd rather go 
with the original patch based on this.


Bernd

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error recovery for merge-conflict markers (v4)
  2015-12-09 17:44     ` Bernd Schmidt
  2015-12-09 20:18       ` Jeff Law
@ 2015-12-16 18:23       ` David Malcolm
  1 sibling, 0 replies; 12+ messages in thread
From: David Malcolm @ 2015-12-16 18:23 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Joseph Myers, Jason Merrill

[-- Attachment #1: Type: text/plain, Size: 3089 bytes --]

On Wed, 2015-12-09 at 18:44 +0100, Bernd Schmidt wrote:
> On 12/09/2015 05:58 PM, David Malcolm wrote:
> > On Wed, 2015-11-04 at 14:56 +0100, Bernd Schmidt wrote:
> >>
> >> This seems like fairly low impact but also low cost, so I'm fine with it
> >> in principle. I wonder whether the length of the marker is the same
> >> across all versions of patch (and VC tools)?
> >
> > It's hardcoded for GNU patch:
> [...]
>  >From what I can tell, Perforce is the outlier here.
> 
> Thanks for checking all that.
> 
> >> Just thinking out loud - I guess it would be too much to hope for to
> >> share lexers between frontends so that we need only one copy of this?
> >
> > Probably :(
> 
> Someone slap sense into me, I just thought of deriving C and C++ parsers 
> from a common base class... (no this is not a suggestion for this patch).
> 
> > Would a better wording be:
> >
> > extern short some_var; /* This line would lead to a warning due to the
> >                            duplicate name, but it is skipped when handling
> >                            the conflict marker.  */
> 
> I think so, yes.
> 
> > That said, it's not clear they're always at the beginning of a line;
> > this bazaar bug indicates that CVS (and bazaar) can emit them
> > mid-line:
> >    https://bugs.launchpad.net/bzr/+bug/36399
> 
> Ok. CVS I think we shouldn't worry about, and it looks like this is one 
> particular bug/corner case where the conflict end marker is the last 
> thing in the file. I think on the whole it's best to check for beginning 
> of the line as you've done.
> 
> > Wording-wise, should it be "merge conflict marker", rather
> > than "patch conflict marker"?
> >
> > Clang spells it:
> > "error: version control conflict marker in file"
> > http://blog.llvm.org/2010/04/amazing-feats-of-clang-error-recovery.html#merge_conflicts
> 
> Yeah, if another compiler has a similar/identical diagnostic I think we 
> should just copy that unless there's a very good reason not to.
> 
> > Rebased on top of r231445 (from yesterday).
> > Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.
> > Adds 82 new PASSes to g++.sum and 27 new PASSes to gcc.sum.
> >
> > OK for trunk?
> 
> I'm inclined to say yes since it was originally submitted in time and 
> it's hard to imagine how the change could be risky (I'll point out right 
> away that there are one or two other patches in the queue that were also 
> submitted in time which I feel should not be considered for gcc-6 at 
> this point due to risk).
> 
> Let's wait until the end of the week for objections, commit then.

Thanks.  I updated it based on the feedback above, including changing
the wording to match clang's:
     "error: version control conflict marker in file"
I replaced "patch conflict marker" with "conflict marker" in the code
and names of test cases.

I took the liberty of adding:
      gcc_assert (n > 0);
to c_parser_peek_nth_token based on Martin's feedback.

Having verified bootstrap&regrtest (on x86_64-pc-linux-gnu), I've
committed it to trunk as r231712.

I'm attaching what I committed, for reference.

[-- Attachment #2: r231712.patch --]
[-- Type: text/x-patch, Size: 16923 bytes --]

Index: gcc/c-family/ChangeLog
===================================================================
--- gcc/c-family/ChangeLog	(revision 231711)
+++ gcc/c-family/ChangeLog	(revision 231712)
@@ -1,3 +1,8 @@
+2015-12-16  David Malcolm  <dmalcolm@redhat.com>
+
+	* c-common.h (conflict_marker_get_final_tok_kind): New prototype.
+	* c-lex.c (conflict_marker_get_final_tok_kind): New function.
+
 2015-12-15  Ilya Verbin  <ilya.verbin@intel.com>
 
 	* c-common.c (c_common_attribute_table): Handle "omp declare target
Index: gcc/c-family/c-lex.c
===================================================================
--- gcc/c-family/c-lex.c	(revision 231711)
+++ gcc/c-family/c-lex.c	(revision 231712)
@@ -1263,3 +1263,29 @@
 
   return value;
 }
+
+/* Helper function for c_parser_peek_conflict_marker
+   and cp_lexer_peek_conflict_marker.
+   Given a possible conflict marker token of kind TOK1_KIND
+   consisting of a pair of characters, get the token kind for the
+   standalone final character.  */
+
+enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind)
+{
+  switch (tok1_kind)
+    {
+    default: gcc_unreachable ();
+    case CPP_LSHIFT:
+      /* "<<" and '<' */
+      return CPP_LESS;
+
+    case CPP_EQ_EQ:
+      /* "==" and '=' */
+      return CPP_EQ;
+
+    case CPP_RSHIFT:
+      /* ">>" and '>' */
+      return CPP_GREATER;
+    }
+}
Index: gcc/c-family/c-common.h
===================================================================
--- gcc/c-family/c-common.h	(revision 231711)
+++ gcc/c-family/c-common.h	(revision 231712)
@@ -1089,6 +1089,10 @@
 extern int c_gimplify_expr (tree *, gimple_seq *, gimple_seq *);
 extern tree c_build_bind_expr (location_t, tree, tree);
 
+/* In c-lex.c.  */
+extern enum cpp_ttype
+conflict_marker_get_final_tok_kind (enum cpp_ttype tok1_kind);
+
 /* In c-pch.c  */
 extern void pch_init (void);
 extern void pch_cpp_save_state (void);
Index: gcc/c/ChangeLog
===================================================================
--- gcc/c/ChangeLog	(revision 231711)
+++ gcc/c/ChangeLog	(revision 231712)
@@ -1,5 +1,13 @@
 2015-12-16  David Malcolm  <dmalcolm@redhat.com>
 
+	* c-parser.c (struct c_parser): Expand array "tokens_buf" from 2
+	to 4.
+	(c_parser_peek_nth_token): New function.
+	(c_parser_peek_conflict_marker): New function.
+	(c_parser_error): Detect conflict markers and report them as such.
+
+2015-12-16  David Malcolm  <dmalcolm@redhat.com>
+
 	* c-parser.c (c_parser_postfix_expression): Use EXPR_LOC_OR_LOC
 	to preserve range information for the primary expression
 	in the call to c_parser_postfix_expression_after_primary.
Index: gcc/c/c-parser.c
===================================================================
--- gcc/c/c-parser.c	(revision 231711)
+++ gcc/c/c-parser.c	(revision 231712)
@@ -202,8 +202,8 @@
   /* The look-ahead tokens.  */
   c_token * GTY((skip)) tokens;
   /* Buffer for look-ahead tokens.  */
-  c_token tokens_buf[2];
-  /* How many look-ahead tokens are available (0, 1 or 2, or
+  c_token tokens_buf[4];
+  /* How many look-ahead tokens are available (0 - 4, or
      more if parsing from pre-lexed tokens).  */
   unsigned int tokens_avail;
   /* True if a syntax error is being recovered from; false otherwise.
@@ -492,6 +492,23 @@
   return &parser->tokens[1];
 }
 
+/* Return a pointer to the Nth token from PARSER, reading it
+   in if necessary.  The N-1th token is already read in.  */
+
+static c_token *
+c_parser_peek_nth_token (c_parser *parser, unsigned int n)
+{
+  /* N is 1-based, not zero-based.  */
+  gcc_assert (n > 0);
+
+  if (parser->tokens_avail >= n)
+    return &parser->tokens[n - 1];
+  gcc_assert (parser->tokens_avail == n - 1);
+  c_lex_one_token (parser, &parser->tokens[n - 1]);
+  parser->tokens_avail = n;
+  return &parser->tokens[n - 1];
+}
+
 /* Return true if TOKEN can start a type name,
    false otherwise.  */
 static bool
@@ -829,6 +846,46 @@
     }
 }
 
+/* Helper function for c_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a conflict marker, peek successor tokens to determine
+   if we actually do have a conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   at the start of a line as a conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT ("<<") and a CPP_LESS ('<').
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+static bool
+c_parser_peek_conflict_marker (c_parser *parser, enum cpp_ttype tok1_kind,
+			       location_t *out_loc)
+{
+  c_token *token2 = c_parser_peek_2nd_token (parser);
+  if (token2->type != tok1_kind)
+    return false;
+  c_token *token3 = c_parser_peek_nth_token (parser, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  c_token *token4 = c_parser_peek_nth_token (parser, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+
+  /* It must be at the start of the line.  */
+  location_t start_loc = c_parser_peek_token (parser)->location;
+  if (LOCATION_COLUMN (start_loc) != 1)
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc = get_finish (token4->location);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
+
 /* Issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream of PARSER.
@@ -850,6 +907,20 @@
   parser->error = true;
   if (!gmsgid)
     return;
+
+  /* If this is actually a conflict marker, report it as such.  */
+  if (token->type == CPP_LSHIFT
+      || token->type == CPP_RSHIFT
+      || token->type == CPP_EQ_EQ)
+    {
+      location_t loc;
+      if (c_parser_peek_conflict_marker (parser, token->type, &loc))
+	{
+	  error_at (loc, "version control conflict marker in file");
+	  return;
+	}
+    }
+
   /* This diagnostic makes more sense if it is tagged to the line of
      the token we just peeked at.  */
   c_parser_set_source_position_from_token (token);
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog	(revision 231711)
+++ gcc/testsuite/ChangeLog	(revision 231712)
@@ -1,5 +1,20 @@
 2015-12-16  David Malcolm  <dmalcolm@redhat.com>
 
+	* c-c++-common/conflict-markers-1.c: New testcase.
+	* c-c++-common/conflict-markers-2.c: Likewise.
+	* c-c++-common/conflict-markers-3.c: Likewise.
+	* c-c++-common/conflict-markers-4.c: Likewise.
+	* c-c++-common/conflict-markers-5.c: Likewise.
+	* c-c++-common/conflict-markers-6.c: Likewise.
+	* c-c++-common/conflict-markers-7.c: Likewise.
+	* c-c++-common/conflict-markers-8.c: Likewise.
+	* c-c++-common/conflict-markers-9.c: Likewise.
+	* c-c++-common/conflict-markers-10.c: Likewise.
+	* c-c++-common/conflict-markers-11.c: Likewise.
+	* g++.dg/conflict-markers-1.C: Likewise.
+
+2015-12-16  David Malcolm  <dmalcolm@redhat.com>
+
 	* gcc.dg/cast-function-1.c (bar): Update column numbers.
 	* gcc.dg/diagnostic-range-bad-called-object.c: New test case.
 
Index: gcc/testsuite/g++.dg/conflict-markers-1.C
===================================================================
--- gcc/testsuite/g++.dg/conflict-markers-1.C	(revision 0)
+++ gcc/testsuite/g++.dg/conflict-markers-1.C	(revision 231712)
@@ -0,0 +1,13 @@
+/* Ensure that we don't complain about conflict markers on
+   valid template argument lists, valid in C++11 onwards.  */
+// { dg-options "-std=c++11" }
+
+template <typename T>
+struct foo
+{
+  T t;
+};
+
+foo <foo <foo <foo <foo <foo <foo <int
+>>>>>>> f;
+// The above line is valid C++11, and isn't a conflict marker
Index: gcc/testsuite/c-c++-common/conflict-markers-11.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-11.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-11.c	(revision 231712)
@@ -0,0 +1,14 @@
+/* Verify that we only report conflict markers at the start of lines.  */
+int p;
+
+ <<<<<<< HEAD /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int q;
+
+ =======      /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int r;
+
+ >>>>>>> Some commit message  /* { dg-error "expected identifier|expected unqualified-id" } */
+
+int s;
Index: gcc/testsuite/c-c++-common/conflict-markers-4.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-4.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-4.c	(revision 231712)
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
+
+>>>>>>> Some other commit message  /* { dg-error "conflict marker" } */
+
+int r;
Index: gcc/testsuite/c-c++-common/conflict-markers-5.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-5.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-5.c	(revision 231712)
@@ -0,0 +1,11 @@
+/* Ensure we can handle mismatched conflict markers.  */
+
+int p;
+
+=======  /* { dg-error "conflict marker" } */
+
+int q;
+
+=======  /* { dg-error "conflict marker" } */
+
+int r;
Index: gcc/testsuite/c-c++-common/conflict-markers-6.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-6.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-6.c	(revision 231712)
@@ -0,0 +1,38 @@
+/* Branch coverage of conflict marker detection:
+   none of these should be reported as conflict markers.  */
+
+int a0;
+
+<< HEAD  /* { dg-error "expected" } */
+
+int a1;
+
+<<<< HEAD  /* { dg-error "expected" } */
+
+int a2;
+
+<<<<<< HEAD  /* { dg-error "expected" } */
+
+int b0;
+
+== HEAD  /* { dg-error "expected" } */
+
+int b1;
+
+==== HEAD  /* { dg-error "expected" } */
+
+int b2;
+
+====== HEAD  /* { dg-error "expected" } */
+
+int c0;
+
+>> HEAD  /* { dg-error "expected" } */
+
+int c1;
+
+>>>> HEAD  /* { dg-error "expected" } */
+
+int c2;
+
+>>>>>> HEAD  /* { dg-error "expected" } */
Index: gcc/testsuite/c-c++-common/conflict-markers-7.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-7.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-7.c	(revision 231712)
@@ -0,0 +1,6 @@
+/* It's valid to stringize the "<<<<<<<"; don't
+   report it as a conflict marker.  */
+#define str(s) #s
+const char *s = str(
+<<<<<<<
+);
Index: gcc/testsuite/c-c++-common/conflict-markers-8.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-8.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-8.c	(revision 231712)
@@ -0,0 +1,4 @@
+/* A macro that's never expanded shouldn't be reported as a
+   conflict marker.  */
+#define foo \
+<<<<<<<
Index: gcc/testsuite/c-c++-common/conflict-markers-1.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-1.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-1.c	(revision 231712)
@@ -0,0 +1,11 @@
+int p;
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+extern int some_var;
+=======      /* { dg-error "conflict marker" } */
+extern short some_var; /* This line would lead to a warning due to the
+			  duplicate name, but it is skipped when handling
+			  the conflict marker.  */
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+
+int q;
Index: gcc/testsuite/c-c++-common/conflict-markers-9.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-9.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-9.c	(revision 231712)
@@ -0,0 +1,8 @@
+/* It's valid to have
+<<<<<<<
+   inside both
+   comments (as above), and within string literals.  */
+const char *s = "\
+<<<<<<<";
+
+/* The above shouldn't be reported as errors.  */
Index: gcc/testsuite/c-c++-common/conflict-markers-2.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-2.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-2.c	(revision 231712)
@@ -0,0 +1,2 @@
+/* This should not be flagged as a conflict marker.  */
+const char *msg = "<<<<<<< ";
Index: gcc/testsuite/c-c++-common/conflict-markers-10.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-10.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-10.c	(revision 231712)
@@ -0,0 +1,25 @@
+/* { dg-options "-fdiagnostics-show-caret" } */
+
+<<<<<<< HEAD /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ <<<<<<< HEAD
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern int some_var;
+
+=======      /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ =======
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
+
+extern short some_var; /* This line would lead to a warning due to the
+			  duplicate name, but it is skipped when handling
+			  the conflict marker.  */
+
+>>>>>>> Some commit message  /* { dg-error "conflict marker" } */
+/* { dg-begin-multiline-output "" }
+ >>>>>>>
+ ^~~~~~~
+   { dg-end-multiline-output "" } */
Index: gcc/testsuite/c-c++-common/conflict-markers-3.c
===================================================================
--- gcc/testsuite/c-c++-common/conflict-markers-3.c	(revision 0)
+++ gcc/testsuite/c-c++-common/conflict-markers-3.c	(revision 231712)
@@ -0,0 +1,11 @@
+/* Ensure we can handle unterminated conflict markers.  */
+
+int p;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int q;
+
+<<<<<<< HEAD  /* { dg-error "conflict marker" } */
+
+int r;
Index: gcc/cp/ChangeLog
===================================================================
--- gcc/cp/ChangeLog	(revision 231711)
+++ gcc/cp/ChangeLog	(revision 231712)
@@ -1,3 +1,9 @@
+2015-12-16  David Malcolm  <dmalcolm@redhat.com>
+
+	* parser.c (cp_lexer_peek_conflict_marker): New function.
+	(cp_parser_error): Detect conflict markers and report them as
+	such.
+
 2015-12-15  Martin Sebor  <msebor@redhat.com>
 
 	c++/42121
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 231711)
+++ gcc/cp/parser.c	(revision 231712)
@@ -2689,6 +2689,46 @@
   return token->keyword == keyword;
 }
 
+/* Helper function for cp_parser_error.
+   Having peeked a token of kind TOK1_KIND that might signify
+   a conflict marker, peek successor tokens to determine
+   if we actually do have a conflict marker.
+   Specifically, we consider a run of 7 '<', '=' or '>' characters
+   at the start of a line as a conflict marker.
+   These come through the lexer as three pairs and a single,
+   e.g. three CPP_LSHIFT tokens ("<<") and a CPP_LESS token ('<').
+   If it returns true, *OUT_LOC is written to with the location/range
+   of the marker.  */
+
+static bool
+cp_lexer_peek_conflict_marker (cp_lexer *lexer, enum cpp_ttype tok1_kind,
+			       location_t *out_loc)
+{
+  cp_token *token2 = cp_lexer_peek_nth_token (lexer, 2);
+  if (token2->type != tok1_kind)
+    return false;
+  cp_token *token3 = cp_lexer_peek_nth_token (lexer, 3);
+  if (token3->type != tok1_kind)
+    return false;
+  cp_token *token4 = cp_lexer_peek_nth_token (lexer, 4);
+  if (token4->type != conflict_marker_get_final_tok_kind (tok1_kind))
+    return false;
+
+  /* It must be at the start of the line.  */
+  location_t start_loc = cp_lexer_peek_token (lexer)->location;
+  if (LOCATION_COLUMN (start_loc) != 1)
+    return false;
+
+  /* We have a conflict marker.  Construct a location of the form:
+       <<<<<<<
+       ^~~~~~~
+     with start == caret, finishing at the end of the marker.  */
+  location_t finish_loc = get_finish (token4->location);
+  *out_loc = make_location (start_loc, start_loc, finish_loc);
+
+  return true;
+}
+
 /* If not parsing tentatively, issue a diagnostic of the form
       FILE:LINE: MESSAGE before TOKEN
    where TOKEN is the next token in the input stream.  MESSAGE
@@ -2713,6 +2753,19 @@
 	  return;
 	}
 
+      /* If this is actually a conflict marker, report it as such.  */
+      if (token->type == CPP_LSHIFT
+	  || token->type == CPP_RSHIFT
+	  || token->type == CPP_EQ_EQ)
+	{
+	  location_t loc;
+	  if (cp_lexer_peek_conflict_marker (parser->lexer, token->type, &loc))
+	    {
+	      error_at (loc, "version control conflict marker in file");
+	      return;
+	    }
+	}
+
       c_parse_error (gmsgid,
 		     /* Because c_parser_error does not understand
 			CPP_KEYWORD, keywords are treated like

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] Better error recovery for merge-conflict markers (v5)
  2015-12-15 23:52     ` Bernd Schmidt
@ 2015-12-16 18:33       ` David Malcolm
  0 siblings, 0 replies; 12+ messages in thread
From: David Malcolm @ 2015-12-16 18:33 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Joseph Myers, Jason Merrill

On Wed, 2015-12-16 at 00:52 +0100, Bernd Schmidt wrote:
> On 12/15/2015 08:30 PM, David Malcolm wrote:
> 
> > I got thinking about what we'd have to do to support Perforce-style
> > markers, and began to find my token-matching approach to be a little
> > clunky (in conjunction with reading Martin's observations on
> > c_parser_peek_nth_token).
> >
> > Here's a reimplementation of the patch which takes a much simpler
> > approach, and avoids the need to touch the C lexer: check that we're
> > not in a macro expansion and then read in the source line, and
> > textually compare against the various possible conflict markers.
> > This adds the requirement that the source file be readable, so it
> > won't detect conflict markers in a .i file from -save-temps,
> 
> How come? Is source file defined as the one before preprocessing?

Yes, unless you manually strip the #line directives.  So unless the
original source files are still around in the right path relative to
where you're compiling the .i file, the calls to
location_get_source_line will fail.

> And I do think this is an unfortunate limitation (given that we often 
> load .i files into cc1 for debugging and we'd ideally like that to be 
> consistent with normal compilation as much as possible). I'd rather go 
> with the original patch based on this.

(nods)  This can be a pain when debugging diagnostic_show_locus, but
there's not much that can be done about it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-12-16 18:33 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-30 15:02 [PATCH] Better error messages for merge-conflict markers (v3) David Malcolm
2015-11-02 22:52 ` Jeff Law
2015-11-03  4:05   ` Trevor Saunders
2015-11-04 13:56 ` Bernd Schmidt
2015-12-09 16:39   ` [PATCH] Better error recovery for merge-conflict markers (v4) David Malcolm
2015-12-09 17:44     ` Bernd Schmidt
2015-12-09 20:18       ` Jeff Law
2015-12-16 18:23       ` David Malcolm
2015-12-15 19:11   ` [PATCH] Better error recovery for merge-conflict markers (v5) David Malcolm
2015-12-15 23:52     ` Bernd Schmidt
2015-12-16 18:33       ` David Malcolm
2015-12-09 23:50 ` [PATCH] Better error messages for merge-conflict markers (v3) Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).