From: David Malcolm <dmalcolm@redhat.com>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
Date: Thu, 16 Apr 2015 15:08:00 -0000 [thread overview]
Message-ID: <1429196485.32584.46.camel@surprise> (raw)
[-- Attachment #1: Type: text/plain, Size: 4508 bytes --]
Attached is a work-in-progress patch for a new
-Wmisleading-indentation
warning I've been experimenting with, for GCC 6.
It kind-of-works, but there are some issues, so I wanted to get feedback
on it here.
The idea is to issue a warning when the C/C++ compiler's view of the
block structure doesn't match that of a naive human reader of the code
via indentation.
For example, CVE-2014-1266 (aka "goto fail") had:
hashOut.data = hashes + SSL_MD5_DIGEST_LEN;
hashOut.length = SSL_SHA1_DIGEST_LEN;
if ((err = SSLFreeBuffer(&hashCtx)) != 0)
goto fail;
if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
goto fail; /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
goto fail;
err = sslRawVerify(...);
where the second "goto fail;" here:
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
goto fail; /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
is indented as if it's guarded by the conditional. It isn't, making the
real meaning of the code hard to discern, and masking a security
vulnerability (where the stray goto led to the certificate-checking
code later in the function being dead code, skipping it with err == 0,
for success).
The patch successfully issues a warning on a simplified version of this:
if ((err = foo (b)) != 0)
goto fail;
goto fail;
./Wmisleading-indentation-6.c: In function âfooâ:
./Wmisleading-indentation-6.c:15:3: warning: statement is indented as if
it were guarded by... [-Wmisleading-indentation]
goto fail;
^
./Wmisleading-indentation-6.c:13:2: note: ...this clause, but it is not
if ((err = foo (b)) != 0)
^
and on various other testcases.
That said, there are several major issues with the patch as it stands:
(A) works with the C FE, but not with C++ yet. The implementation
assumes a single lexer consuming the token stream, so it doesn't yet
work with the C++ FE's ideas of tentatively-consumed tokens
(save/commit/rollback), and with its lexer stack. I *think* that can be
fixed by relying on the fact that the integer locations of the token
stream are normally monotonic, and using that to capture and handle
those two C++ FE impl details.
(B) GTY/PCH/etc, and performance; I know that it leaks visual_block
instances, and there's probably a way to do this with much less dynamic
memory allocation, but I wanted to get it right before making it fast.
(C) no ChangeLog yet, though it's heavily commented.
(D) tabs vs spaces. This is probably the biggest can of worms.
Consider:
{
if (flagA)
if (flagB)
if (flagC)
foo ();
bar (); /* would like to warn about this */
}
If this is indented using spaces&tabs with 8-space tabs, we have:
<no indent>{
<2 spaces>if (flagA)
<4 spaces>if (flagB)
<6 spaces>if (flagC)
<1 tab>foo ();
<1 tab>bar ();
<no indent>}
What does this look like in an editor? Consider emacs, which uses uses
0-based offsets to what I call "visual columns". The column numbers for
the start of the first token for each line are reported by emacs as:
<0>{
<2>if (flagA)
<4>if (flagB)
<6>if (flagC)
<8>foo ();
<8>bar ();
<0>}
However within libcpp and gcc, in linemap's expanded_location and in
diagnostic messages, the "column" numbers are actually 1-based counts of
*characters*, so the "column" numbers emitted in diagnostics for the
start of the first token in each line are actually:
<1>{
<3>if (flagA)
<5>if (flagB)
<7>if (flagC)
<2>foo ();
<2>bar ();
<1>}
Note the transition at tab offsets, where we go from increasing visual
column numbers:
<6>if (flagC)
<8>foo ();
to a decrease in the character count:
<7>if (flagC)
<2>foo ();
Hence to handle mixed spaces+tabs, we need some way to model "visual
columns". I think this is doable (at the cost of adding a field to
cpp_token, and to c_token/cp_token), provided we can assume, per-buffer,
either
(i) a consistent value for tabs in terms of spaces, or
(ii) that we don't have mixed tabs/spaces in this buffer
An issue here is how to determine (i), or if it's OK to default to 8 and
have a command-line option (param?) to override it? (though what about,
say, each header file?)
Thoughts on this, and on the patch?
Thanks
Dave
[-- Attachment #2: misleading-indentation-v1.patch --]
[-- Type: text/x-patch, Size: 35383 bytes --]
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 30a717e..f71a33d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1462,6 +1462,7 @@ OBJS = \
var-tracking.o \
varasm.o \
varpool.o \
+ visual-parser.o \
vmsdbgout.o \
vtable-verify.o \
web.o \
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7d0a2cd..3795087 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -494,6 +494,10 @@ Wmain
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic, 2, 0)
;
+Wmisleading-indentation
+C C++ Common Var(warn_misleading_indentation) Warning
+Warn when the indentation of the code does not reflect the block structure
+
Wmissing-braces
C ObjC C++ ObjC++ Var(warn_missing_braces) Warning LangEnabledBy(C ObjC,Wall)
Warn about possibly missing braces around initializers
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 2c41bf2..c4858ba 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see
#include "plugin.h"
#include "c-family/c-ada-spec.h"
#include "cilk.h"
+#include "visual-parser.h"
/* In grokdeclarator, distinguish syntactic contexts of declarators. */
enum decl_context
@@ -560,6 +561,10 @@ add_stmt (tree t)
recorded during statement expressions. */
if (!building_stmt_list_p ())
push_stmt_list ();
+
+ if (vis_parser)
+ vis_parser->check_stmt (t);
+
append_to_statement_list_force (t, &cur_stmt_list);
return t;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index d0d35c5..9304e31 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3. If not see
#include "cgraph.h"
#include "plugin.h"
#include "omp-low.h"
+#include "visual-parser.h"
\f
/* Initialization routine for this file. */
@@ -216,7 +217,6 @@ typedef struct GTY(()) c_parser {
vec <c_token, va_gc> *cilk_simd_fn_tokens;
} c_parser;
-
/* The actual parser and external interface. ??? Does this need to be
garbage-collected? */
@@ -236,6 +236,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
token->keyword = RID_MAX;
token->pragma_kind = PRAGMA_NONE;
+ if (vis_parser)
+ vis_parser->on_token (token->location);
+
switch (token->type)
{
case CPP_NAME:
@@ -5042,11 +5045,13 @@ c_parser_paren_condition (c_parser *parser)
/* Parse a statement which is a block in C99. */
static tree
-c_parser_c99_block_statement (c_parser *parser)
+c_parser_c99_block_statement (c_parser *parser, location_t guard_loc)
{
tree block = c_begin_compound_stmt (flag_isoc99);
location_t loc = c_parser_peek_token (parser)->location;
c_parser_statement (parser);
+ if (vis_parser)
+ vis_parser->on_solo_stmt (block, guard_loc);
return c_end_compound_stmt (loc, block, flag_isoc99);
}
@@ -5059,7 +5064,7 @@ c_parser_c99_block_statement (c_parser *parser)
parser->in_if_block. */
static tree
-c_parser_if_body (c_parser *parser, bool *if_p)
+c_parser_if_body (c_parser *parser, bool *if_p, location_t if_loc)
{
tree block = c_begin_compound_stmt (flag_isoc99);
location_t body_loc = c_parser_peek_token (parser)->location;
@@ -5081,7 +5086,11 @@ c_parser_if_body (c_parser *parser, bool *if_p)
else if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
add_stmt (c_parser_compound_statement (parser));
else
- c_parser_statement_after_labels (parser);
+ {
+ c_parser_statement_after_labels (parser);
+ if (vis_parser)
+ vis_parser->on_solo_stmt (block, if_loc);
+ }
return c_end_compound_stmt (body_loc, block, flag_isoc99);
}
@@ -5090,7 +5099,7 @@ c_parser_if_body (c_parser *parser, bool *if_p)
specially for the sake of -Wempty-body warnings. */
static tree
-c_parser_else_body (c_parser *parser)
+c_parser_else_body (c_parser *parser, location_t else_tok_loc)
{
location_t else_loc = c_parser_peek_token (parser)->location;
tree block = c_begin_compound_stmt (flag_isoc99);
@@ -5109,7 +5118,12 @@ c_parser_else_body (c_parser *parser)
c_parser_consume_token (parser);
}
else
- c_parser_statement_after_labels (parser);
+ {
+ c_parser_statement_after_labels (parser);
+ if (vis_parser)
+ vis_parser->on_solo_stmt (block, else_tok_loc);
+ }
+
return c_end_compound_stmt (else_loc, block, flag_isoc99);
}
@@ -5124,7 +5138,7 @@ static void
c_parser_if_statement (c_parser *parser)
{
tree block;
- location_t loc;
+ location_t if_loc, cond_loc;
tree cond;
bool first_if = false;
tree first_body, second_body;
@@ -5132,23 +5146,25 @@ c_parser_if_statement (c_parser *parser)
tree if_stmt;
gcc_assert (c_parser_next_token_is_keyword (parser, RID_IF));
+ if_loc = c_parser_peek_token (parser)->location;
c_parser_consume_token (parser);
block = c_begin_compound_stmt (flag_isoc99);
- loc = c_parser_peek_token (parser)->location;
+ cond_loc = c_parser_peek_token (parser)->location;
cond = c_parser_paren_condition (parser);
in_if_block = parser->in_if_block;
parser->in_if_block = true;
- first_body = c_parser_if_body (parser, &first_if);
+ first_body = c_parser_if_body (parser, &first_if, if_loc);
parser->in_if_block = in_if_block;
if (c_parser_next_token_is_keyword (parser, RID_ELSE))
{
+ location_t else_tok_loc = c_parser_peek_token (parser)->location;
c_parser_consume_token (parser);
- second_body = c_parser_else_body (parser);
+ second_body = c_parser_else_body (parser, else_tok_loc);
}
else
second_body = NULL_TREE;
- c_finish_if_stmt (loc, cond, first_body, second_body, first_if);
- if_stmt = c_end_compound_stmt (loc, block, flag_isoc99);
+ c_finish_if_stmt (cond_loc, cond, first_body, second_body, first_if);
+ if_stmt = c_end_compound_stmt (cond_loc, block, flag_isoc99);
/* If the if statement contains array notations, then we expand them. */
if (flag_cilkplus && contains_array_notation_expr (if_stmt))
@@ -5195,7 +5211,7 @@ c_parser_switch_statement (c_parser *parser)
c_start_case (switch_loc, switch_cond_loc, expr);
save_break = c_break_label;
c_break_label = NULL_TREE;
- body = c_parser_c99_block_statement (parser);
+ body = c_parser_c99_block_statement (parser, switch_loc);
c_finish_case (body);
if (c_break_label)
{
@@ -5220,6 +5236,7 @@ c_parser_while_statement (c_parser *parser, bool ivdep)
tree block, cond, body, save_break, save_cont;
location_t loc;
gcc_assert (c_parser_next_token_is_keyword (parser, RID_WHILE));
+ location_t while_tok_loc = c_parser_peek_token (parser)->location;
c_parser_consume_token (parser);
block = c_begin_compound_stmt (flag_isoc99);
loc = c_parser_peek_token (parser)->location;
@@ -5239,7 +5256,7 @@ c_parser_while_statement (c_parser *parser, bool ivdep)
c_break_label = NULL_TREE;
save_cont = c_cont_label;
c_cont_label = NULL_TREE;
- body = c_parser_c99_block_statement (parser);
+ body = c_parser_c99_block_statement (parser, while_tok_loc);
c_finish_loop (loc, cond, NULL, body, c_break_label, c_cont_label, true);
add_stmt (c_end_compound_stmt (loc, block, flag_isoc99));
c_break_label = save_break;
@@ -5258,6 +5275,7 @@ c_parser_do_statement (c_parser *parser, bool ivdep)
tree block, cond, body, save_break, save_cont, new_break, new_cont;
location_t loc;
gcc_assert (c_parser_next_token_is_keyword (parser, RID_DO));
+ location_t do_tok_loc = c_parser_peek_token (parser)->location;
c_parser_consume_token (parser);
if (c_parser_next_token_is (parser, CPP_SEMICOLON))
warning_at (c_parser_peek_token (parser)->location,
@@ -5269,7 +5287,7 @@ c_parser_do_statement (c_parser *parser, bool ivdep)
c_break_label = NULL_TREE;
save_cont = c_cont_label;
c_cont_label = NULL_TREE;
- body = c_parser_c99_block_statement (parser);
+ body = c_parser_c99_block_statement (parser, do_tok_loc);
c_parser_require_keyword (parser, RID_WHILE, "expected %<while%>");
new_break = c_break_label;
c_break_label = save_break;
@@ -5519,7 +5537,7 @@ c_parser_for_statement (c_parser *parser, bool ivdep)
c_break_label = NULL_TREE;
save_cont = c_cont_label;
c_cont_label = NULL_TREE;
- body = c_parser_c99_block_statement (parser);
+ body = c_parser_c99_block_statement (parser, for_loc);
if (is_foreach_statement)
objc_finish_foreach_loop (loc, object_expression, collection_expression, body, c_break_label, c_cont_label);
else
@@ -11785,7 +11803,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code,
add_stmt (c_end_compound_stmt (here, stmt, true));
}
else
- add_stmt (c_parser_c99_block_statement (parser));
+ add_stmt (c_parser_c99_block_statement (parser, loc));
if (c_cont_label)
{
tree t = build1 (LABEL_EXPR, void_type_node, c_cont_label);
@@ -14011,6 +14029,9 @@ c_parse_file (void)
tparser.tokens = &tparser.tokens_buf[0];
the_parser = &tparser;
+ if (warn_misleading_indentation)
+ vis_parser = new visual_parser ();
+
if (c_parser_peek_token (&tparser)->pragma_kind == PRAGMA_GCC_PCH_PREPROCESS)
c_parser_pragma_pch_preprocess (&tparser);
@@ -14024,6 +14045,10 @@ c_parse_file (void)
using_eh_for_cleanups ();
c_parser_translation_unit (the_parser);
+
+ delete vis_parser;
+ vis_parser = NULL;
+
the_parser = NULL;
}
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 46e2453..ffbc04b 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3. If not see
#include "parser.h"
#include "type-utils.h"
#include "omp-low.h"
+#include "visual-parser.h"
\f
/* The lexer. */
@@ -632,6 +633,11 @@ cp_lexer_new_main (void)
lexer = cp_lexer_alloc ();
+ if (warn_misleading_indentation)
+ vis_parser = new visual_parser ();
+
+ /* FIXME: delete this. */
+
/* Put the first token in the buffer. */
lexer->buffer->quick_push (token);
@@ -1030,6 +1036,9 @@ cp_lexer_consume_token (cp_lexer* lexer)
gcc_assert (token != &eof_token);
gcc_assert (!lexer->in_pragma || token->type != CPP_PRAGMA_EOL);
+ if (vis_parser)
+ vis_parser->on_token (lexer->next_token->location);
+
do
{
lexer->next_token++;
@@ -2009,9 +2018,9 @@ static void cp_parser_declaration_statement
(cp_parser *);
static tree cp_parser_implicitly_scoped_statement
- (cp_parser *, bool *);
+ (cp_parser *, bool *, location_t);
static void cp_parser_already_scoped_statement
- (cp_parser *);
+ (cp_parser *, location_t);
/* Declarations [gram.dcl.dcl] */
@@ -9843,7 +9852,8 @@ cp_parser_selection_statement (cp_parser* parser, bool *if_p)
nested_if = false;
}
else
- cp_parser_implicitly_scoped_statement (parser, &nested_if);
+ cp_parser_implicitly_scoped_statement (parser, &nested_if,
+ token->location);
parser->in_statement = in_statement;
finish_then_clause (statement);
@@ -9853,7 +9863,8 @@ cp_parser_selection_statement (cp_parser* parser, bool *if_p)
RID_ELSE))
{
/* Consume the `else' keyword. */
- cp_lexer_consume_token (parser->lexer);
+ location_t else_tok_loc
+ = cp_lexer_consume_token (parser->lexer)->location;
begin_else_clause (statement);
/* Parse the else-clause. */
if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON))
@@ -9867,7 +9878,8 @@ cp_parser_selection_statement (cp_parser* parser, bool *if_p)
cp_lexer_consume_token (parser->lexer);
}
else
- cp_parser_implicitly_scoped_statement (parser, NULL);
+ cp_parser_implicitly_scoped_statement (parser, NULL,
+ else_tok_loc);
finish_else_clause (statement);
@@ -9907,7 +9919,8 @@ cp_parser_selection_statement (cp_parser* parser, bool *if_p)
in_statement = parser->in_statement;
parser->in_switch_statement_p = true;
parser->in_statement |= IN_SWITCH_STMT;
- cp_parser_implicitly_scoped_statement (parser, NULL);
+ cp_parser_implicitly_scoped_statement (parser, NULL,
+ 0);
parser->in_switch_statement_p = in_switch_statement_p;
parser->in_statement = in_statement;
@@ -10451,6 +10464,7 @@ static tree
cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
{
cp_token *token;
+ location_t tok_loc;
enum rid keyword;
tree statement;
unsigned char in_statement;
@@ -10460,6 +10474,8 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
if (!token)
return error_mark_node;
+ tok_loc = token->location;
+
/* Remember whether or not we are already within an iteration
statement. */
in_statement = parser->in_statement;
@@ -10483,7 +10499,7 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
/* Parse the dependent statement. */
parser->in_statement = IN_ITERATION_STMT;
- cp_parser_already_scoped_statement (parser);
+ cp_parser_already_scoped_statement (parser, tok_loc);
parser->in_statement = in_statement;
/* We're done with the while-statement. */
finish_while_stmt (statement);
@@ -10498,7 +10514,7 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
statement = begin_do_stmt ();
/* Parse the body of the do-statement. */
parser->in_statement = IN_ITERATION_STMT;
- cp_parser_implicitly_scoped_statement (parser, NULL);
+ cp_parser_implicitly_scoped_statement (parser, NULL, 0);
parser->in_statement = in_statement;
finish_do_body (statement);
/* Look for the `while' keyword. */
@@ -10528,7 +10544,7 @@ cp_parser_iteration_statement (cp_parser* parser, bool ivdep)
/* Parse the body of the for-statement. */
parser->in_statement = IN_ITERATION_STMT;
- cp_parser_already_scoped_statement (parser);
+ cp_parser_already_scoped_statement (parser, tok_loc);
parser->in_statement = in_statement;
/* We're done with the for-statement. */
@@ -10773,7 +10789,8 @@ cp_parser_declaration_statement (cp_parser* parser)
Returns the new statement. */
static tree
-cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p)
+cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p,
+ location_t guard_loc)
{
tree statement;
@@ -10799,6 +10816,9 @@ cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p)
cp_parser_statement (parser, NULL_TREE, false, if_p);
/* Finish the dummy compound-statement. */
finish_compound_stmt (statement);
+ /* FIXME. */
+ if (vis_parser)
+ vis_parser->on_solo_stmt (cur_stmt_list, guard_loc);
}
/* Return the statement. */
@@ -10811,11 +10831,15 @@ cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p)
scope. */
static void
-cp_parser_already_scoped_statement (cp_parser* parser)
+cp_parser_already_scoped_statement (cp_parser* parser, location_t guard_loc)
{
/* If the token is a `{', then we must take special action. */
if (cp_lexer_next_token_is_not (parser->lexer, CPP_OPEN_BRACE))
- cp_parser_statement (parser, NULL_TREE, false, NULL);
+ {
+ cp_parser_statement (parser, NULL_TREE, false, NULL);
+ if (vis_parser)
+ vis_parser->on_solo_stmt (cur_stmt_list, guard_loc);
+ }
else
{
/* Avoid calling cp_parser_compound_statement, so that we
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 886fbb8..0d05674 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. If not see
#include "gimplify.h"
#include "bitmap.h"
#include "omp-low.h"
+#include "visual-parser.h"
static bool verify_constant (tree, bool, bool *, bool *);
#define VERIFY_CONSTANT(X) \
@@ -386,6 +387,9 @@ add_stmt (tree t)
STMT_IS_FULL_EXPR_P (t) = stmts_are_full_exprs_p ();
}
+ if (vis_parser)
+ vis_parser->check_stmt (t);
+
/* Add T to the statement-tree. Non-side-effect statements need to be
recorded during statement expressions. */
gcc_checking_assert (!stmt_list_stack->is_empty ());
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-1.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-1.c
new file mode 100644
index 0000000..630ebfe
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-1.c
@@ -0,0 +1,12 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+
+int
+foo (int flag)
+{
+ int x = 4, y = 5;
+ if (flag) /* { dg-message "3: ...this clause, but it is not" } */
+ x = 3;
+ y = 2; /* { dg-warning "statement is indented as if it were guarded by..." } */
+ return x * y;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-10.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-10.c
new file mode 100644
index 0000000..1531662
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-10.c
@@ -0,0 +1,15 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+extern void bar (int);
+
+void foo (int flag)
+{
+ if (flag) /* { dg-message "3: ...this clause, but it is not" } */
+ if (flag / 2)
+ {
+ bar (0);
+ bar (1);
+ }
+ bar (2); /* { dg-warning "statement is indented as if it were guarded by..." } */
+ bar (3);
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-2.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-2.c
new file mode 100644
index 0000000..a22cd11
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-2.c
@@ -0,0 +1,11 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+
+int
+foo (int flag, int x, int y)
+{
+ if (flag) /* { dg-message "3: ...this clause, but it is not" } */
+ x++; y++; /* { dg-warning "statement is indented as if it were guarded by..." } */
+
+ return x * y;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
new file mode 100644
index 0000000..7048225
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-3.c
@@ -0,0 +1,14 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+
+int
+foo (int flag)
+{
+ int x = 4, y = 5;
+ if (flag)
+ x = 3;
+ else /* { dg-message "3: ...this clause, but it is not" } */
+ x = 2;
+ y = 2; /* { dg-warning "statement is indented as if it were guarded by..." } */
+ return x * y;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-4.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-4.c
new file mode 100644
index 0000000..7abf379
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-4.c
@@ -0,0 +1,11 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+
+void
+foo (double *a, double *b, double *c)
+{
+ int i = 0;
+ while (i < 10) /* { dg-message "3: ...this clause, but it is not" } */
+ a[i] = b[i] * c[i];
+ i++; /* { dg-warning "statement is indented as if it were guarded by..." } */
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-5.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-5.c
new file mode 100644
index 0000000..cb6d025
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-5.c
@@ -0,0 +1,11 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+
+void
+foo (double *a, double *b, double *sum, double *prod)
+{
+ int i = 0;
+ for (i = 0; i < 10; i++) /* { dg-output "3: ...this one, but it is not" } */
+ sum[i] = a[i] * b[i];
+ prod[i] = a[i] * b[i]; /* { dg-warning "statement is indented as if it were guarded by..." } */
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-6.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-6.c
new file mode 100644
index 0000000..3133d15
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-6.c
@@ -0,0 +1,22 @@
+/* Based on CVE-2014-1266 aka "goto fail" */
+/* { dg-options "-Wmisleading-indentation" } */
+extern int foo (int);
+
+static int
+goto_fail(int a, int b, int c)
+{
+ int err;
+
+ /* ... */
+ if ((err = foo (a)) != 0)
+ goto fail;
+ if ((err = foo (b)) != 0) /* { dg-message "2: ...this clause, but it is not" } */
+ goto fail;
+ goto fail; /* { dg-warning "statement is indented as if it were guarded by..." } */
+ if ((err = foo (c)) != 0)
+ goto fail;
+ /* ... */
+
+fail:
+ return err;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-7.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-7.c
new file mode 100644
index 0000000..8d1ff00
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-7.c
@@ -0,0 +1,15 @@
+/* { dg-options "-Wmisleading-indentation" } */
+
+extern int bar (int, int);
+
+int foo (int p, int q, int r, int s, int t)
+{
+ if (bar (p, q))
+ {
+ if (p) /* { dg-message "7: ...this clause, but it is not" } */
+ q++; r++; /* { dg-warning "statement is indented as if it were guarded by..." } */
+ s++; /* { dg-warning "statement is indented as if it were guarded by..." } */
+ t++;
+ }
+ return p + q + r + s + t;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-8.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-8.c
new file mode 100644
index 0000000..82c1e6b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-8.c
@@ -0,0 +1,6 @@
+/* { dg-options "-Wmisleading-indentation" } */
+int foo (int a, int b, int c)
+{
+ /* This should *not* be flagged as misleading indentation. */
+ if (a) return b; else return c;
+}
diff --git a/gcc/testsuite/c-c++-common/Wmisleading-indentation-9.c b/gcc/testsuite/c-c++-common/Wmisleading-indentation-9.c
new file mode 100644
index 0000000..e490dec
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wmisleading-indentation-9.c
@@ -0,0 +1,10 @@
+/* { dg-options "-Wmisleading-indentation" } */
+/* { dg-do compile } */
+extern void bar (int);
+
+void foo (int flag)
+{
+ if (flag) /* { dg-message "3: ...this clause, but it is not" } */
+ bar (0);
+ bar (1); /* { dg-warning "statement is indented as if it were guarded by..." } */
+}
diff --git a/gcc/visual-parser.c b/gcc/visual-parser.c
new file mode 100644
index 0000000..6181e1e
--- /dev/null
+++ b/gcc/visual-parser.c
@@ -0,0 +1,255 @@
+/* "Visual parser" for detecting misleading indentation.
+ Copyright (C) 2015 Free Software Foundation, Inc.
+ Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h" /* For rtl.h: needs enum reg_class. */
+#include "tree.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "stor-layout.h"
+#include "varasm.h"
+#include "trans-mem.h"
+#include "langhooks.h"
+#include "input.h"
+#include "cpplib.h"
+#include "timevar.h"
+#include "c-family/c-pragma.h"
+#include "flags.h"
+#include "ggc.h"
+#include "vec.h"
+#include "target.h"
+#include "cgraph.h"
+#include "plugin.h"
+#include "omp-low.h"
+#include "visual-parser.h"
+#include "diagnostic-core.h"
+#include "tree-iterator.h"
+
+/* Ptr to the singleton instance of visual_parser. */
+visual_parser *vis_parser;
+
+/* The ctor for visual_parser. */
+visual_parser::visual_parser ()
+{
+ m_debug = false;
+ m_last_xloc.file = NULL;
+ m_last_xloc.line = 0;
+ m_last_xloc.column = 0;
+}
+
+/* Token-handling. This takes a stream of locations, examining their
+ spelling locations, and calling on_indent, on_line, on_outdent
+ accordingly. */
+
+void
+visual_parser::on_token (location_t loc)
+{
+ if (m_debug)
+ inform (loc, "on_token");
+
+ expanded_location xloc = expand_location (loc);
+ if (xloc.line != m_last_xloc.line)
+ {
+ //inform (loc, "first token on new line");
+ visual_block *curblock = get_stack_top ();
+ if (curblock)
+ {
+ /* FIXME: what about entirely empty lines???
+ Presumably they simply don't get tokens. */
+ int last_indent = curblock->get_column ();
+ if (xloc.column > last_indent)
+ {
+ /* This line starts more indented than any current
+ indentation level; begin a new "visual block". */
+ visual_block *block = new visual_block (loc, xloc.column);
+ m_block_stack.safe_push (block);
+ on_indent (loc);
+ }
+ else if (xloc.column == last_indent)
+ {
+ /* This line is at the same indentation level as before,
+ within the current "visual block". */
+ on_line (loc);
+ }
+ else
+ {
+ /* We have some amount of outdenting; how much? */
+ while (m_block_stack.length ())
+ {
+ m_block_stack.pop ();
+ on_outdent (loc);
+ if (get_stack_top ()->get_column () >= xloc.column)
+ break;
+ }
+ }
+ }
+ else
+ {
+ /* No current indentation level; start one. */
+ visual_block *block = new visual_block (loc, xloc.column);
+ m_block_stack.safe_push (block);
+ on_indent (loc);
+ }
+ }
+ m_last_xloc = xloc;
+}
+
+/* Called when we have a token that's on a new line that's more indented
+ than the token that began the last line. */
+void
+visual_parser::on_indent (location_t loc)
+{
+ if (m_debug)
+ inform (loc, "on_indent");
+}
+
+/* Called when we have a token that's on a new line that's less indented
+ than the token that began the last line. */
+
+void
+visual_parser::on_outdent (location_t loc)
+{
+ if (m_debug)
+ inform (loc, "on_outdent");
+}
+
+/* Called for the first token on a new line that's at the same indentation
+ level as the previous line. */
+void
+visual_parser::on_line (location_t loc)
+{
+ if (m_debug)
+ inform (loc, "on_line");
+ visual_block *curblock = get_stack_top ();
+ curblock->on_line (loc);
+}
+
+/* Called by the C/C++ FE when we have a guarding statement at GUARD_LOC
+ containing BLOCK, where the block wasn't written using braces, like
+ this:
+
+ guard-loc
+ |
+ V
+ if (flag)
+ foo (); <--BLOCK
+
+ so that we can detect followup statements that are within
+ the same "visual block" as the guarded statement, but which
+ aren't logically grouped within the guarding statement, such
+ as:
+
+ if (flag)
+ foo ();
+ bar ();
+
+ In the above, "bar ();" isn't guarded by the "if", but
+ misleading is in the same visual block as "foo ();". */
+
+void
+visual_parser::on_solo_stmt (tree block, location_t guard_loc)
+{
+ /* Mark the most-indented visual block as containing
+ a solo-stmt. */
+ get_stack_top ()->on_solo_stmt (block, guard_loc);
+}
+
+/* See comment above for visual_parser::on_solo_stmt.
+ Mark the most-indented visual block as (supposedly) only
+ containing a solo statement. */
+
+void
+visual_block::on_solo_stmt (tree block, location_t guard_loc)
+{
+ tree_stmt_iterator tsi = tsi_start (block);
+ tree stmt = tsi_stmt (tsi);
+
+ m_loc_solo_stmt = EXPR_LOCATION (stmt);
+ m_loc_guard = guard_loc;
+
+ m_exploc_solo_stmt
+ = expand_location_to_spelling_point (m_loc_solo_stmt);
+ m_exploc_guard
+ = expand_location_to_spelling_point (guard_loc);
+
+ /* If the solo-stmt is on a new line and more indented than
+ the guard location, mark the current visual block (which
+ presumably contains the solo stmt) for checking.
+ Doing this rejects cases such as
+ if (foo) bar (); baz ();
+ where it's not clear whether or not we ought to warn about
+ "baz ();" and hence we don't. */
+ if (m_exploc_solo_stmt.file == m_exploc_guard.file)
+ if (m_exploc_solo_stmt.line > m_exploc_guard.line)
+ if (m_exploc_solo_stmt.column > m_exploc_guard.column)
+ {
+ //inform (m_loc_stmt, "solo statement here");
+ //inform (m_loc_guard, "visually guarded here");
+ m_has_solo_stmt = true;
+ }
+}
+
+/* Check NEW_STMT for misleading indentation.
+ Called when adding NEW_STMT to a statement list.
+
+ If we have a solo statement, and we're in the same
+ visual block, and NEW_STMT is visually after the
+ solo statement, then NEW_STMT is misleadingly indented as
+ if were guarded by the guard, but it isn't.
+ Issue a warning for such a statement. */
+
+void
+visual_parser::check_stmt (tree new_stmt)
+{
+ if (m_debug)
+ inform (EXPR_LOCATION (new_stmt), "check_stmt");
+ get_stack_top ()->check_stmt (new_stmt);
+}
+
+/* See comment above for visual_parser::check_stmt. */
+
+void
+visual_block::check_stmt (tree new_stmt)
+{
+ if (!m_has_solo_stmt)
+ return;
+
+ location_t loc_new_stmt = EXPR_LOCATION (new_stmt);
+ //inform (loc_new_stmt, "checking stmt here");
+
+ expanded_location exploc_new_stmt
+ = expand_location_to_spelling_point (loc_new_stmt);
+
+ if (exploc_new_stmt.file == m_exploc_guard.file)
+ {
+ if (/* Statement is visually after the guarded stmt. */
+ (exploc_new_stmt.line == m_exploc_solo_stmt.line
+ && exploc_new_stmt.column > m_exploc_solo_stmt.column)
+ || (exploc_new_stmt.line > m_exploc_solo_stmt.line))
+ if (warning_at (loc_new_stmt,
+ OPT_Wmisleading_indentation,
+ "statement is indented as if it"
+ " were guarded by..."))
+ inform (m_loc_guard,
+ "...this clause, but it is not");
+ }
+}
diff --git a/gcc/visual-parser.h b/gcc/visual-parser.h
new file mode 100644
index 0000000..295f457
--- /dev/null
+++ b/gcc/visual-parser.h
@@ -0,0 +1,167 @@
+/* "Visual parser" for detecting misleading indentation.
+ Copyright (C) 2015 Free Software Foundation, Inc.
+ Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+/* Forward declaration. */
+class visual_block;
+
+/* A "visual parser" for detecting misleading indentation.
+
+ This is fed three things by the frontend:
+
+ (A) a series of location_t by the frontend's tokenizer,
+ corresponding to the locations of the token stream. It uses
+ this to model a stack of "visual blocks", corresponding to
+ indentation levels within the source code.
+ For example:
+
+ Source VBlock # VBlock stack
+ ------ -------- ------------
+ void 0 [b0]
+ foo (int i) 0 [b0]
+ { 0 [b0]
+ int j; .1 [b0, b1]
+ if (i) .1 [b0, b1]
+ { ..2 [b0, b1, b2]
+ foo (0); ...3 [b0, b1, b2, b3]
+ bar (0); ...3 [b0, b1, b2, b3]
+ } ..2 [b0, b1, b2]
+ else .1 [b0, b1]
+ foo (1); ..4 [b0, b1, b4]
+ bar (1); ..4 [b0, b1, b4] <-- WARNING!
+ } 0 [b0]
+
+ Note how the call to "bar (1);" called out with "WARNING!" is
+ indented as if it's in the same block as the call to "foo (1);",
+ guarded by the "else" (both are in visual block 4), but they are
+ *not* in the same actual block as far as the real frontend
+ (and language standards) see it.
+ The purpose of this class is to issue a warning about this
+ misleading indentation.
+
+ (2) The frontend notifies the class about "solo statements", that
+ is, non-compound statements guarded by control-flow statements,
+ such as "foo (1);", a non-compound statement guarded by the else
+ clause. Misleading indentation can occur in the statement
+ immediately following such a non-compound statement, if the
+ successor statement is indented in the same way
+ (i.e. it is within the same visual block).
+
+ (3) The frontend notifiees the class about statements being added
+ to a statement list. If we have a guarded non-compound statement,
+ the new statements can be checked for misleading indentation.
+
+ Note that we can't simply use statement locations; for example, in:
+ if (flag)
+ x = 1;
+ the "if (flag)"'s location is at the open-paren, and that of the
+ assignment "x = 1;" is at the equals-sign, so any attempt to use
+ statement locations can be fooled by varying the spacing:
+
+ V
+ if (flag)
+ x = 1;
+ ^ apparent indentation relative to conditional
+
+ V
+ if (flag)
+ x = 1;
+ ^ same column as conditional
+
+ V
+ if (flag)
+ x = 1;
+ ^ apparent "outdent" relative to conditional
+
+ Hence we have to use token locations. */
+
+class GTY(()) visual_parser {
+ public:
+ visual_parser ();
+ void on_token (location_t loc);
+ void on_solo_stmt (tree block, location_t guard_loc);
+ void check_stmt (tree stmt);
+
+ private:
+ void on_newline (location_t loc);
+
+ void on_indent (location_t loc);
+ void on_outdent (location_t loc);
+ void on_line (location_t loc);
+
+ visual_block * get_stack_top () const
+ {
+ if (m_block_stack.length ())
+ return m_block_stack[m_block_stack.length () - 1];
+ else
+ return NULL;
+ }
+
+ private:
+ bool m_debug;
+ expanded_location m_last_xloc;
+ /* A stack of indentation levels. */
+ auto_vec<visual_block *> m_block_stack;
+};
+
+/* A visual block: a run of lines with the same initial indentation. */
+
+class visual_block
+{
+ public:
+ visual_block (location_t loc, int column_start)
+ : m_loc (loc),
+ m_loc_last_line (loc),
+ m_column_start (column_start),
+ m_has_solo_stmt (false)
+ {}
+
+ location_t get_location () const { return m_loc; }
+ location_t get_last_location () const { return m_loc_last_line; }
+ int get_column () const { return m_column_start; }
+
+ void on_line (location_t loc) { m_loc_last_line = loc; }
+
+ void on_solo_stmt (tree block, location_t guard_loc);
+ void check_stmt (tree stmt);
+
+ private:
+ location_t m_loc;
+ location_t m_loc_last_line;
+ int m_column_start;
+
+ /* Detection of misleading indentation.
+ If m_has_solo_stmt is true, then this visual
+ block contains a "solo statement" i.e. one within a block
+ created without braces, such as:
+ if (flag) <- guard
+ foo (); <- solo stmt in this visblock
+ Any followup statements that are in the same visual block as
+ "foo ();" are therefore misleadingly indented. */
+ bool m_has_solo_stmt;
+ location_t m_loc_solo_stmt;
+ location_t m_loc_guard;
+ expanded_location m_exploc_solo_stmt;
+ expanded_location m_exploc_guard;
+
+};
+
+/* The singleton instance of the visual_parser, created by the
+ C/C++ frontend if -Wmisleading-indentation is enabled. */
+extern visual_parser *vis_parser;
next reply other threads:[~2015-04-16 15:08 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-16 15:08 David Malcolm [this message]
2015-04-16 17:27 ` Mike Stump
2015-04-21 16:13 ` David Malcolm
2015-04-21 16:40 ` Trevor Saunders
2015-04-21 16:43 ` Manuel López-Ibáñez
2015-04-21 17:05 ` Mike Stump
2015-04-21 18:14 ` Manuel López-Ibáñez
2015-04-21 23:35 ` David Malcolm
2015-04-28 23:50 ` [PATCH 1/3] Implement -Wmisleading-indentation (v4) David Malcolm
2015-04-28 23:52 ` [PATCH 2/3] Fix spurious semicolons David Malcolm
2015-04-29 0:10 ` Jeff Law
2015-04-28 23:53 ` [PATCH 3/3] Fix indentation issues seen by -Wmisleading-indentation David Malcolm
2015-04-29 0:25 ` Jeff Law
2015-04-29 12:35 ` Mikael Morin
2015-05-05 19:44 ` David Malcolm
2015-05-06 11:38 ` Fix logic error in Fortran OpenACC parsing (was: [PATCH 3/3] Fix indentation issues seen by -Wmisleading-indentation) Thomas Schwinge
2015-05-08 11:24 ` Fix logic error in Fortran OpenACC parsing Ilmir Usmanov
2015-07-27 14:37 ` Thomas Schwinge
2015-05-11 21:23 ` [PATCH 1/3] Implement -Wmisleading-indentation (v4) Jeff Law
2015-05-12 21:28 ` David Malcolm
2015-05-12 21:47 ` David Malcolm
2015-08-18 14:13 ` PR67192 (Was: Re: [PATCH 1/3] Implement -Wmisleading-indentation (v4)) Andreas Arnez
2015-04-16 18:29 ` [RFC stage 1] Proposed new warning: -Wmisleading-indentation Manuel López-Ibáñez
2015-04-17 14:42 ` Tom Tromey
2015-04-17 16:12 ` Manuel López-Ibáñez
2015-04-28 23:13 ` Joseph Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1429196485.32584.46.camel@surprise \
--to=dmalcolm@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).