public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
@ 2015-09-22 21:09 ` David Malcolm
  2015-09-25 17:22   ` Jeff Law
  2015-09-22 21:10 ` [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-22 21:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This patch is essentially identical to v1 here:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00729.html
The only change is in the ChangeLog, moving the libgo.exp
ChangeLog entry into gcc/testsuite/ChangeLog, analogous to
where Ian put it when introducing the file in r167407.

OK for trunk?

Blurb from v1 follows:

This patch adds an easy way to write tests for expected multiline
output.  For example we can test carets and underlines for
a particular diagnostic with:

/* { dg-begin-multiline-output "" }
 typedef struct _GMutex GMutex;
                ^~~~~~~
   { dg-end-multiline-output "" } */

It is used extensively by the rest of the patch kit.

multiline.exp is used by prune.exp; hence we need to load it before
prune.exp via *load_gcc_lib* for the testsuites of the various
non-"gcc" support libraries (e.g. boehm-gc).
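
As a rough illustration of how each expected line could be turned into a
pattern, here is a minimal C sketch of the per-line rule.  The real
logic is the Tcl in multiline.exp below; "line_pattern" is a
hypothetical name used only for this example, and the sketch omits the
right-margin ("|") case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: turn one expected line
   into a regex fragment, returned in a static buffer.  */
const char *
line_pattern (const char *line)
{
  static char buf[512];
  /* The same metacharacters that multiline.exp escapes.  */
  static const char meta[] = "^()[].\\?+*|";
  size_t len = strlen (line);
  size_t n = 0;

  /* Worst case: every character escaped, plus ".*" and the NUL.  */
  assert (2 * len + 3 <= sizeof buf);

  for (size_t i = 0; i < len; i++)
    {
      if (strchr (meta, line[i]) != NULL)
        buf[n++] = '\\';
      buf[n++] = line[i];
    }

  /* Caret/underline lines end in '^' or '~' and must match exactly;
     anything else is treated as a quoted source line and may be
     followed by arbitrary text (e.g. a DejaGnu directive).  */
  if (len == 0 || (line[len - 1] != '^' && line[len - 1] != '~'))
    {
      buf[n++] = '.';
      buf[n++] = '*';
    }
  buf[n] = '\0';
  return buf;
}
```

With this sketch, a caret line comes back with only its "^" escaped,
whereas a quoted source line gains a trailing ".*" so that a comment
following the source text does not break the match.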

gcc/testsuite/ChangeLog:
	* lib/multiline.exp: New file.
	* lib/prune.exp: Load multiline.exp.
	(prune_gcc_output): Call into multiline.exp to handle any
	multiline output directives.
	* lib/libgo.exp: Load multiline.exp before prune.exp, using
	load_gcc_lib.

boehm-gc/ChangeLog:
	* testsuite/lib/boehm-gc.exp: Load multiline.exp before
	prune.exp, using load_gcc_lib.

libatomic/ChangeLog:
	* testsuite/lib/libatomic.exp: Load multiline.exp before
	prune.exp, using load_gcc_lib.

libgomp/ChangeLog:
	* testsuite/lib/libgomp.exp: Load multiline.exp before prune.exp,
	using load_gcc_lib.

libitm/ChangeLog:
	* testsuite/lib/libitm.exp: Load multiline.exp before prune.exp,
	using load_gcc_lib.

libvtv/ChangeLog:
	* testsuite/lib/libvtv.exp: Load multiline.exp before prune.exp,
	using load_gcc_lib.
---
 boehm-gc/testsuite/lib/boehm-gc.exp   |   1 +
 gcc/testsuite/lib/multiline.exp       | 241 ++++++++++++++++++++++++++++++++++
 gcc/testsuite/lib/prune.exp           |   5 +
 libatomic/testsuite/lib/libatomic.exp |   1 +
 libgo/testsuite/lib/libgo.exp         |   1 +
 libgomp/testsuite/lib/libgomp.exp     |   1 +
 libitm/testsuite/lib/libitm.exp       |   1 +
 libvtv/testsuite/lib/libvtv.exp       |   1 +
 8 files changed, 252 insertions(+)
 create mode 100644 gcc/testsuite/lib/multiline.exp

diff --git a/boehm-gc/testsuite/lib/boehm-gc.exp b/boehm-gc/testsuite/lib/boehm-gc.exp
index bafe7bb..d162035 100644
--- a/boehm-gc/testsuite/lib/boehm-gc.exp
+++ b/boehm-gc/testsuite/lib/boehm-gc.exp
@@ -31,6 +31,7 @@ load_gcc_lib target-utils.exp
 # For ${tool}_exit.
 load_gcc_lib gcc-defs.exp
 # For prune_gcc_output.
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 
 set dg-do-what-default run
diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp
new file mode 100644
index 0000000..eb72143
--- /dev/null
+++ b/gcc/testsuite/lib/multiline.exp
@@ -0,0 +1,241 @@
+#   Copyright (C) 2015 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Testing of multiline output
+
+# We have pre-existing testcases like this:
+#   |typedef struct _GMutex GMutex; // { dg-message "previously declared here"}
+# (using "|" here to indicate the start of a line),
+# generating output like this:
+#   |gcc/testsuite/g++.dg/diagnostic/wrong-tag-1.C:4:16: note: 'struct _GMutex' was previously declared here
+# where the location of the dg-message determines the expected line at
+# which the error should be reported.
+#
+# To handle rich error-reporting, we want to be able to verify that we
+# get output like this:
+#   |gcc/testsuite/g++.dg/diagnostic/wrong-tag-1.C:4:16: note: 'struct _GMutex' was previously declared here
+#   | typedef struct _GMutex GMutex; // { dg-message "previously declared here"}
+#   |                ^~~~~~~
+# where the compiler's first line of output is as before, but in
+# which it then echoes the source lines, adding annotations.
+#
+# We want to be able to write testcases that verify that the
+# emitted source-and-annotations are sane.
+#
+# A complication here is that the source lines contain comments
+# containing DejaGnu directives (such as the "dg-message" above).
+#
+# We punt this somewhat by only matching the beginnings of lines,
+# so that we can write e.g.
+#   |/* { dg-begin-multiline-output "" }
+#   | typedef struct _GMutex GMutex;
+#   |                ^~~~~~~
+#   |   { dg-end-multiline-output "" } */
+# to have the testsuite verify the expected output.
+
+############################################################################
+# Global variables.  Although global, these are intended to only be used from
+# within multiline.exp.
+############################################################################
+
+# The line number of the last dg-begin-multiline-output directive.
+set _multiline_last_beginning_line -1
+
+# A list of lists of strings.
+set _multiline_expected_outputs []
+
+############################################################################
+# Exported functions.
+############################################################################
+
+# Mark the beginning of an expected multiline output
+# All lines between this and the next dg-end-multiline-output are
+# expected to be seen.
+
+proc dg-begin-multiline-output { args } {
+    global _multiline_last_beginning_line
+    verbose "dg-begin-multiline-output: args: $args" 3
+    set line [expr [lindex $args 0] + 1]
+    set _multiline_last_beginning_line $line
+}
+
+# Mark the end of an expected multiline output
+# All lines up to here since the last dg-begin-multiline-output are
+# expected to be seen.
+
+proc dg-end-multiline-output { args } {
+    global _multiline_last_beginning_line
+    verbose "dg-end-multiline-output: args: $args" 3
+    set line [expr [lindex $args 0] - 1]
+    verbose "multiline output lines: $_multiline_last_beginning_line-$line" 3
+
+    upvar 1 prog prog
+    verbose "prog: $prog" 3
+    # "prog" now contains the filename
+    # Load it and split it into lines
+
+    set lines [_get_lines $prog $_multiline_last_beginning_line $line]
+    set _multiline_last_beginning_line -1
+
+    verbose "lines: $lines" 3
+    global _multiline_expected_outputs
+    lappend _multiline_expected_outputs $lines
+    verbose "within dg-end-multiline-output: _multiline_expected_outputs: $_multiline_expected_outputs" 3
+}
+
+# Hook to be called by prune.exp's prune_gcc_output to
+# look for the expected multiline outputs, pruning them,
+# reporting PASS for those that are found, and FAIL for
+# those that weren't found.
+#
+# It returns a pruned version of its output.
+#
+# It also clears the list of expected multiline outputs.
+
+proc handle-multiline-outputs { text } {
+    global _multiline_expected_outputs
+    set index 0
+    foreach multiline $_multiline_expected_outputs {
+	verbose "  multiline: $multiline" 4
+	set rexp [_build_multiline_regex $multiline $index]
+	verbose "rexp: ${rexp}" 4
+	# Escape newlines in $rexp so that we can print them in
+	# pass/fail results.
+	set escaped_regex [string map {"\n" "\\n"} $rexp]
+	verbose "escaped_regex: ${escaped_regex}" 4
+
+	# Use "regsub" to attempt to prune the pattern from $text
+	if {[regsub -line $rexp $text "" text]} {
+	    # Success; the multiline pattern was pruned.
+	    pass "expected multiline pattern $index was found: \"$escaped_regex\""
+	} else {
+	    fail "expected multiline pattern $index not found: \"$escaped_regex\""
+	}
+
+	set index [expr $index + 1]
+    }
+
+    # Clear the list of expected multiline outputs
+    set _multiline_expected_outputs []
+
+    return $text
+}
+
+############################################################################
+# Internal functions
+############################################################################
+
+# Load FILENAME and extract the lines from FIRST_LINE
+# to LAST_LINE (inclusive) as a list of strings.
+
+proc _get_lines { filename first_line last_line } {
+    verbose "_get_lines" 3
+    verbose "  filename: $filename" 3
+    verbose "  first_line: $first_line" 3
+    verbose "  last_line: $last_line" 3
+
+    set fp [open $filename r]
+    set file_data [read $fp]
+    close $fp
+    set data [split $file_data "\n"]
+    set linenum 1
+    set lines []
+    foreach line $data {
+	verbose "line $linenum: $line" 4
+	if { $linenum >= $first_line && $linenum <= $last_line } {
+	    lappend lines $line
+	}
+	set linenum [expr $linenum + 1]
+    }
+
+    return $lines
+}
+
+# Convert $multiline from a list of strings to a multiline regex
+# We need to support matching arbitrary followup text on each line,
+# to deal with comments containing DejaGnu directives.
+
+proc _build_multiline_regex { multiline index } {
+    verbose "_build_multiline_regex: $multiline $index" 4
+
+    set rexp ""
+    foreach line $multiline {
+	verbose "  line: $line" 4
+
+	# We need to escape "^" and other regexp metacharacters.
+	set line [string map {"^" "\\^"
+	                      "(" "\\("
+	                      ")" "\\)"
+	                      "[" "\\["
+	                      "]" "\\]"
+	                      "." "\\."
+	                      "\\" "\\\\"
+	                      "?" "\\?"
+	                      "+" "\\+"
+	                      "*" "\\*"
+	                      "|" "\\|"} $line]
+
+	append rexp $line
+	if {[string match "*^" $line] || [string match "*~" $line]} {
+	    # Assume a line containing a caret/range.  This must be
+	    # an exact match.
+	} elseif {[string match "*\\|" $line]} {
+	    # Assume a source line with a right-margin.  Support
+	    # arbitrary text in place of any whitespace before the
+	    # right-margin, to deal with comments containing
+	    # DejaGnu directives.
+
+	    # Remove final "\|":
+	    set rexp [string range $rexp 0 [expr [string length $rexp] - 3]]
+
+	    # Trim off trailing whitespace:
+	    set old_length [string length $rexp]
+	    set rexp [string trimright $rexp]
+	    set new_length [string length $rexp]
+
+	    # Replace the trimmed whitespace with "." chars to match anything:
+	    set ws [string repeat "." [expr $old_length - $new_length]]
+	    set rexp "${rexp}${ws}"
+
+	    # Add back the trailing '\|':
+	    set rexp "${rexp}\\|"
+	} else {
+	    # Assume that we have a quoted source line.
+	    # Support arbitrary followup text on each line,
+	    # to deal with comments containing DejaGnu
+	    # directives.
+	    append rexp ".*"
+	}
+	append rexp "\n"
+    }
+
+    # dg.exp's dg-test trims leading whitespace from the output
+    # in this line:
+    #   set comp_output [string trimleft $comp_output]
+    # so we can't rely on the exact leading whitespace for the
+    # first line in the *first* multiline regex.
+    #
+    # Trim leading whitespace from the regexp, replacing it with
+    # a "\s*", to match zero or more whitespace characters.
+    if { $index == 0 } {
+	set rexp [string trimleft $rexp]
+	set rexp "\\s*$rexp"
+    }
+
+    verbose "rexp: $rexp" 4
+
+    return $rexp
+}
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 8e4c203..fa10043 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -16,6 +16,8 @@
 
 # Prune messages from gcc that aren't useful.
 
+load_lib multiline.exp
+
 if ![info exists TEST_ALWAYS_FLAGS] {
     set TEST_ALWAYS_FLAGS ""
 }
@@ -68,6 +70,9 @@ proc prune_gcc_output { text } {
     # Ignore harmless warnings from Xcode 4.0.
     regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind for\[^\n\]*" $text "" text
 
+    # Call into multiline.exp to handle any multiline output directives.
+    set text [handle-multiline-outputs $text]
+
     #send_user "After:$text\n"
 
     return $text
diff --git a/libatomic/testsuite/lib/libatomic.exp b/libatomic/testsuite/lib/libatomic.exp
index 0491c18..cafab54 100644
--- a/libatomic/testsuite/lib/libatomic.exp
+++ b/libatomic/testsuite/lib/libatomic.exp
@@ -37,6 +37,7 @@ load_gcc_lib scandump.exp
 load_gcc_lib scanrtl.exp
 load_gcc_lib scantree.exp
 load_gcc_lib scanipa.exp
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 load_gcc_lib target-libpath.exp
 load_gcc_lib wrapper.exp
diff --git a/libgo/testsuite/lib/libgo.exp b/libgo/testsuite/lib/libgo.exp
index 7031f63..1b0f26a 100644
--- a/libgo/testsuite/lib/libgo.exp
+++ b/libgo/testsuite/lib/libgo.exp
@@ -39,6 +39,7 @@ proc load_gcc_lib { filename } {
     set loaded_libs($filename) ""
 }
 
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 load_gcc_lib target-libpath.exp
 load_gcc_lib wrapper.exp
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index f04b163..1040c29 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -14,6 +14,7 @@ load_lib dg.exp
 # loaded until ${tool}_target_compile is defined since it uses that
 # to determine default LTO options.
 
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 load_gcc_lib target-libpath.exp
 load_gcc_lib wrapper.exp
diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp
index 1361d56..0416296 100644
--- a/libitm/testsuite/lib/libitm.exp
+++ b/libitm/testsuite/lib/libitm.exp
@@ -28,6 +28,7 @@ load_lib dg.exp
 # loaded until ${tool}_target_compile is defined since it uses that
 # to determine default LTO options.
 
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 load_gcc_lib target-libpath.exp
 load_gcc_lib wrapper.exp
diff --git a/libvtv/testsuite/lib/libvtv.exp b/libvtv/testsuite/lib/libvtv.exp
index aefcbd2..edf5fdd 100644
--- a/libvtv/testsuite/lib/libvtv.exp
+++ b/libvtv/testsuite/lib/libvtv.exp
@@ -28,6 +28,7 @@ load_lib dg.exp
 # loaded until ${tool}_target_compile is defined since it uses that
 # to determine default LTO options.
 
+load_gcc_lib multiline.exp
 load_gcc_lib prune.exp
 load_gcc_lib target-libpath.exp
 load_gcc_lib wrapper.exp
-- 
1.8.5.3


* [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
@ 2015-09-22 21:09 David Malcolm
  2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
                   ` (6 more replies)
  0 siblings, 7 replies; 83+ messages in thread
From: David Malcolm @ 2015-09-22 21:09 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is an updated version of this patch kit:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00726.html
It's still at the level of an RFC/work-in-progress; I'm posting for
feedback rather than for formal approval at this time (though the
first two patches are perhaps ready).

For the sake of simplicity, for now I've eliminated anything that
isn't about getting us underlines under expression ranges.  I've also
reduced the scope to just the C frontend.

It captures source ranges for C expressions as they are parsed within
c_expr, and stores them for some trees within GENERIC: in the latter
case only for those that already have a location_t i.e. for all
"compound expressions", but not for e.g. INTEGER_CSTs and VAR_DECLs.
It does this by expanding the ad-hoc lookaside data to include a
source_range (as per Jakub's suggestion).

Doing it this way avoids the need to introduce any new tree nodes, or
to add any fields to any existing tree types.
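
To make the lookaside idea concrete, here is a toy C sketch.  It is not
the libcpp implementation: every "toy_" name, the fixed table size and
the use of the top bit as a marker are assumptions made only for
illustration.  A location value is either a plain location or an index
into a side table holding the caret location plus a range.

```c
#include <assert.h>

typedef unsigned int toy_location;   /* assumed to be 32 bits wide */

struct toy_range { toy_location m_start; toy_location m_finish; };

/* One lookaside entry: the caret location plus its range.  */
struct toy_adhoc_entry { toy_location caret; struct toy_range range; };

#define TOY_ADHOC_BIT  0x80000000u   /* hypothetical marker bit */
#define TOY_TABLE_SIZE 64

static struct toy_adhoc_entry toy_table[TOY_TABLE_SIZE];
static unsigned int toy_count;

struct toy_range
toy_make_range (toy_location start, toy_location finish)
{
  struct toy_range r = { start, finish };
  return r;
}

/* Return a location value that stands for CARET plus RANGE.  */
toy_location
toy_combine (toy_location caret, struct toy_range range)
{
  assert ((caret & TOY_ADHOC_BIT) == 0);
  assert (toy_count < TOY_TABLE_SIZE);
  toy_table[toy_count].caret = caret;
  toy_table[toy_count].range = range;
  return TOY_ADHOC_BIT | toy_count++;
}

/* Recover the caret location from either kind of value.  */
toy_location
toy_caret (toy_location loc)
{
  if (loc & TOY_ADHOC_BIT)
    {
      assert ((loc & ~TOY_ADHOC_BIT) < toy_count);
      return toy_table[loc & ~TOY_ADHOC_BIT].caret;
    }
  return loc;
}

/* Recover the range; a plain location is its own one-point range.  */
struct toy_range
toy_get_range (toy_location loc)
{
  if (loc & TOY_ADHOC_BIT)
    {
      assert ((loc & ~TOY_ADHOC_BIT) < toy_count);
      return toy_table[loc & ~TOY_ADHOC_BIT].range;
    }
  return toy_make_range (loc, loc);
}
```

In this sketch no tree node or token has to grow: callers keep passing
around a single 32-bit value.  The cost is that the toy table only ever
gains entries, one per toy_combine call.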

As in v1 of the kit, the ranges for tokens are stashed into new fields
within the tokens.  I'm thinking for v3 of the kit that that might be
redundant, and that it may be better to stash the token ranges into the
location_t (via the ad-hoc table) immediately as the tokens
are lexed.

The benefit of that approach is (a) the conceptual simplicity that
everything could simply use a location_t, which would become both
a caret location plus a range surrounding it and (b) we'd be able to get
a range from a location_t in the rich_location, and hence (I hope) many
diagnostics would get range underlining "for free".

The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
table ever get smaller, or does it only ever get inserted into?
An idea I had is that we could stash short ranges directly into the
32 bits of location_t, by offsetting the per-column-bits somewhat.
That way short ranges wouldn't need to use the ad-hoc table, and
(I hope) most tokens could use this optimization.
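
A minimal sketch of what such packing might look like, purely to show
the arithmetic.  The 5-bit length field, the macro names and the
function names are all hypothetical; nothing here is implemented in the
patch kit, and the real bit layout of location_t is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the low bits hold (finish - start), the
   remaining high bits hold the start location.  */
#define PACK_LEN_BITS 5u
#define PACK_MAX_LEN  ((1u << PACK_LEN_BITS) - 1u)

/* True if the range is short enough, and START small enough, to be
   stored without a side-table entry.  */
bool
range_fits (uint32_t start, uint32_t finish)
{
  return finish >= start
         && finish - start <= PACK_MAX_LEN
         && (start >> (32u - PACK_LEN_BITS)) == 0;
}

uint32_t
pack_range (uint32_t start, uint32_t finish)
{
  assert (range_fits (start, finish));
  return (start << PACK_LEN_BITS) | (finish - start);
}

uint32_t
packed_start (uint32_t packed)
{
  return packed >> PACK_LEN_BITS;
}

uint32_t
packed_finish (uint32_t packed)
{
  return packed_start (packed) + (packed & PACK_MAX_LEN);
}
```

The trade-off shown here is that every bit given to the length field
halves the number of distinct start locations that fit in 32 bits, and
ranges longer than PACK_MAX_LEN would still need the ad-hoc table.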

My plan is to investigate the impact these patches have on the time
and memory consumption of the compiler, and to get some stats on
whether the location_t packing idea is worth it (and maybe
investigate how big an impact going to 64 bits for location_t would
be).

Bootstrapped and regression-tested; adds 143 PASS results to gcc.sum.
(this v2 patch kit is on top of r227977; v1 was on top of r227562)

Thoughts?

Dave

[BTW, I'm going to be on vacation and away from email from this
Saturday, the 26th through to October 5th]

David Malcolm (5):
  Testsuite: add dg-{begin|end}-multiline-output commands
  Reimplement diagnostic_show_locus, introducing rich_location classes
    (v2)
  Implement token range tracking within libcpp and the C FE (v2)
  Implement tree expression tracking in C FE (v2)
  Add plugin to recursively dump the source-ranges in a tree (v2)

 boehm-gc/testsuite/lib/boehm-gc.exp                |   1 +
 gcc/Makefile.in                                    |   1 +
 gcc/c-family/c-common.c                            |  25 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c-family/c-lex.c                               |   9 +-
 gcc/c-family/c-pragma.h                            |   4 +-
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/c/c-parser.c                                   |  95 ++-
 gcc/c/c-tree.h                                     |  11 +
 gcc/c/c-typeck.c                                   |  10 +
 gcc/cp/error.c                                     |   5 +-
 gcc/cp/parser.c                                    |   3 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic-show-locus.c                        | 700 ++++++++++++++++++++-
 gcc/diagnostic.c                                   | 196 +++++-
 gcc/diagnostic.h                                   |  48 +-
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                |  34 +-
 gcc/gcc-rich-location.c                            |  86 +++
 gcc/gcc-rich-location.h                            |  47 ++
 gcc/genmatch.c                                     |  27 +-
 gcc/gimple.h                                       |   6 +-
 gcc/input.c                                        |   7 +
 gcc/pretty-print.c                                 |  21 +
 gcc/pretty-print.h                                 |  25 +-
 gcc/print-tree.c                                   |  21 +
 gcc/rtl-error.c                                    |   3 +-
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 422 +++++++++++++
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 124 ++++
 .../plugin/diagnostic-test-show-locus-color.c      | 131 ++++
 .../gcc.dg/plugin/diagnostic-test-show-trees-1.c   |  65 ++
 .../gcc.dg/plugin/diagnostic_plugin_show_trees.c   | 174 +++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 285 +++++++++
 .../diagnostic_plugin_test_tree_expression_range.c | 159 +++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   7 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/testsuite/lib/multiline.exp                    | 241 +++++++
 gcc/testsuite/lib/prune.exp                        |   5 +
 gcc/tree-cfg.c                                     |   9 +-
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-inline.c                                  |   5 +-
 gcc/tree-pretty-print.c                            |   2 +-
 gcc/tree.c                                         |  40 +-
 gcc/tree.h                                         |  40 ++
 libatomic/testsuite/lib/libatomic.exp              |   1 +
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   8 +-
 libcpp/include/line-map.h                          | 220 ++++++-
 libcpp/lex.c                                       |  14 +
 libcpp/line-map.c                                  | 156 ++++-
 libgo/testsuite/lib/libgo.exp                      |   1 +
 libgomp/testsuite/lib/libgomp.exp                  |   1 +
 libitm/testsuite/lib/libitm.exp                    |   1 +
 libvtv/testsuite/lib/libvtv.exp                    |   1 +
 57 files changed, 3380 insertions(+), 174 deletions(-)
 create mode 100644 gcc/gcc-rich-location.c
 create mode 100644 gcc/gcc-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
 create mode 100644 gcc/testsuite/lib/multiline.exp

-- 
1.8.5.3


* [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
  2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
  2015-09-22 21:10 ` [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
@ 2015-09-22 21:10 ` David Malcolm
  2015-09-25  9:58   ` Dodji Seketeli
  2015-09-22 21:33 ` [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2) David Malcolm
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-22 21:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is an updated version of:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00736.html

Changes in V2 of the patch:
  * c_lex_with_flags: don't write through the range ptr if it's NULL
  * don't add any fields to the C++ frontend's cp_token for now
  * libcpp/lex.c: prevent usage of stale/uninitialized data in
    _cpp_temp_token and _cpp_lex_direct.

This patch adds source *range* information to libcpp's cpp_token, and to
c_token in the C frontend.

As noted before, to minimize churn, I kept the existing location_t
fields, though in theory these are always just equal to the start of
the source range.

cpplib.h's struct cpp_token had this comment:

  /* A preprocessing token.  This has been carefully packed and should
     occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */

which, like the v1 equivalent, this patch invalidates.

See the cover-letter for this patch kit which describes how we might
go back to using just a location_t, and stashing the range inside the
location_t.  I'm doing it this way for now to allow for more
flexibility as I benchmark and explore implementation options.
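
For readers who want the shape of the change without reading the diff,
here is a toy C sketch of the same pattern: the range start is recorded
when lexing of a token begins, the finish when it ends, and the caller's
range pointer is only written through if it is non-NULL.  The "toy_"
types and the function are hypothetical, columns stand in for real
locations, and treating m_finish as the column of the token's last
character is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int toy_location;   /* here: a 1-based column */

struct toy_range { toy_location m_start; toy_location m_finish; };

struct toy_token
{
  toy_location src_loc;        /* location of first char of token */
  struct toy_range src_range;  /* range covered by the token */
};

static int
is_ident_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Lex one token starting at 1-based column START of LINE: either a
   run of identifier characters or a single other character.  If
   RANGE is non-NULL, also write the token's range back through it.  */
struct toy_token
toy_lex (const char *line, unsigned int start, struct toy_range *range)
{
  struct toy_token tok;
  unsigned int cur = start;   /* column of the next unread character */

  assert (start >= 1 && start <= strlen (line));

  /* The range begins here; initialize m_finish too so that it is
     never left holding stale data.  */
  tok.src_loc = start;
  tok.src_range.m_start = start;
  tok.src_range.m_finish = start;

  if (is_ident_char (line[cur - 1]))
    while (is_ident_char (line[cur - 1]))
      cur++;
  else
    cur++;

  /* The range ends at the last character consumed.  */
  tok.src_range.m_finish = cur - 1;

  if (range)
    *range = tok.src_range;
  return tok;
}
```

Passing NULL for the range mirrors what the C++ frontend does in this
version of the patch: it keeps calling the shared lexer entry point but
opts out of receiving range data.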

gcc/c-family/ChangeLog:
	* c-lex.c (c_lex_with_flags): Add "range" param, and write back
	to *range with the range of the libcpp token if non-NULL.
	* c-pragma.h (c_lex_with_flags): Add "range" param.

gcc/c/ChangeLog:
	* c-parser.c (struct c_token): Add "range" field.
	(c_lex_one_token): Write back to token->range in call to
	c_lex_with_flags.

gcc/cp/ChangeLog:
	* parser.c (cp_lexer_get_preprocessor_token): Update call to
	c_lex_with_flags to pass NULL for range ptr.

libcpp/ChangeLog:
	* include/cpplib.h (struct cpp_token): Add src_range field.
	* lex.c (_cpp_lex_direct): Set up the src_range on the token.
---
 gcc/c-family/c-lex.c    |  9 +++++++--
 gcc/c-family/c-pragma.h |  4 ++--
 gcc/c/c-parser.c        |  6 +++++-
 gcc/cp/parser.c         |  3 ++-
 libcpp/include/cpplib.h |  4 +++-
 libcpp/lex.c            | 14 ++++++++++++++
 6 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 55ceb20..57a626e 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -380,11 +380,14 @@ c_common_has_attribute (cpp_reader *pfile)
 }
 \f
 /* Read a token and return its type.  Fill *VALUE with its value, if
-   applicable.  Fill *CPP_FLAGS with the token's flags, if it is
+   applicable.  Fill *LOC with the source location of the token.
+   If non-NULL, fill *RANGE with the source range of the token.
+   Fill *CPP_FLAGS with the token's flags, if it is
    non-NULL.  */
 
 enum cpp_ttype
-c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
+c_lex_with_flags (tree *value, location_t *loc, source_range *range,
+		  unsigned char *cpp_flags,
 		  int lex_flags)
 {
   static bool no_more_pch;
@@ -397,6 +400,8 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
  retry:
   tok = cpp_get_token_with_location (parse_in, loc);
   type = tok->type;
+  if (range)
+    *range = tok->src_range;
 
  retry_after_at:
   switch (type)
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index f6e1090..3b94e44 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -225,8 +225,8 @@ extern enum cpp_ttype pragma_lex (tree *, location_t *loc = NULL);
 /* This is not actually available to pragma parsers.  It's merely a
    convenient location to declare this function for c-lex, after
    having enum cpp_ttype declared.  */
-extern enum cpp_ttype c_lex_with_flags (tree *, location_t *, unsigned char *,
-					int);
+extern enum cpp_ttype c_lex_with_flags (tree *, location_t *, source_range *,
+					unsigned char *, int);
 
 extern void c_pp_lookup_pragma (unsigned int, const char **, const char **);
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 2fab3f0..5edf563 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -170,6 +170,8 @@ struct GTY (()) c_token {
   ENUM_BITFIELD (pragma_kind) pragma_kind : 8;
   /* The location at which this token was found.  */
   location_t location;
+  /* The source range at which this token was found.  */
+  source_range range;
   /* The value associated with this token, if any.  */
   tree value;
 };
@@ -239,7 +241,9 @@ c_lex_one_token (c_parser *parser, c_token *token)
 {
   timevar_push (TV_LEX);
 
-  token->type = c_lex_with_flags (&token->value, &token->location, NULL,
+  token->type = c_lex_with_flags (&token->value, &token->location,
+				  &token->range,
+				  NULL,
 				  (parser->lex_untranslated_string
 				   ? C_LEX_STRING_NO_TRANSLATE : 0));
   token->id_kind = C_ID_NONE;
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0134189..9423755 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -764,7 +764,8 @@ cp_lexer_get_preprocessor_token (cp_lexer *lexer, cp_token *token)
 
    /* Get a new token from the preprocessor.  */
   token->type
-    = c_lex_with_flags (&token->u.value, &token->location, &token->flags,
+    = c_lex_with_flags (&token->u.value, &token->location,
+                        NULL, &token->flags,
 			lexer == NULL ? 0 : C_LEX_STRING_NO_JOIN);
   token->keyword = RID_MAX;
   token->pragma_kind = PRAGMA_NONE;
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index a2bdfa0..0b1a403 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -235,9 +235,11 @@ struct GTY(()) cpp_identifier {
 };
 
 /* A preprocessing token.  This has been carefully packed and should
-   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
+   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.
+   FIXME: the above comment is no longer true with this patch.  */
 struct GTY(()) cpp_token {
   source_location src_loc;	/* Location of first char of token.  */
+  source_range src_range;	/* Source range covered by the token.  */
   ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
   unsigned short flags;		/* flags - see above */
 
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 0aa1090..a6f16b2 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -2169,6 +2169,8 @@ _cpp_temp_token (cpp_reader *pfile)
 
   result = pfile->cur_token++;
   result->src_loc = old->src_loc;
+  result->src_range.m_start = old->src_loc;
+  result->src_range.m_finish = old->src_loc;
   return result;
 }
 
@@ -2365,6 +2367,13 @@ _cpp_lex_direct (cpp_reader *pfile)
     result->src_loc = linemap_position_for_column (pfile->line_table,
 					  CPP_BUF_COLUMN (buffer, buffer->cur));
 
+  /* The token's src_range begins here.  */
+  result->src_range.m_start = result->src_loc;
+
+  /* Ensure m_finish is also initialized, in case we bail out
+     via a "goto fresh_line;" below.  */
+  result->src_range.m_finish = result->src_loc;
+
   switch (c)
     {
     case ' ': case '\t': case '\f': case '\v': case '\0':
@@ -2723,6 +2732,11 @@ _cpp_lex_direct (cpp_reader *pfile)
       break;
     }
 
+  /* The token's src_range ends here.  */
+  result->src_range.m_finish =
+    linemap_position_for_column (pfile->line_table,
+				 CPP_BUF_COLUMN (buffer, buffer->cur));
+
   return result;
 }
 
-- 
1.8.5.3


* [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
  2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
@ 2015-09-22 21:10 ` David Malcolm
  2015-09-28  8:23   ` Dodji Seketeli
  2015-09-22 21:10 ` [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2) David Malcolm
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-22 21:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This patch adds a test plugin that recurses down an expression tree,
printing diagnostics showing the ranges of each node in the tree.

It corresponds to:
  [PATCH 15/22] Add plugin to recursively dump the source-ranges in a tree
    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00741.html
from v1 of the patch kit.

Changes in v2:
  * the output no longer contains the PARAM_DECL and INTEGER_CST
    leaves since we no longer have range data for them; updated
    the expected output accordingly.
  * slightly updated to eliminate use of SOURCE_RANGE
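
The recursion itself is simple; the following toy C sketch shows the
idea on a hand-built tree for "b * b - c".  It is not the plugin's
code: the node type and function names are hypothetical, ranges are
plain column numbers, and it prints text instead of emitting
diagnostics with underlines.  As in v2, the leaves carry no range and
so produce no output.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a tree node with an optional range.  */
struct toy_node
{
  const char *name;
  int has_range;
  unsigned int start, finish;     /* 1-based columns, inclusive */
  const struct toy_node *lhs, *rhs;
};

/* Print every node under NODE that has a range, indented by DEPTH,
   and return how many nodes were printed.  */
int
toy_show_tree (const struct toy_node *node, int depth)
{
  int shown = 0;

  if (node == NULL)
    return 0;

  if (node->has_range)
    {
      printf ("%*s%s: columns %u-%u\n", depth * 2, "",
              node->name, node->start, node->finish);
      shown = 1;
    }

  return shown
         + toy_show_tree (node->lhs, depth + 1)
         + toy_show_tree (node->rhs, depth + 1);
}

/* Sample tree for the text "b * b - c" (columns 1-9).  */
static const struct toy_node toy_leaf_b = { "b", 0, 0, 0, NULL, NULL };
static const struct toy_node toy_leaf_c = { "c", 0, 0, 0, NULL, NULL };
static const struct toy_node toy_mult
  = { "MULT_EXPR", 1, 1, 5, &toy_leaf_b, &toy_leaf_b };
static const struct toy_node toy_minus
  = { "MINUS_EXPR", 1, 1, 9, &toy_mult, &toy_leaf_c };

const struct toy_node *
toy_sample_tree (void)
{
  return &toy_minus;
}
```

Calling toy_show_tree (toy_sample_tree (), 0) visits the outer
subtraction first and then the multiplication nested inside it, which
is the same outermost-first order as the expected output in the test
case below.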

Updated screenshot:
  https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-show-trees-1.html

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic_plugin_show_trees.c and diagnostic-test-show-trees-1.c.
---
 .../gcc.dg/plugin/diagnostic-test-show-trees-1.c   |  65 ++++++++
 .../gcc.dg/plugin/diagnostic_plugin_show_trees.c   | 174 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   2 +
 3 files changed, 241 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c

diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
new file mode 100644
index 0000000..7473a07
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is an example file for use with
+   diagnostic_plugin_show_trees.c.
+
+   The plugin handles "__show_tree" by recursively dumping
+   the internal structure of the second input argument.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __show_tree (int dummy, ...);
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __show_tree (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a));
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+            ^~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~~~~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      / (2 * a));
+        ~~~^~~~
+   { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
new file mode 100644
index 0000000..5a911c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
@@ -0,0 +1,174 @@
+/* This plugin recursively dumps the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+
+/*
+  Hack: fails with linker error:
+./diagnostic_plugin_show_trees.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
+  since nothing in the tree is using gcc_rich_location::add_expr yet.
+
+  I've tried various workarounds (adding DEBUG_FUNCTION to the
+  method, taking its address), but can't seem to fix it that way.
+  So as a nasty workaround, the following material is copied&pasted
+  from gcc-rich-location.c: */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR. */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* FIXME: end of material taken from gcc-rich-location.c */
+
+int plugin_is_GPL_compatible;
+
+static void
+show_tree (tree node)
+{
+  if (!CAN_HAVE_RANGE_P (node))
+    return;
+
+  gcc_rich_location richloc (EXPR_LOCATION (node));
+  richloc.add_expr (node);
+
+  if (richloc.get_num_locations () < 2)
+    {
+      error_at_rich_loc (&richloc, "range not found");
+      return;
+    }
+
+  enum tree_code code = TREE_CODE (node);
+
+  location_range *range = richloc.get_range (1);
+  inform_at_rich_loc (&richloc,
+		      "%s at range %i:%i-%i:%i",
+		      get_tree_code_name (code),
+		      range->m_start.line,
+		      range->m_start.column,
+		      range->m_finish.line,
+		      range->m_finish.column);
+
+  /* Recurse.  */
+  int min_idx = 0;
+  int max_idx = TREE_OPERAND_LENGTH (node);
+  switch (code)
+    {
+    case CALL_EXPR:
+      min_idx = 3;
+      break;
+
+    default:
+      break;
+    }
+
+  for (int i = min_idx; i < max_idx; i++)
+    show_tree (TREE_OPERAND (node, i));
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__show_tree"))
+    return NULL_TREE;
+
+  /* Get arg 1; print it! */
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+
+  show_tree (arg);
+
+  return NULL_TREE;
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  tree fndecl = (tree)gcc_data;
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index b7efcf5..f1155ee 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -68,6 +68,8 @@ set plugin_test_list [list \
 	  diagnostic-test-show-locus-color.c } \
     { diagnostic_plugin_test_tree_expression_range.c \
 	  diagnostic-test-expressions-1.c } \
+    { diagnostic_plugin_show_trees.c \
+	  diagnostic-test-show-trees-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
-- 
1.8.5.3


* [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
                   ` (2 preceding siblings ...)
  2015-09-22 21:10 ` [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2) David Malcolm
@ 2015-09-22 21:33 ` David Malcolm
  2015-09-25  9:49   ` Dodji Seketeli
  2015-09-22 22:23 ` [PATCH 4/5] Implement tree expression tracking in C FE (v2) David Malcolm
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-22 21:33 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is v2 of this patch; an earlier version was sent as:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00737.html
Thanks for the comments so far.

Changes in v2:
* Added a big descriptive comment for class rich_location in
  libcpp/include/line-map.h.
* Implemented x-offset support for very long lines, emulating the
  behavior of the old printer.
* Eliminated UTF-8/box-drawing and captions.  Captions were cute but
  weren't "fully baked".  Without them, box-drawing isn't really
  needed, and I think I prefer the ASCII look, with the actual
  "caret" character; '~' also makes it easier to count characters
  than a box-drawing line does, in my terminal's font at least.
  Dropping both greatly simplifies the new locus-printing code.
* Bulletproofed the locus-printing, filtering out ranges where
  any part of the location isn't in the same file as the primary
  location.
* Fixed the 'half-open' comment in line-map.h (ranges are closed).
* Eliminated (or postponed) the gcc-rich-location.h/c subclass
  of rich_location, since without captions it's less useful.
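
As a rough illustration of the "bulletproofing" item above, the filtering
can be sketched as follows.  This is a minimal standalone sketch, not the
code in the patch: toy_exploc, toy_location_range and toy_filter_ranges
are hypothetical names, and the three-point layout of a range (start,
finish, caret) is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an expanded location (file/line/column).
struct toy_exploc
{
  std::string file;
  int line;
  int column;
};

// Hypothetical stand-in for a range: three expanded points.
struct toy_location_range
{
  toy_exploc start;
  toy_exploc finish;
  toy_exploc caret;
};

// Keep only those ranges whose start, finish and caret all lie in
// PRIMARY_FILE; a range with any point elsewhere is dropped whole.
inline std::vector<toy_location_range>
toy_filter_ranges (const std::string &primary_file,
                   const std::vector<toy_location_range> &ranges)
{
  std::vector<toy_location_range> kept;
  for (const toy_location_range &r : ranges)
    if (r.start.file == primary_file
        && r.finish.file == primary_file
        && r.caret.file == primary_file)
      kept.push_back (r);
  return kept;
}
```

Dropping the whole range, rather than clamping it, means the printer
never has to reason about a range that straddles two files.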

This patch rewrites diagnostic_show_locus so that it can display
underlined source ranges in addition to the main caret.

It does this by introducing a new "rich_location" class, containing
a location and (potentially) some source ranges.  These are to be
allocated on the stack when generating diagnostics.
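
To make the idea concrete, here is a toy model of a location plus some
ranges, rendered as the caret/underline line that goes under the source
line.  It is only a sketch under stated assumptions: toy_range and
toy_rich_location are hypothetical names, everything is confined to a
single line, and columns are 1-based with closed ranges; the real class
lives in libcpp/include/line-map.h and is more involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a source range within one line.
struct toy_range
{
  int start_col;   // 1-based, inclusive
  int finish_col;  // 1-based, inclusive (ranges are closed)
};

// Toy model: a caret column plus any number of underlined ranges.
class toy_rich_location
{
 public:
  explicit toy_rich_location (int caret_col) : m_caret_col (caret_col)
  {
    assert (caret_col >= 1);
  }

  void add_range (int start_col, int finish_col)
  {
    assert (start_col >= 1 && start_col <= finish_col);
    m_ranges.push_back ({start_col, finish_col});
  }

  // Build the annotation line: '~' under each range, '^' at the caret.
  std::string annotation_line () const
  {
    int width = m_caret_col;
    for (const toy_range &r : m_ranges)
      if (r.finish_col > width)
        width = r.finish_col;

    std::string out (width, ' ');
    for (const toy_range &r : m_ranges)
      for (int col = r.start_col; col <= r.finish_col; col++)
        out[col - 1] = '~';
    out[m_caret_col - 1] = '^';  // the caret wins over any underline
    return out;
  }

 private:
  int m_caret_col;
  std::vector<toy_range> m_ranges;
};
```

A caller would construct a toy_rich_location as a local variable, add
ranges to it, and ask for the annotation line, mirroring the "allocate
on the stack when generating a diagnostic" usage described above.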

The patch reworks various diagnostics machinery to use a
rich_location * rather than a source_location.  The "override_column"
machinery is largely eliminated.

The patch unit-tests the new diagnostic printer using a plugin, which
injects calls to print various diagnostics on some dummy source code,
and verifies the expected output, for both black-and-white and
colored output; screenshots can be seen here:
 https://dmalcolm.fedorapeople.org/gcc/2015-09-21/diagnostic-test-show-locus-bw.html
 https://dmalcolm.fedorapeople.org/gcc/2015-09-21/diagnostic-test-show-locus-color.html

Diagnostics already have a "severity color": errors default to bold red,
warnings to bold magenta, notes to bold cyan.  I chose to use this
severity color when coloring range 0, and after some experiments it
became natural to also use the color for the caret, since this is
notionally part of range 0.  Hence the "caret" color name goes away
from diagnostic-color.c, and we gain two new color names: "range1" and
"range2", for additional ranges.  Based on the discussion in that file
I chose green and blue (both non-bold) for these ranges.

I've tried it with all of the built-in color schemes in GNOME Terminal,
and it seems sane ("Black on light yellow", "Black on white",
"Grey on black", "Green on black", "White on black").

The patch also tweaks diagnostic_report_diagnostic, so that the final
option_text for warnings is colored, using the effective severity, e.g.:
  [-Wformat]
   ^^^^^^^^ colored with the "warning" color
or:
  [-Werror=format]
   ^^^^^^^^^^^^^^ colored with the "error" color
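
A minimal sketch of that coloring, assuming the default severity colors
quoted in diagnostic-color.c (error=01;31, warning=01;35, note=01;36).
The names toy_kind, toy_sgr_codes_for_kind and toy_format_option_text
are hypothetical, and the plain "ESC [ m" reset used here is a
simplification rather than the exact sequence GCC emits.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the diagnostic kinds that matter here.
enum toy_kind { TOY_NOTE, TOY_WARNING, TOY_ERROR };

// SGR parameter string for each severity, per the default GCC_COLORS.
inline const char *
toy_sgr_codes_for_kind (toy_kind kind)
{
  switch (kind)
    {
    case TOY_ERROR:
      return "01;31";  // bold red
    case TOY_WARNING:
      return "01;35";  // bold magenta
    case TOY_NOTE:
      return "01;36";  // bold cyan
    }
  return "";
}

// Wrap OPTION_TEXT in brackets, coloring only the text between them
// with the color of EFFECTIVE_KIND when SHOW_COLOR is set.
inline std::string
toy_format_option_text (toy_kind effective_kind,
                        const std::string &option_text,
                        bool show_color)
{
  assert (!option_text.empty ());
  if (!show_color)
    return "[" + option_text + "]";
  return std::string ("[") + "\33["
         + toy_sgr_codes_for_kind (effective_kind) + "m"
         + option_text + "\33[m" + "]";
}
```

The caller passes the effective severity, so a warning promoted by
-Werror= would be passed as TOY_ERROR and get the "error" color.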

This patch bootstraps and passes regression testing, and I believe it
implements everything the old implementation did.

Other questions and notes:

* The Fortran frontend has its own logic for printing multiple
locations, repeatedly calling in to diagnostic_print_caret_line.
I hope the new printing logic is suitable for use by Fortran, but I
wanted to keep the job of "introducing range-capable printing logic"
separate from that of "updating Fortran diagnostics to use it",
since I'm not very familiar with Fortran or with what is desirable
there.  Hence, to faithfully preserve the existing behavior, I
introduced a flag into the diagnostic_context:
  "frontend_calls_diagnostic_print_caret_line_p"
which is set by the Fortran frontend, and makes diagnostic_show_locus
use the existing printing logic.  Hopefully that's acceptable,
say, as a migration path.

* I tried dropping the "_at_rich_loc" suffix from "warning_at_rich_loc"
etc, and just having an overload (e.g. of "warning_at"), but we then
run into lots of calls of the form
  warning_at (0,
where the "0" means "UNKNOWN_LOCATION", and these would become
compilation errors, due to ambiguity of the overload (0 as a location_t
vs a NULL rich_location *).
The call sites of the above form could be changed to explicitly use
UNKNOWN_LOCATION instead of 0, if desired.
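
The ambiguity can be reproduced in isolation.  The following is a
standalone sketch, not GCC code: toy_location_t, toy_rich_location,
TOY_UNKNOWN_LOCATION and toy_warning_at are hypothetical stand-ins,
with the location type assumed to be an unsigned integer.

```cpp
#include <cassert>

// Hypothetical stand-ins for location_t and rich_location.
typedef unsigned int toy_location_t;
struct toy_rich_location { toy_location_t loc; };
const toy_location_t TOY_UNKNOWN_LOCATION = 0;

enum which_overload { TOOK_LOCATION, TOOK_RICH_LOCATION };

// Two overloads sharing one name, as in the experiment described above.
inline which_overload
toy_warning_at (toy_location_t)
{
  return TOOK_LOCATION;
}

inline which_overload
toy_warning_at (toy_rich_location *)
{
  return TOOK_RICH_LOCATION;
}

// toy_warning_at (0);
//   Does not compile: the literal 0 is an int, so it needs an integral
//   conversion to reach toy_location_t and a null pointer conversion to
//   reach toy_rich_location *.  Both have the same rank, hence the call
//   is ambiguous.

// Spelling out the unknown location, or passing a real pointer, gives
// an exact match and resolves cleanly.
inline void
toy_check_overloads ()
{
  toy_rich_location richloc = { 0 };
  assert (toy_warning_at (TOY_UNKNOWN_LOCATION) == TOOK_LOCATION);
  assert (toy_warning_at (&richloc) == TOOK_RICH_LOCATION);
}
```

This is why changing the call sites to use UNKNOWN_LOCATION would be
enough to make the overload approach workable.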

OK for trunk?

gcc/ChangeLog:
	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
	and "range2".
	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
	* diagnostic-core.h (warning_at_rich_loc): New declaration.
	(error_at_rich_loc): New declaration.
	(permerror_at_rich_loc): New declaration.
	(inform_at_rich_loc): New declaration.
	* diagnostic-show-locus.c (class colorizer): New class.
	(class layout_point): New class.
	(class layout_range): New class.
	(class layout): New class.
	(colorizer::colorizer): New ctor.
	(colorizer::~colorizer): New dtor.
	(colorizer::set_state): New method.
	(colorizer::begin_state): New method.
	(colorizer::finish_state): New method.
	(layout_range::layout_range): New ctor.
	(layout_range::contains_point): New method.
	(get_line_width_without_trailing_whitespace): New function.
	(layout::layout): New ctor.
	(layout::print_line): New method.
	(layout::get_state_at_point): New method.
	(layout::get_x_bound_for_row): New method.
	(show_ruler): New function.
	(diagnostic_show_locus): Call new function diagnostic_print_ranges,
	falling back to diagnostic_print_caret_line if the frontend has
	set frontend_calls_diagnostic_print_caret_line_p on the
	diagnostic_context.
	(diagnostic_print_ranges): New function.
	* diagnostic.c (diagnostic_initialize): Replace
	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
	(diagnostic_set_info_translated): Convert param from location_t
	to rich_location *.  Eliminate calls to set_location on the
	message in favor of storing the rich_location ptr there.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_build_prefix): Break out array into...
	(diagnostic_kind_color): New variable.
	(diagnostic_get_color_for_kind): New function.
	(diagnostic_report_diagnostic): Colorize the option_text
	using the color for the severity.
	(diagnostic_append_note): Update for change in signature of
	diagnostic_set_info.
	(diagnostic_append_note_at_rich_loc): New function.
	(emit_diagnostic): Update for change in signature of
	diagnostic_set_info.
	(inform): Likewise.
	(inform_at_rich_loc): New function.
	(inform_n): Update for change in signature of diagnostic_set_info.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_at_rich_loc): New function.
	(warning_n): Update for change in signature of diagnostic_set_info.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(permerror_at_rich_loc): New function.
	(error): Update for change in signature of diagnostic_set_info.
	(error_n): Likewise.
	(error_at): Likewise.
	(error_at_rich_loc): New function.
	(sorry): Update for change in signature of diagnostic_set_info.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(source_range::debug): New function.
	* diagnostic.h (struct diagnostic_info): Eliminate field
	"override_column".  Add field "richloc".
	(struct diagnostic_context): Convert MAX_LOCATIONS_PER_MESSAGE to
	rich_location::MAX_RANGES.  Add field
	"frontend_calls_diagnostic_print_caret_line_p".
	(diagnostic_override_column): Eliminate this macro.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_set_info_translated): Likewise.
	(diagnostic_append_note_at_rich_loc): New function.
	(diagnostic_num_locations): New function.
	(diagnostic_expand_location): Get the location from the
	rich_location.
	(diagnostic_get_color_for_kind): New declaration.
	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
	(error_cb): Update for change in signature of "error" callback.
	(fatal_at): Likewise.
	(warning_at): Likewise.
	* input.c (linemap_client_expand_location_to_spelling_point): New.
	* pretty-print.c (text_info::set_range): New method.
	(text_info::get_location): New method.
	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
	(struct text_info): Eliminate "locations" array in favor of
	"m_richloc", a rich_location *.
	(textinfo::set_location): Add a "caret_p" param, and reimplement
	in terms of a call to set_range.
	(textinfo::get_location): Eliminate inline implementation in favor of
	an out-of-line reimplementation.
	(textinfo::set_range): New method.
	* rtl-error.c (diagnostic_for_asm): Update for change in signature
	of diagnostic_set_info.
	* tree-diagnostic.c (default_tree_printer): Update for new
	"caret_p" param for textinfo::set_location.
	* tree-pretty-print.c (percent_K_format): Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter and
	the call to diagnostic_override_column.
	Update the "done_lexing" clause to set range 0
	on the rich_location, rather than overwriting a location_t.
	* c-common.h (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Update for change in signature
	of diagnostic_set_info.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.
	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
	for textinfo::set_location.

gcc/cp/ChangeLog:
	* error.c (cp_printer): Update for new "caret_p" param for
	textinfo::set_location.
	(pedwarn_cxx98): Update for change in signature of
	diagnostic_set_info.

gcc/fortran/ChangeLog:
	* cpp.c (cb_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.
	* error.c (gfc_warning): Update for change in signature of
	diagnostic_set_info.
	(gfc_format_decoder): Update handling of %C/%L for changes
	to struct text_info.
	(gfc_diagnostic_starter): Use richloc when determining whether to
	print one locus or two.
	(gfc_warning_now_at): Update for change in signature of
	diagnostic_set_info.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.
	(gfc_diagnostics_init): Set
	frontend_calls_diagnostic_print_caret_line_p.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
	* lib/gcc-dg.exp: Load multiline.exp.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Update for change in signature
	of "error" callback.
	(cpp_diagnostic_with_line): Likewise, calling override_column
	on the rich_location.
	* include/cpplib.h (struct cpp_callbacks): Within "error"
	callback, convert param from source_location to rich_location *,
	and drop column_override param.
	* include/line-map.h (struct source_range): New struct.
	(struct location_range): New struct.
	(class rich_location): New class.
	(linemap_client_expand_location_to_spelling_point): New declaration.
	* line-map.c (rich_location::rich_location): New ctors.
	(rich_location::lazily_expand_location): New method.
	(rich_location::override_column): New method.
	(rich_location::add_range): New methods.
	(rich_location::set_range): New method.
---
 gcc/c-family/c-common.c                            |  15 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/cp/error.c                                     |   5 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic-show-locus.c                        | 700 ++++++++++++++++++++-
 gcc/diagnostic.c                                   | 196 +++++-
 gcc/diagnostic.h                                   |  48 +-
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                |  34 +-
 gcc/genmatch.c                                     |  27 +-
 gcc/input.c                                        |   7 +
 gcc/pretty-print.c                                 |  21 +
 gcc/pretty-print.h                                 |  25 +-
 gcc/rtl-error.c                                    |   3 +-
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 124 ++++
 .../plugin/diagnostic-test-show-locus-color.c      | 131 ++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 285 +++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   3 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-pretty-print.c                            |   2 +-
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   4 +-
 libcpp/include/line-map.h                          | 207 ++++++
 libcpp/line-map.c                                  | 130 ++++
 29 files changed, 1890 insertions(+), 134 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b922bf..ded23d3 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10451,15 +10451,14 @@ c_option_controlling_cpp_error (int reason)
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION unless this is after lexing and the compiler's location
-   should be used instead, with column number possibly overridden by
-   COLUMN_OVERRIDE if not zero; MSG is the translated message and AP
+   RICHLOC unless this is after lexing and the compiler's location
+   should be used instead; MSG is the translated message and AP
    the arguments.  Returns true if a diagnostic was emitted, false
    otherwise.  */
 
 bool
 c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	     location_t location, unsigned int column_override,
+	     rich_location *richloc,
 	     const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -10500,11 +10499,11 @@ c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   if (done_lexing)
-    location = input_location;
+    richloc->set_range (0,
+			source_range::from_location (input_location),
+			true, true);
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   diagnostic_override_option_index (&diagnostic,
                                     c_option_controlling_cpp_error (reason));
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 74d1bc1..bb17fcc 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -981,9 +981,9 @@ extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);
 extern void c_cpp_builtins_optimize_pragma (cpp_reader *, tree, tree);
-extern bool c_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+extern bool c_cpp_error (cpp_reader *, int, int, rich_location *,
 			 const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 extern int c_common_has_attribute (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index a110226..9af447c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5285,9 +5285,10 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
                        flag_isoc99 ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   report_diagnostic (&diagnostic);
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index e5fbf05..0f8b933 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,13 +42,14 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
      than -pedantic.  */
   if (warn_c99_c11_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc11)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc99_c11_compat;
@@ -60,7 +61,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C11, issue a pedwarn.  */
   else if (pedantic && !flag_isoc11)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       warned = report_diagnostic (&diagnostic);
     }
@@ -80,6 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
@@ -90,7 +92,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
         goto out;
       else if (opt_var > 0)
 	{
-	  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+	  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			       (pedantic && !flag_isoc99)
 			       ? DK_PEDWARN : DK_WARNING);
 	  diagnostic.option_index = opt;
@@ -102,7 +104,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
      specific than -pedantic.  */
   if (warn_c90_c99_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc99)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc90_c99_compat;
@@ -114,7 +116,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C99, issue a pedwarn.  */
   else if (pedantic && !flag_isoc99)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       report_diagnostic (&diagnostic);
     }
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 47fd7de..1e601f9 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -101,7 +101,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     {
       t = va_arg (*text->args_ptr, tree);
       if (set_locus)
-	text->set_location (0, DECL_SOURCE_LOCATION (t));
+	text->set_location (0, DECL_SOURCE_LOCATION (t), true);
     }
 
   switch (*spec)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index faf8744..19ca8c3 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3554,7 +3554,7 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 
   pp_string (pp, result);
   if (set_locus && t != NULL)
-    text->set_location (0, location_of (t));
+    text->set_location (0, location_of (t), true);
   return true;
 #undef next_tree
 #undef next_tcode
@@ -3668,9 +3668,10 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 		       (cxx_dialect == cxx98) ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index 3fe49b2..d848dfc 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -164,7 +164,8 @@ static struct color_cap color_dict[] =
   { "warning", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_MAGENTA),
 	       7, false },
   { "note", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_CYAN), 4, false },
-  { "caret", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "range1", SGR_SEQ (COLOR_FG_GREEN), 6, false },
+  { "range2", SGR_SEQ (COLOR_FG_BLUE), 6, false },
   { "locus", SGR_SEQ (COLOR_BOLD), 5, false },
   { "quote", SGR_SEQ (COLOR_BOLD), 5, false },
   { NULL, NULL, 0, false }
@@ -195,7 +196,7 @@ colorize_stop (bool show_color)
 }
 
 /* Parse GCC_COLORS.  The default would look like:
-   GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
+   GCC_COLORS='error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:quote=01'
    No character escaping is needed or supported.  */
 static bool
 parse_gcc_colors (void)
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 66d2e42..a8a7c37 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -63,18 +63,26 @@ extern bool warning_n (location_t, int, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
+extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+    ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void error_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void error_at_rich_loc (rich_location *, const char *, ...)
+  ATTRIBUTE_GCC_DIAG(2,3);
 extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
      ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the second parameter.  */
 extern bool pedwarn (location_t, int, const char *, ...)
      ATTRIBUTE_GCC_DIAG(3,4);
 extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern bool permerror_at_rich_loc (rich_location *, const char *,
+				   ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform_at_rich_loc (rich_location *, const char *,
+				...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 147a2b8..985c5a5 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,6 +36,13 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset);
+
+static void
+diagnostic_print_ranges (diagnostic_context * context,
+			 const diagnostic_info *diagnostic);
+
 /* If LINE is longer than MAX_WIDTH, and COLUMN is not smaller than
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
@@ -60,11 +67,639 @@ adjust_line (const char *line, int line_width,
   return line;
 }
 
-/* Print the physical source line corresponding to the location of
-   this diagnostic, and a caret indicating the precise column.  This
-   function only prints two caret characters if the two locations
-   given by DIAGNOSTIC are on the same line according to
-   diagnostic_same_line().  */
+/* Classes for rendering source code and diagnostics, within an
+   anonymous namespace.
+   The work is done by "class layout", which embeds and uses
+   "class colorizer" and "class layout_range" to get things done.  */
+
+namespace {
+
+/* A class to inject colorization codes when printing the diagnostic locus,
+   tracking state as it goes.  */
+
+class colorizer
+{
+ public:
+  colorizer (diagnostic_context *context,
+	     const diagnostic_info *diagnostic);
+  ~colorizer ();
+
+  void set_range (int range_idx) { set_state (range_idx); }
+  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }
+
+ private:
+  void set_state (int state);
+  void begin_state (int state);
+  void finish_state (int state);
+
+ private:
+  static const int STATE_NORMAL_TEXT = -1;
+
+  diagnostic_context *m_context;
+  const diagnostic_info *m_diagnostic;
+  int m_current_state;
+  const char *m_caret_cs;
+  const char *m_caret_ce;
+  const char *m_range1_cs;
+  const char *m_range2_cs;
+  const char *m_range_ce;
+};
+
+/* A point within a layout_range; similar to an expanded_location,
+   but after filtering on file.  */
+
+class layout_point
+{
+ public:
+  layout_point (const expanded_location &exploc)
+  : m_line (exploc.line),
+    m_column (exploc.column) {}
+
+  int m_line;
+  int m_column;
+};
+
+/* A class for use by "class layout" below: a filtered location_range.  */
+
+class layout_range
+{
+ public:
+  layout_range (const location_range *loc_range);
+
+  bool contains_point (int row, int column) const;
+
+  layout_point m_start;
+  layout_point m_finish;
+  bool m_show_caret_p;
+  layout_point m_caret;
+};
+
+/* A class to control the overall layout when printing a diagnostic.
+
+   The layout is determined within the constructor.
+   It is then printed by repeatedly calling the "print_line" method.
+   Each such call can print two lines: one for the source line itself,
+   and potentially an "annotation" line, containing carets/underlines.
+
+   We assume we have disjoint ranges.  */
+
+class layout
+{
+ public:
+  layout (diagnostic_context *context,
+	  const diagnostic_info *diagnostic);
+
+  int get_first_line () const { return m_first_line; }
+  int get_last_line () const { return m_last_line; }
+
+  void print_line (int row);
+
+ private:
+  bool
+  get_state_at_point (/* Inputs.  */
+		      int row, int column,
+		      int first_non_ws, int last_non_ws,
+		      /* Outputs.  */
+		      int *out_range_idx,
+		      bool *out_draw_caret_p);
+
+  int
+  get_x_bound_for_row (int row, int caret_column,
+		       int last_non_ws);
+
+ private:
+  diagnostic_context *m_context;
+  pretty_printer *m_pp;
+  diagnostic_t m_diagnostic_kind;
+  expanded_location m_exploc;
+  colorizer m_colorizer;
+  auto_vec <layout_range> m_layout_ranges;
+  int m_first_line;
+  int m_last_line;
+  int m_x_offset;
+};
+
+/* Implementation of "class colorizer".  */
+
+/* The constructor for "colorizer".  Look up and store color codes for the
+   different kinds of things we might need to print.  */
+
+colorizer::colorizer (diagnostic_context *context,
+		      const diagnostic_info *diagnostic) :
+  m_context (context),
+  m_diagnostic (diagnostic),
+  m_current_state (STATE_NORMAL_TEXT)
+{
+  m_caret_ce = colorize_stop (pp_show_color (context->printer));
+  m_range1_cs = colorize_start (pp_show_color (context->printer), "range1");
+  m_range2_cs = colorize_start (pp_show_color (context->printer), "range2");
+  m_range_ce = colorize_stop (pp_show_color (context->printer));
+}
+
+/* The destructor for "colorizer".  If colorization is on, print a code to
+   turn it off.  */
+
+colorizer::~colorizer ()
+{
+  finish_state (m_current_state);
+}
+
+/* Update state, printing color codes as necessary if the state
+   changes.  */
+
+void
+colorizer::set_state (int new_state)
+{
+  if (m_current_state != new_state)
+    {
+      finish_state (m_current_state);
+      m_current_state = new_state;
+      begin_state (new_state);
+    }
+}
+
+/* Turn on any colorization for STATE.  */
+
+void
+colorizer::begin_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      /* Make range 0 be the same color as the "kind" text
+	 (error vs warning vs note).  */
+      pp_string
+	(m_context->printer,
+	 colorize_start (pp_show_color (m_context->printer),
+			 diagnostic_get_color_for_kind (m_diagnostic->kind)));
+      break;
+
+    case 1:
+      pp_string (m_context->printer, m_range1_cs);
+      break;
+
+    case 2:
+      pp_string (m_context->printer, m_range2_cs);
+      break;
+
+    default:
+      /* We don't expect more than 3 ranges per diagnostic.  */
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Turn off any colorization for STATE.  */
+
+void
+colorizer::finish_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      pp_string (m_context->printer, m_caret_ce);
+      break;
+
+    default:
+      /* Within a range.  */
+      gcc_assert (state > 0);
+      pp_string (m_context->printer, m_range_ce);
+      break;
+    }
+}
+
+/* Implementation of class layout_range.  */
+
+/* The constructor for class layout_range.
+   Initialize various layout_point fields from expanded_location
+   equivalents; we've already filtered on file.  */
+
+layout_range::layout_range (const location_range *loc_range)
+: m_start (loc_range->m_start),
+  m_finish (loc_range->m_finish),
+  m_show_caret_p (loc_range->m_show_caret_p),
+  m_caret (loc_range->m_caret)
+{
+}
+
+/* Is (column, row) within the given range?
+   We've already filtered on the file.
+
+   Ranges are closed (both limits are within the range).
+
+   Example A: a single-line range:
+     start:  (col=22, line=2)
+     finish: (col=38, line=2)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbSwwwwwwwwwwwwwwwFaaaaaaaaaaa
+03|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+   Example B: a multiline range with
+     start:  (col=14, line=3)
+     finish: (col=08, line=5)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+03|bbbbbbbbbbbSwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+04|wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+05|wwwwwFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+06|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+--+-----------------------------------------------
+
+   Legend:
+   - 'b' indicates a point *before* the range
+   - 'S' indicates the start of the range
+   - 'w' indicates a point within the range
+   - 'F' indicates the finish of the range (which is
+	 within it).
+   - 'a' indicates a subsequent point *after* the range.  */
+
+bool
+layout_range::contains_point (int row, int column) const
+{
+  gcc_assert (m_start.m_line <= m_finish.m_line);
+  /* ...but the equivalent isn't true for the columns;
+     consider example B in the comment above.  */
+
+  if (row < m_start.m_line)
+    /* Points before the first line of the range are
+       outside it (corresponding to line 01 in example A
+       and lines 01 and 02 in example B above).  */
+    return false;
+
+  if (row == m_start.m_line)
+    /* On same line as start of range (corresponding
+       to line 02 in example A and line 03 in example B).  */
+    {
+      if (column < m_start.m_column)
+	/* Points on the starting line of the range, but
+	   before the column in which it begins.  */
+	return false;
+
+      if (row < m_finish.m_line)
+	/* This is a multiline range; the point
+	   is within it (corresponds to line 03 in example B
+	   from column 14 onwards).  */
+	return true;
+      else
+	{
+	  /* This is a single-line range.  */
+	  gcc_assert (row == m_finish.m_line);
+	  return column <= m_finish.m_column;
+	}
+    }
+
+  /* The point is in a line beyond that containing the
+     start of the range: lines 03 onwards in example A,
+     and lines 04 onwards in example B.  */
+  gcc_assert (row > m_start.m_line);
+
+  if (row > m_finish.m_line)
+    /* The point is beyond the final line of the range
+       (lines 03 onwards in example A, and lines 06 onwards
+       in example B).  */
+    return false;
+
+  if (row < m_finish.m_line)
+    {
+      /* The point is in a line that's fully within a multiline
+	 range (e.g. line 04 in example B).  */
+      gcc_assert (m_start.m_line < m_finish.m_line);
+      return true;
+    }
+
+  gcc_assert (row == m_finish.m_line);
+
+  return column <= m_finish.m_column;
+}
+
+/* Given a source line LINE of length LINE_WIDTH, determine the width
+   without any trailing whitespace.  */
+
+static int
+get_line_width_without_trailing_whitespace (const char *line, int line_width)
+{
+  int result = line_width;
+  while (result > 0)
+    {
+      char ch = line[result - 1];
+      if (ch == ' ' || ch == '\t')
+	result--;
+      else
+	break;
+    }
+  gcc_assert (result >= 0);
+  gcc_assert (result <= line_width);
+  gcc_assert (result == 0 ||
+	      (line[result - 1] != ' '
+	       && line[result - 1] != '\t'));
+  return result;
+}
+
+/* Implementation of class layout.  */
+
+/* Constructor for class layout.
+
+   Filter the ranges from the rich_location to those that we can
+   sanely print, populating m_layout_ranges.
+   Determine the range of lines that we will print.
+   Determine m_x_offset, to ensure that the primary caret
+   will fit within the max_width provided by the diagnostic_context.  */
+
+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)
+: m_context (context),
+  m_pp (context->printer),
+  m_diagnostic_kind (diagnostic->kind),
+  m_exploc (diagnostic->richloc->lazily_expand_location ()),
+  m_colorizer (context, diagnostic),
+  m_layout_ranges (rich_location::MAX_RANGES),
+  m_first_line (m_exploc.line),
+  m_last_line  (m_exploc.line),
+  m_x_offset (0)
+{
+  rich_location *richloc = diagnostic->richloc;
+  for (unsigned int idx = 0; idx < richloc->get_num_locations (); idx++)
+    {
+      /* This diagnostic printer can only cope with "sufficiently sane" ranges.
+	 Ignore any ranges that are awkward to handle.  */
+      location_range *loc_range = richloc->get_range (idx);
+
+      /* If any part of the range isn't in the same file as the primary
+	 location of this diagnostic, ignore the range.  */
+      if (loc_range->m_start.file != m_exploc.file)
+	continue;
+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_caret.file != m_exploc.file)
+	  continue;
+
+      /* Passed all the tests; add the range to m_layout_ranges so that
+	 it will be printed.  */
+      layout_range ri (loc_range);
+      m_layout_ranges.safe_push (ri);
+
+      /* Update m_first_line/m_last_line if necessary.  */
+      if (loc_range->m_start.line < m_first_line)
+	m_first_line = loc_range->m_start.line;
+      if (loc_range->m_finish.line > m_last_line)
+	m_last_line = loc_range->m_finish.line;
+    }
+
+  /* Adjust m_x_offset.
+     Center the primary caret to fit in max_width; all columns
+     will be adjusted accordingly.  */
+  int max_width = m_context->caret_max_width;
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, m_exploc.line,
+					       &line_width);
+  if (line && m_exploc.column <= line_width)
+    {
+      int right_margin = CARET_LINE_MARGIN;
+      int column = m_exploc.column;
+      right_margin = MIN (line_width - column, right_margin);
+      right_margin = max_width - right_margin;
+      if (line_width >= max_width && column > right_margin)
+	m_x_offset = column - right_margin;
+      gcc_assert (m_x_offset >= 0);
+    }
+
+  if (0)
+    show_ruler (context, line_width, m_x_offset);
+}
+
+/* Print text describing a line of source code.
+   This typically prints two lines:
+
+   (1) the source code itself, colorized at any ranges, and
+   (2) an annotation line containing any carets/underlines
+   describing the ranges.  */
+
+void
+layout::print_line (int row)
+{
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, row,
+					       &line_width);
+  if (!line)
+    return;
+
+  m_colorizer.set_normal_text ();
+
+  /* Step 1: print the source code line.  */
+
+  /* We will stop printing at any trailing whitespace.  Measure this
+     before applying m_x_offset, so that LINE_WIDTH still describes LINE.  */
+  line_width
+    = get_line_width_without_trailing_whitespace (line,
+						  line_width);
+  line += m_x_offset;
+  pp_space (m_pp);
+  int first_non_ws = INT_MAX;
+  int last_non_ws = 0;
+  int column;
+  for (column = 1 + m_x_offset; column <= line_width; column++)
+    {
+      bool in_range_p;
+      int range_idx;
+      bool draw_caret_p;
+      in_range_p = get_state_at_point (row, column,
+				       0, INT_MAX,
+				       &range_idx, &draw_caret_p);
+      if (in_range_p)
+	m_colorizer.set_range (range_idx);
+      else
+	m_colorizer.set_normal_text ();
+      char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
+      if (c != ' ')
+	{
+	  last_non_ws = column;
+	  if (first_non_ws == INT_MAX)
+	    first_non_ws = column;
+	}
+      pp_character (m_pp, c);
+      line++;
+    }
+  pp_newline (m_pp);
+
+  /* Step 2: print a line consisting of the caret/underlines for the
+     given source line.  */
+  int x_bound = get_x_bound_for_row (row, m_exploc.column,
+				     last_non_ws);
+
+  pp_space (m_pp);
+  for (int column = 1 + m_x_offset; column < x_bound; column++)
+    {
+      bool in_range_p;
+      int range_idx;
+      bool draw_caret_p;
+      in_range_p = get_state_at_point (row, column,
+				       first_non_ws, last_non_ws,
+				       &range_idx, &draw_caret_p);
+      if (in_range_p)
+	{
+	  /* Within a range.  Draw either the caret or an underline.  */
+	  m_colorizer.set_range (range_idx);
+	  if (draw_caret_p)
+	    /* Draw the caret.  */
+	    pp_character (m_pp, m_context->caret_chars[range_idx]);
+	  else
+	    pp_character (m_pp, '~');
+	}
+      else
+	{
+	  /* Not in a range.  */
+	  m_colorizer.set_normal_text ();
+	  pp_character (m_pp, ' ');
+	}
+    }
+  pp_newline (m_pp);
+}
+
+/* Return true if (ROW/COLUMN) is within a range of the layout.
+   If it returns true, OUT_RANGE_IDX and OUT_DRAW_CARET_P are
+   written to, with the range index, and whether we should draw
+   the caret at (ROW/COLUMN) (as opposed to an underline).  */
+
+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    int *out_range_idx,
+			    bool *out_draw_caret_p)
+{
+  /* Within a multiline range, don't display any underline or caret
+     in any leading or trailing whitespace on a line.  */
+  if (column < first_non_ws || column > last_non_ws)
+    return false;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (0)
+	fprintf (stderr,
+		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
+		 range->m_start.m_line,
+		 range->m_start.m_column,
+		 range->m_finish.m_line,
+		 range->m_finish.m_column,
+		 row,
+		 column,
+		 range->contains_point (row, column) ? "true" : "false");
+
+      if (range->contains_point (row, column))
+	{
+	  *out_range_idx = i;
+
+	  /* Are we at the range's caret?  Is it visible?  */
+	  *out_draw_caret_p = false;
+	  if (row == range->m_caret.m_line
+	      && column == range->m_caret.m_column)
+	    *out_draw_caret_p = range->m_show_caret_p;
+
+	  /* We are within a range.  */
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws)
+{
+  int result = caret_column + 1;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (row >= range->m_start.m_line)
+	{
+	  if (range->m_finish.m_line == row)
+	    {
+	      /* On the final line within a range; ensure that
+		 we render up to the end of the range.  */
+	      if (result <= range->m_finish.m_column)
+		result = range->m_finish.m_column + 1;
+	    }
+	  else if (row < range->m_finish.m_line)
+	    {
+	      /* Within a multiline range; ensure that we render up to the
+		 last non-whitespace column.  */
+	      if (result <= last_non_ws)
+		result = last_non_ws + 1;
+	    }
+	}
+    }
+
+  return result;
+}
+
+} /* End of anonymous namespace.  */
+
+/* For debugging layout issues in diagnostic_show_locus and friends,
+   render a ruler giving column numbers (after the 1-column indent).  */
+
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset)
+{
+  /* Hundreds.  */
+  if (max_width > 99)
+    {
+      pp_space (context->printer);
+      for (int column = 1 + x_offset; column < max_width; column++)
+	if (0 == column % 10)
+	  pp_character (context->printer, '0' + (column / 100) % 10);
+	else
+	  pp_space (context->printer);
+      pp_newline (context->printer);
+    }
+
+  /* Tens.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    if (0 == column % 10)
+      pp_character (context->printer, '0' + (column / 10) % 10);
+    else
+      pp_space (context->printer);
+  pp_newline (context->printer);
+
+  /* Units.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    pp_character (context->printer, '0' + (column % 10));
+  pp_newline (context->printer);
+}
+
+/* Print the physical source code corresponding to the location of
+   this diagnostic, with additional annotations.
+   If CONTEXT has set frontend_calls_diagnostic_print_caret_line_p,
+   the code is printed using diagnostic_print_caret_line; otherwise
+   it is printed using diagnostic_print_ranges.  */
+
 void
 diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
@@ -75,16 +710,25 @@ diagnostic_show_locus (diagnostic_context * context,
     return;
 
   context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by diagnostic_print_caret_line.  */
 
-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+  if (context->frontend_calls_diagnostic_print_caret_line_p)
+    {
+      /* The GCC 5 routine.  */
+      expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
+      expanded_location s1 = { };
+      /* Zero-initialized. This is checked later by
+	 diagnostic_print_caret_line.  */
+
+      if (diagnostic_num_locations (diagnostic) >= 2)
+	s1 = diagnostic->message.m_richloc->get_range (1)->m_start;
 
-  diagnostic_print_caret_line (context, s0, s1,
-			       context->caret_chars[0],
-			       context->caret_chars[1]);
+      diagnostic_print_caret_line (context, s0, s1,
+				   context->caret_chars[0],
+				   context->caret_chars[1]);
+    }
+  else
+    /* The GCC 6 routine.  */
+    diagnostic_print_ranges (context, diagnostic);
 }
 
 /* Print (part) of the source line given by xloc1 with caret1 pointing
@@ -164,3 +808,33 @@ diagnostic_print_caret_line (diagnostic_context * context,
   pp_set_prefix (context->printer, saved_prefix);
   pp_needs_newline (context->printer) = true;
 }
+
+/* Print all source lines covered by the locations and any ranges
+   within DIAGNOSTIC, displaying one or more carets and zero or more
+   underlines as appropriate.  */
+
+static void
+diagnostic_print_ranges (diagnostic_context * context,
+			 const diagnostic_info *diagnostic)
+{
+  pp_newline (context->printer);
+
+  const char *saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer, NULL);
+
+  {
+    layout layout (context, diagnostic);
+    int last_line = layout.get_last_line ();
+    for (int row = layout.get_first_line ();
+	 row <= last_line;
+	 row++)
+      layout.print_line (row);
+
+    /* The closing scope here leads to the dtor for layout and thus
+       colorizer being called here, which affects the precise
+       place where colorization is turned off in the unittest
+       for colorized output.  */
+  }
+
+  pp_set_prefix (context->printer, saved_prefix);
+}
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 831859a..5fe6627 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -144,7 +144,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
     context->classify_diagnostic[i] = DK_UNSPECIFIED;
   context->show_caret = false;
   diagnostic_set_caret_max_width (context, pp_line_cutoff (context->printer));
-  for (i = 0; i < MAX_LOCATIONS_PER_MESSAGE; i++)
+  for (i = 0; i < rich_location::MAX_RANGES; i++)
     context->caret_chars[i] = '^';
   context->show_option_requested = false;
   context->abort_on_error = false;
@@ -234,16 +234,15 @@ diagnostic_finish (diagnostic_context *context)
    translated.  */
 void
 diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
-				va_list *args, location_t location,
+				va_list *args, rich_location *richloc,
 				diagnostic_t kind)
 {
+  gcc_assert (richloc);
   diagnostic->message.err_no = errno;
   diagnostic->message.args_ptr = args;
   diagnostic->message.format_spec = msg;
-  diagnostic->message.set_location (0, location);
-  for (int i = 1; i < MAX_LOCATIONS_PER_MESSAGE; i++)
-    diagnostic->message.set_location (i, UNKNOWN_LOCATION);
-  diagnostic->override_column = 0;
+  diagnostic->message.m_richloc = richloc;
+  diagnostic->richloc = richloc;
   diagnostic->kind = kind;
   diagnostic->option_index = 0;
 }
@@ -252,10 +251,27 @@ diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
    translated.  */
 void
 diagnostic_set_info (diagnostic_info *diagnostic, const char *gmsgid,
-		     va_list *args, location_t location,
+		     va_list *args, rich_location *richloc,
 		     diagnostic_t kind)
 {
-  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, location, kind);
+  gcc_assert (richloc);
+  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, richloc, kind);
+}
+
+static const char *const diagnostic_kind_color[] = {
+#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
+#include "diagnostic.def"
+#undef DEFINE_DIAGNOSTIC_KIND
+  NULL
+};
+
+/* Get a color name for diagnostics of type KIND.
+   Result could be NULL.  */
+
+const char *
+diagnostic_get_color_for_kind (diagnostic_t kind)
+{
+  return diagnostic_kind_color[kind];
 }
 
 /* Return a malloc'd string describing a location.  The caller is
@@ -270,12 +286,6 @@ diagnostic_build_prefix (diagnostic_context *context,
 #undef DEFINE_DIAGNOSTIC_KIND
     "must-not-happen"
   };
-  static const char *const diagnostic_kind_color[] = {
-#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
-#include "diagnostic.def"
-#undef DEFINE_DIAGNOSTIC_KIND
-    NULL
-  };
   gcc_assert (diagnostic->kind < DK_LAST_DIAGNOSTIC_KIND);
 
   const char *text = _(diagnostic_kind_text[diagnostic->kind]);
@@ -771,10 +781,14 @@ diagnostic_report_diagnostic (diagnostic_context *context,
 
       if (option_text)
 	{
+	  const char *cs
+	    = colorize_start (pp_show_color (context->printer),
+			      diagnostic_kind_color[diagnostic->kind]);
+	  const char *ce = colorize_stop (pp_show_color (context->printer));
 	  diagnostic->message.format_spec
 	    = ACONCAT ((diagnostic->message.format_spec,
 			" ", 
-			"[", option_text, "]",
+			"[", cs, option_text, ce, "]",
 			NULL));
 	  free (option_text);
 	}
@@ -854,9 +868,40 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  if (context->inhibit_notes_p)
+    {
+      va_end (ap);
+      return;
+    }
+  saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer,
+                 diagnostic_build_prefix (context, &diagnostic));
+  pp_newline (context->printer);
+  pp_format (context->printer, &diagnostic.message);
+  pp_output_formatted_text (context->printer);
+  pp_destroy_prefix (context->printer);
+  pp_set_prefix (context->printer, saved_prefix);
+  diagnostic_show_locus (context, &diagnostic);
+  va_end (ap);
+}
+
+/* Same as diagnostic_append_note, but at RICHLOC.  */
+
+void
+diagnostic_append_note_at_rich_loc (diagnostic_context *context,
+				    rich_location *richloc,
+				    const char * gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  const char *saved_prefix;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   if (context->inhibit_notes_p)
     {
       va_end (ap);
@@ -881,16 +926,17 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   permissive_error_kind (global_dc));
       diagnostic.option_index = permissive_error_option (global_dc);
     }
   else {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, kind);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, kind);
       if (kind == DK_WARNING || kind == DK_PEDWARN)
 	diagnostic.option_index = opt;
   }
@@ -907,9 +953,23 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as "inform", but at RICHLOC.  */
+void
+inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -922,11 +982,12 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_NOTE);
+                                  &ap, &richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -940,9 +1001,10 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
   diagnostic.option_index = opt;
 
   ret = report_diagnostic (&diagnostic);
@@ -960,9 +1022,27 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as warning_at, but using RICHLOC.  */
+
+bool
+warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -980,11 +1060,13 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_WARNING);
+                                  &ap, &richloc,
+                                  DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1010,9 +1092,10 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,  DK_PEDWARN);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1032,9 +1115,28 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
+                       permissive_error_kind (global_dc));
+  diagnostic.option_index = permissive_error_option (global_dc);
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "permerror", but at RICHLOC.  */
+
+bool
+permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc,
                        permissive_error_kind (global_dc));
   diagnostic.option_index = permissive_error_option (global_dc);
   ret = report_diagnostic (&diagnostic);
@@ -1049,9 +1151,10 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1064,11 +1167,12 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_ERROR);
+                                  &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1079,9 +1183,25 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as above, but use RICH_LOC.  */
+
+void
+error_at_rich_loc (rich_location *rich_loc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, rich_loc,
+		       DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1094,9 +1214,10 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_SORRY);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1117,9 +1238,10 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1135,9 +1257,10 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1152,9 +1275,10 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE_NOBT);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1218,3 +1342,11 @@ real_abort (void)
 {
   abort ();
 }
+
+void
+source_range::debug (const char *msg) const
+{
+  rich_location richloc (m_start);
+  richloc.add_range (m_start, m_finish);
+  inform_at_rich_loc (&richloc, "%s", msg);
+}
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 7fcb6a8..66a867c 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
    list in diagnostic.def.  */
 struct diagnostic_info
 {
-  /* Text to be formatted. It also contains the location(s) for this
-     diagnostic.  */
+  /* Text to be formatted.  */
   text_info message;
-  unsigned int override_column;
+
+  /* The location at which the diagnostic is to be reported.  */
+  rich_location *richloc;
+
   /* Auxiliary data for client.  */
   void *x_data;
   /* The kind of diagnostic it is about.  */
@@ -102,8 +104,8 @@ struct diagnostic_context
   /* Maximum width of the source line printed.  */
   int caret_max_width;
 
-  /* Characters used for caret diagnostics.  */
-  char caret_chars[MAX_LOCATIONS_PER_MESSAGE];
+  /* Characters used for caret diagnostics.  */
+  char caret_chars[rich_location::MAX_RANGES];
 
   /* True if we should print the command line option which controls
      each diagnostic, if known.  */
@@ -181,6 +183,11 @@ struct diagnostic_context
   int lock;
 
   bool inhibit_notes_p;
+
+  /* Does the frontend make calls to diagnostic_print_caret_line?
+     If so, we fall back to the old implementation of
+     diagnostic_show_locus.  */
+  bool frontend_calls_diagnostic_print_caret_line_p;
 };
 
 static inline void
@@ -252,10 +259,6 @@ extern diagnostic_context *global_dc;
 
 #define report_diagnostic(D) diagnostic_report_diagnostic (global_dc, D)
 
-/* Override the column number to be used for reporting a
-   diagnostic.  */
-#define diagnostic_override_column(DI, COL) (DI)->override_column = (COL)
-
 /* Override the option index to be used for reporting a
    diagnostic.  */
 #define diagnostic_override_option_index(DI, OPTIDX) \
@@ -279,13 +282,17 @@ extern bool diagnostic_report_diagnostic (diagnostic_context *,
 					  diagnostic_info *);
 #ifdef ATTRIBUTE_GCC_DIAG
 extern void diagnostic_set_info (diagnostic_info *, const char *, va_list *,
-				 location_t, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
+				 rich_location *, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_set_info_translated (diagnostic_info *, const char *,
-					    va_list *, location_t,
+					    va_list *, rich_location *,
 					    diagnostic_t)
      ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_append_note (diagnostic_context *, location_t,
                                     const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern void diagnostic_append_note_at_rich_loc (diagnostic_context *,
+						rich_location *,
+						const char *, ...)
+  ATTRIBUTE_GCC_DIAG(3,4);
 #endif
 extern char *diagnostic_build_prefix (diagnostic_context *, const diagnostic_info *);
 void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
@@ -306,6 +313,14 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
   return diagnostic->message.get_location (which);
 }
 
+/* Return the number of locations to be printed in DIAGNOSTIC.  */
+
+static inline unsigned int
+diagnostic_num_locations (const diagnostic_info * diagnostic)
+{
+  return diagnostic->message.m_richloc->get_num_locations ();
+}
+
 /* Expand the location of this diagnostic. Use this function for
    consistency.  Parameter WHICH specifies which location. By default,
    expand the first one.  */
@@ -313,12 +328,7 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
 static inline expanded_location
 diagnostic_expand_location (const diagnostic_info * diagnostic, int which = 0)
 {
-  expanded_location s
-    = expand_location_to_spelling_point (diagnostic_location (diagnostic,
-							      which));
-  if (which == 0 && diagnostic->override_column)
-    s.column = diagnostic->override_column;
-  return s;
+  return diagnostic->richloc->get_range (which)->m_caret;
 }
 
 /* This is somehow the right-side margin of a caret line, that is, we
@@ -344,6 +354,10 @@ diagnostic_print_caret_line (diagnostic_context * context,
 			     expanded_location xloc2,
 			     char caret1, char caret2);
 
+
+extern const char *
+diagnostic_get_color_for_kind (diagnostic_t kind);
+
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
 
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index daffc20..92dc584 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -149,9 +149,9 @@ static void cb_include (cpp_reader *, source_location, const unsigned char *,
 static void cb_ident (cpp_reader *, source_location, const cpp_string *);
 static void cb_used_define (cpp_reader *, source_location, cpp_hashnode *);
 static void cb_used_undef (cpp_reader *, source_location, cpp_hashnode *);
-static bool cb_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+static bool cb_cpp_error (cpp_reader *, int, int, rich_location *,
 			  const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 void pp_dir_change (cpp_reader *, const char *);
 
 static int dump_macro (cpp_reader *, cpp_hashnode *, void *);
@@ -1026,13 +1026,12 @@ cb_used_define (cpp_reader *pfile, source_location line ATTRIBUTE_UNUSED,
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION, with column number possibly overridden by COLUMN_OVERRIDE
-   if not zero; MSG is the translated message and AP the arguments.
+   RICHLOC; MSG is the translated message and AP the arguments.
    Returns true if a diagnostic was emitted, false otherwise.  */
 
 static bool
 cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	      location_t location, unsigned int column_override,
+	      rich_location *richloc,
 	      const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -1067,9 +1066,7 @@ cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   if (reason == CPP_W_WARNING_DIRECTIVE)
     diagnostic_override_option_index (&diagnostic, OPT_Wcpp);
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 3825751..3d9deb0 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,6 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -787,7 +788,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
       --werrorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   bool ret = report_diagnostic (&diagnostic);
@@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
 	/* If location[0] != UNKNOWN_LOCATION means that we already
 	   processed one of %C/%L.  */
 	int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
-	text->set_location (loc_num,
-			    linemap_position_for_loc_and_offset (line_table,
-								 loc->lb->location,
-								 offset));
+	source_range range
+	  = source_range::from_location (
+	      linemap_position_for_loc_and_offset (line_table,
+						   loc->lb->location,
+						   offset));
+	text->set_range (loc_num, range, true);
 	pp_string (pp, result[loc_num]);
 	return true;
       }
@@ -1075,7 +1078,7 @@ gfc_diagnostic_starter (diagnostic_context *context,
 
   expanded_location s1 = diagnostic_expand_location (diagnostic);
   expanded_location s2;
-  bool one_locus = diagnostic_location (diagnostic, 1) == UNKNOWN_LOCATION;
+  bool one_locus = diagnostic->richloc->get_num_locations () < 2;
   bool same_locus = false;
 
   if (!one_locus) 
@@ -1173,10 +1176,11 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (loc);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, loc, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (argp);
@@ -1190,10 +1194,11 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
@@ -1209,11 +1214,12 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (argp);
 }
@@ -1226,9 +1232,10 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1291,6 +1298,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
+  rich_location richloc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1306,7 +1314,7 @@ gfc_error (const char *gmsgid, va_list ap)
       --errorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
 
   if (buffered_p)
@@ -1336,9 +1344,10 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1472,6 +1481,7 @@ gfc_diagnostics_init (void)
   diagnostic_format_decoder (global_dc) = gfc_format_decoder;
   global_dc->caret_chars[0] = '1';
   global_dc->caret_chars[1] = '2';
+  global_dc->frontend_calls_diagnostic_print_caret_line_p = true;
   pp_warning_buffer = new (XNEW (output_buffer)) output_buffer ();
   pp_warning_buffer->flush_p = false;
   /* pp_error_buffer is statically allocated.  This simplifies memory
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 102a635..6bfde06 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -53,14 +53,23 @@ unsigned verbose;
 
 static struct line_maps *line_table;
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  const struct line_map_ordinary *map;
+  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
+  return linemap_expand_location (line_table, map, loc);
+}
+
 static bool
 #if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
+__attribute__((format (printf, 5, 0)))
 #endif
-error_cb (cpp_reader *, int errtype, int, source_location location,
-	  unsigned int, const char *msg, va_list *ap)
+error_cb (cpp_reader *, int errtype, int, rich_location *richloc,
+	  const char *msg, va_list *ap)
 {
   const line_map_ordinary *map;
+  source_location location = richloc->get_loc ();
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
   fprintf (stderr, "%s:%d:%d %s: ", loc.file, loc.line, loc.column,
@@ -102,9 +111,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -114,9 +124,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -126,9 +137,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -138,9 +150,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
diff --git a/gcc/input.c b/gcc/input.c
index e7302a4..bdba20f 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
   return expand_location_1 (loc, /*expansion_point_p=*/false);
 }
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_to_spelling_point (loc);
+}
+
+
 /* If LOCATION is in a system header and if it is a virtual location for
    a token coming from the expansion of a macro, unwind it to the
    location of the expansion point of the macro.  Otherwise, just return
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index fdc7b4d..fe50df8 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -31,6 +31,27 @@ along with GCC; see the file COPYING3.  If not see
 #include <iconv.h>
 #endif
 
+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */
+
+void
+text_info::set_range (unsigned int idx, source_range range, bool caret_p)
+{
+  gcc_checking_assert (m_richloc);
+  m_richloc->set_range (idx, range, caret_p, true);
+}
+
+location_t
+text_info::get_location (unsigned int index_of_location) const
+{
+  gcc_checking_assert (m_richloc);
+
+  if (index_of_location == 0)
+    return m_richloc->get_loc ();
+  else
+    return UNKNOWN_LOCATION;
+}
+
 // Default construct an output buffer.
 
 output_buffer::output_buffer ()
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 36d4e37..d10272c 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -27,11 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Maximum number of format string arguments.  */
 #define PP_NL_ARGMAX   30
 
-/* Maximum number of locations associated to each message.  If
-   location 'i' is UNKNOWN_LOCATION, then location 'i+1' is not
-   valid.  */
-#define MAX_LOCATIONS_PER_MESSAGE 2
-
 /* The type of a text to be formatted according a format specification
    along with a list of things.  */
 struct text_info
@@ -40,21 +35,17 @@ struct text_info
   va_list *args_ptr;
   int err_no;  /* for %m */
   void **x_data;
+  rich_location *m_richloc;
 
-  inline void set_location (unsigned int index_of_location, location_t loc)
+  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
   {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    this->locations[index_of_location] = loc;
+    source_range src_range;
+    src_range.m_start = loc;
+    src_range.m_finish = loc;
+    set_range (idx, src_range, caret_p);
   }
-
-  inline location_t get_location (unsigned int index_of_location) const
-  {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    return this->locations[index_of_location];
-  }
-
-private:
-  location_t locations[MAX_LOCATIONS_PER_MESSAGE];
+  void set_range (unsigned int idx, source_range range, bool caret_p);
+  location_t get_location (unsigned int index_of_location) const;
 };
 
 /* How often diagnostics are prefixed by their locations:
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 8b9b391..d28be1d 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,9 +69,10 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
+  rich_location richloc (location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
-		       location_for_asm (insn), kind);
+		       &richloc, kind);
   report_diagnostic (&diagnostic);
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
new file mode 100644
index 0000000..b3fe9d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
@@ -0,0 +1,124 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = myvar.x;
+           ~~~~~^~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = first_function () + second_function ();
+       ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function ()
+        ~~~~~~~~~~~~~~~~~
+        + second_function ());
+        ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             consectetur, adipiscing, elit,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             sed, eiusmod, tempor,
+                                             ~~~~~~~~~~~~~~~~~~~~~
+                                             incididunt, ut, labore, et,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             dolore, magna, aliqua)
+                                             ~~~~~~~~~~~~~~~~~~~~~~
+        + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 amet, consectetur,
+                                                 ~~~~~~~~~~~~~~~~~~
+                                                 adipiscing, elit, sed,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~
+                                                 eiusmod, tempor, incididunt,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 ut, labore, et, dolore,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~
+                                                 magna, aliqua));
+                                                 ~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = 98.6f;
+             ^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = foo * bar;
+             ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = foo * bar;
+                                                        ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
new file mode 100644
index 0000000..68fc1b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
@@ -0,0 +1,131 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret -fplugin-arg-diagnostic_plugin_test_show_locus-color" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = ^[[32m^[[Kmyvar^[[m^[[K^[[01;35m^[[K.^[[m^[[K^[[34m^[[Kx^[[m^[[K;
+           ^[[32m^[[K~~~~~^[[m^[[K^[[01;35m^[[K^^[[m^[[K^[[34m^[[K~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = ^[[32m^[[Kfirst_function ()^[[m^[[K ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K;
+       ^[[32m^[[K~~~~~~~~~~~~~~~~~^[[m^[[K ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function ()
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K);
+        ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            consectetur, adipiscing, elit,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            sed, eiusmod, tempor,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            incididunt, ut, labore, et,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            dolore, magna, aliqua)
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+ ^[[m^[[K       ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                amet, consectetur,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                adipiscing, elit, sed,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                eiusmod, tempor, incididunt,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                ut, labore, et, dolore,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                magna, aliqua)^[[m^[[K);
+                                                 ^[[34m^[[K~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[K98.6f^[[m^[[K;
+             ^[[01;35m^[[K^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+             ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+                                                        ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
new file mode 100644
index 0000000..1542da6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -0,0 +1,285 @@
+/* { dg-options "-O" } */
+
+/* This plugin exercises the diagnostics-printing code.
+
+   The goal is to unit-test the range-printing code without needing any
+   correct range data within the compiler's IR.  We can't use any real
+   diagnostics for this, so we have to fake it, hence this plugin.
+
+   There are two test files used with this code:
+
+     diagnostic-test-show-locus-ascii-bw.c
+     ..........................-ascii-color.c
+
+   to exercise uncolored vs colored output by supplying plugin arguments
+   to hack in the desired behavior:
+
+     -fplugin-arg-diagnostic_plugin_test_show_locus-color
+
+   The test files contain functions, but the body of each
+   function is disabled using the preprocessor.  The plugin detects
+   the functions by name, and injects diagnostics within them, using
+   hard-coded locations relative to the top of each function.
+
+   The plugin uses a function "get_loc" below to map from line/column
+   numbers to source_location, and this relies on input_location being in
+   the same ordinary line_map as the locations in question.  The plugin
+   runs after parsing, so input_location will be at the end of the file.
+
+   This need for all of the test code to be in a single ordinary line map
+   means that each test file needs to have a very long line near the top
+   (potentially to cover the extra byte-count of colorized data),
+   to ensure that further very long lines don't start a new linemap.
+   This also means that we can't use macros in the test files.  */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+const pass_data pass_data_test_show_locus =
+{
+  GIMPLE_PASS, /* type */
+  "test_show_locus", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_test_show_locus : public gimple_opt_pass
+{
+public:
+  pass_test_show_locus(gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_test_show_locus, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_test_show_locus
+
+/* Given LINE_NUM and COL_NUM, generate a source_location in the
+   current file, relative to input_location.  This relies on the
+   location being expressible in the same ordinary line_map as
+   input_location (which is typically at the end of the source file
+   when this is called).  Hence the test files we compile with this
+   plugin must have an initial very long line (to avoid long lines
+   starting a new line map), and must not use macros.
+
+   COL_NUM uses the Emacs convention of 0-based column numbers.  */
+
+static source_location
+get_loc (unsigned int line_num, unsigned int col_num)
+{
+  /* Use input_location to get the relevant line_map */
+  const struct line_map_ordinary *line_map
+    = (const line_map_ordinary *)(linemap_lookup (line_table,
+						  input_location));
+
+  /* Convert from 0-based column numbers to 1-based column numbers.  */
+  source_location loc
+    = linemap_position_for_line_and_column (line_map,
+					    line_num, col_num + 1);
+
+  return loc;
+}
+
+/* Was "color" passed in as a plugin argument?  */
+static bool force_show_locus_color = false;
+
+/* We want to verify the colorized output of diagnostic_show_locus,
+   but turning on colorization for everything confuses "dg-warning" etc.
+   Hence we special-case it within this plugin by using this modified
+   version of default_diagnostic_finalizer, which, if "color" is
+   passed in as a plugin argument turns on colorization, but just
+   for diagnostic_show_locus.  */
+
+static void
+custom_diagnostic_finalizer (diagnostic_context *context,
+			     diagnostic_info *diagnostic)
+{
+  bool old_show_color = pp_show_color (context->printer);
+  if (force_show_locus_color)
+    pp_show_color (context->printer) = true;
+  diagnostic_show_locus (context, diagnostic);
+  pp_show_color (context->printer) = old_show_color;
+
+  pp_destroy_prefix (context->printer);
+  pp_newline_and_flush (context->printer);
+}
+
+/* Exercise the diagnostic machinery to emit various warnings,
+   for use by diagnostic-test-show-locus-*.c.
+
+   We inject each warning relative to the start of a function,
+   which avoids lots of hardcoded absolute locations.  */
+
+static void
+test_show_locus (function *fun)
+{
+  tree fndecl = fun->decl;
+  tree identifier = DECL_NAME (fndecl);
+  const char *fnname = IDENTIFIER_POINTER (identifier);
+  location_t fnstart = fun->function_start_locus;
+  int fnstart_line = LOCATION_LINE (fnstart);
+
+  diagnostic_finalizer (global_dc) = custom_diagnostic_finalizer;
+
+  /* Hardcode the "terminal width", to verify the behavior of
+     very wide lines.  */
+  global_dc->caret_max_width = 70;
+
+  if (0 == strcmp (fnname, "test_simple"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 15));
+      richloc.add_range (get_loc (line, 10), get_loc (line, 14));
+      richloc.add_range (get_loc (line, 16), get_loc (line, 16));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_simple_2"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 24));
+      richloc.add_range (get_loc (line, 6),
+			 get_loc (line, 22));
+      richloc.add_range (get_loc (line, 26),
+			 get_loc (line, 43));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_multiline"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 1, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line, 23));
+      richloc.add_range (get_loc (line + 1, 9),
+			 get_loc (line + 1, 26));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_many_lines"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 5, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line + 4, 65));
+      richloc.add_range (get_loc (line + 5, 9),
+			 get_loc (line + 10, 61));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a rich_location constructed directly from a
+     source_range where the range is larger than one character.  */
+  if (0 == strcmp (fnname, "test_richloc_from_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 16);
+      rich_location richloc (src_range);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a single-range location where the range starts
+     before the caret.  */
+  if (0 == strcmp (fnname, "test_caret_within_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 16);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 20);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a very wide line, where the information of interest
+     is beyond the width of the terminal (hardcoded above).  */
+  if (0 == strcmp (fnname, "test_very_wide_line"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 94);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 90);
+      src_range.m_finish = get_loc (line, 98);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+}
+
+unsigned int
+pass_test_show_locus::execute (function *fun)
+{
+  test_show_locus (fun);
+  return 0;
+}
+
+static gimple_opt_pass *
+make_pass_test_show_locus (gcc::context *ctxt)
+{
+  return new pass_test_show_locus (ctxt);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  for (int i = 0; i < argc; i++)
+    {
+      if (0 == strcmp (argv[i].key, "color"))
+	force_show_locus_color = true;
+    }
+
+  pass_info.pass = make_pass_test_show_locus (g);
+  pass_info.reference_pass_name = "ssa";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+		     &pass_info);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 39fab6e..941bccc 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -63,6 +63,9 @@ set plugin_test_list [list \
     { start_unit_plugin.c start_unit-test-1.c } \
     { finish_unit_plugin.c finish_unit-test-1.c } \
     { wide-int_plugin.c wide-int-test-1.c } \
+    { diagnostic_plugin_test_show_locus.c \
+	  diagnostic-test-show-locus-bw.c \
+	  diagnostic-test-show-locus-color.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7c1ab85..8cc1d87 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -29,6 +29,7 @@ load_lib libgloss.exp
 load_lib target-libpath.exp
 load_lib torture-options.exp
 load_lib fortran-modules.exp
+load_lib multiline.exp
 
 # We set LC_ALL and LANG to C so that we get the same error messages as expected.
 setenv LC_ALL C
diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 135f142..02009d8 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -289,7 +289,7 @@ default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     }
 
   if (set_locus)
-    text->set_location (0, DECL_SOURCE_LOCATION (t));
+    text->set_location (0, DECL_SOURCE_LOCATION (t), true);
 
   if (DECL_P (t))
     {
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..3c34d51 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3602,7 +3602,7 @@ void
 percent_K_format (text_info *text)
 {
   tree t = va_arg (*text->args_ptr, tree), block;
-  text->set_location (0, EXPR_LOCATION (t));
+  text->set_location (0, EXPR_LOCATION (t), true);
   gcc_assert (pp_ti_abstract_origin (text) != NULL);
   block = TREE_BLOCK (t);
   *pp_ti_abstract_origin (text) = NULL;
diff --git a/libcpp/errors.c b/libcpp/errors.c
index a33196e..c351c11 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,8 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, 0, _(msgid), ap);
+  rich_location richloc (src_loc);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
@@ -139,7 +140,9 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, column, _(msgid), ap);
+  rich_location richloc (src_loc);
+  richloc.override_column (column);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5eaea6b..a2bdfa0 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -573,9 +573,9 @@ struct cpp_callbacks
 
   /* Called to emit a diagnostic.  This callback receives the
      translated message.  */
-  bool (*error) (cpp_reader *, int, int, source_location, unsigned int,
+  bool (*error) (cpp_reader *, int, int, rich_location *,
 		 const char *, va_list *)
-       ATTRIBUTE_FPTR_PRINTF(6,0);
+       ATTRIBUTE_FPTR_PRINTF(5,0);
 
   /* Callbacks for when a macro is expanded, or tested (whether
      defined or not at the time) in #ifdef, #ifndef or "defined".  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index bc747c1..bd73780 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -118,6 +118,35 @@ typedef unsigned int linenum_type;
   libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+
+  /* Make a source_range from a source_location.  */
+  static source_range from_location (source_location loc)
+  {
+    source_range result;
+    result.m_start = loc;
+    result.m_finish = loc;
+    return result;
+  }
+};
+
 /* Memory allocation function typedef.  Works like xrealloc.  */
 typedef void *(*line_map_realloc) (void *, size_t);
 
@@ -1015,6 +1044,175 @@ typedef struct
   bool sysp;
 } expanded_location;
 
+/* Both gcc and emacs number source *lines* starting at 1, but
+   they have differing conventions for *columns*.
+
+   GCC uses a 1-based convention for source columns,
+   whereas Emacs's M-x column-number-mode uses a 0-based convention.
+
+   For example, an error in the initial, left-hand
+   column of source line 3 is reported by GCC as:
+
+      some-file.c:3:1: error: ...etc...
+
+   On navigating to the location of that error in Emacs
+   (e.g. via "next-error"),
+   the locus is reported in the Mode Line
+   (assuming M-x column-number-mode) as:
+
+     some-file.c   10%   (3, 0)
+
+   i.e. "3:1:" in GCC corresponds to "(3, 0)" in Emacs.  */
+
+/* Ranges are closed:
+   m_start is the first location within the range, and
+   m_finish is the last location within the range.  */
+struct location_range
+{
+  expanded_location m_start;
+  expanded_location m_finish;
+
+  /* Should a caret be drawn for this range?  Typically this is
+     true for the 0th range, and false for subsequent ranges,
+     but the Fortran frontend overrides this for rendering things like:
+
+       x = x + y
+           1   2
+       Error: Shapes for operands at (1) and (2) are not conformable
+
+     where "1" and "2" are notionally carets.  */
+  bool m_show_caret_p;
+  expanded_location m_caret;
+};
+
+/* A "rich" source code location, for use when printing diagnostics.
+   A rich_location has one or more ranges, each optionally with
+   a caret.   Typically the zeroth range has a caret; other ranges
+   sometimes have carets.
+
+   The "primary" location of a rich_location is the caret of range 0,
+   used for determining the line/column when printing diagnostic
+   text, such as:
+
+      some-file.c:3:1: error: ...etc...
+
+   Additional ranges may be added to help the user identify other
+   pertinent clauses in a diagnostic.
+
+   rich_location instances are intended to be allocated on the stack
+   when generating diagnostics, and to be short-lived.
+
+   Examples of rich locations
+   --------------------------
+
+   Example A
+   *********
+      int i = "foo";
+              ^
+   This "rich" location is simply a single range (range 0), with
+   caret = start = finish at the given point.
+
+   Example B
+   *********
+      a = (foo && bar)
+          ~~~~~^~~~~~~
+   This rich location has a single range (range 0), with the caret
+   at the first "&", and the start/finish at the parentheses.
+   Compare with example C below.
+
+   Example C
+   *********
+      a = (foo && bar)
+           ~~~ ^~ ~~~
+   This rich location has three ranges:
+   - Range 0 has its caret and start location at the first "&" and
+     end at the second "&".
+   - Range 1 has its start and finish at the "f" and "o" of "foo";
+     the caret is not flagged for display, but is perhaps at the "f"
+     of "foo".
+   - Similarly, range 2 has its start and finish at the "b" and "r" of
+     "bar"; the caret is not flagged for display, but is perhaps at the
+     "b" of "bar".
+   Compare with example B above.
+
+   Example D (Fortran frontend)
+   ****************************
+       x = x + y
+           1   2
+   This rich location has range 0 at "1", and range 1 at "2".
+   Both are flagged for caret display.  Both ranges have start/finish
+   equal to their caret point.  The frontend overrides the diagnostic
+   context's default caret character for these ranges.
+
+   Example E
+   *********
+      printf ("arg0: %i  arg1: %s arg2: %i",
+                               ^~
+              100, 101, 102);
+                   ~~~
+   This rich location has two ranges:
+   - range 0 is at the "%s" with start = caret = "%" and finish at
+     the "s".
+   - range 1 has start/finish covering the "101" and is not flagged for
+     caret printing; it is perhaps at the start of "101".  */
+
+class rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  rich_location (source_location loc);
+
+  /* Constructing from a source_range.  */
+  rich_location (source_range src_range);
+
+  /* Accessors.  */
+  source_location get_loc () const { return m_loc; }
+
+  source_location *get_loc_addr () { return &m_loc; }
+
+  void
+  add_range (source_location start, source_location finish,
+	     bool show_caret_p = false);
+
+  void
+  add_range (source_range src_range,
+	     bool show_caret_p = false);
+
+  void
+  add_range (location_range *src_range);
+
+  void
+  set_range (unsigned int idx, source_range src_range,
+	     bool show_caret_p, bool overwrite_loc_p);
+
+  unsigned int get_num_locations () const { return m_num_ranges; }
+
+  location_range *get_range (unsigned int idx)
+  {
+    linemap_assert (idx < m_num_ranges);
+    return &m_ranges[idx];
+  }
+
+  expanded_location lazily_expand_location ();
+
+  void
+  override_column (int column);
+
+public:
+  static const int MAX_RANGES = 3;
+
+protected:
+  source_location m_loc;
+
+  unsigned int m_num_ranges;
+  location_range m_ranges[MAX_RANGES];
+
+  bool m_have_expanded_location;
+  expanded_location m_expanded_location;
+};
+
 /* This is enum is used by the function linemap_resolve_location
    below.  The meaning of the values is explained in the comment of
    that function.  */
@@ -1158,4 +1356,13 @@ void linemap_dump (FILE *, struct line_maps *, unsigned, bool);
    specifies how many macro maps to dump.  */
 void line_table_dump (FILE *, struct line_maps *, unsigned int, unsigned int);
 
+/* The rich_location class requires a way to expand source_location instances.
+   We would directly use expand_location_to_spelling_point, which is
+   implemented in gcc/input.c, but we also need to use it for rich_location
+   within genmatch.c.
+   Hence we require client code of libcpp to implement the following
+   symbol.  */
+extern expanded_location
+linemap_client_expand_location_to_spelling_point (source_location);
+
 #endif /* !LIBCPP_LINE_MAP_H  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3d82e9b..a6fa782 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1752,3 +1752,133 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
       fprintf (stream, "\n");
     }
 }
+
+/* class rich_location.  */
+
+/* Construct a rich_location with location LOC as its initial range.  */
+
+rich_location::rich_location (source_location loc) :
+  m_loc (loc),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (loc, loc, true);
+  m_ranges[0].m_caret = lazily_expand_location ();
+}
+
+/* Construct a rich_location with source_range SRC_RANGE as its
+   initial range.  */
+
+rich_location::rich_location (source_range src_range)
+: m_loc (src_range.m_start),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (src_range, true);
+}
+
+/* Get an expanded_location for this rich_location's primary
+   location.  */
+
+expanded_location
+rich_location::lazily_expand_location ()
+{
+  if (!m_have_expanded_location)
+    {
+      m_expanded_location
+	= linemap_client_expand_location_to_spelling_point (m_loc);
+      m_have_expanded_location = true;
+    }
+
+  return m_expanded_location;
+}
+
+/* Set the column of the primary location.  */
+
+void
+rich_location::override_column (int column)
+{
+  lazily_expand_location ();
+  m_expanded_location.column = column;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_location start, source_location finish,
+			  bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  location_range *range = &m_ranges[m_num_ranges++];
+  range->m_start = linemap_client_expand_location_to_spelling_point (start);
+  range->m_finish = linemap_client_expand_location_to_spelling_point (finish);
+  range->m_caret = range->m_start;
+  range->m_show_caret_p = show_caret_p;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_range src_range, bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  add_range (src_range.m_start, src_range.m_finish, show_caret_p);
+}
+
+void
+rich_location::add_range (location_range *src_range)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  m_ranges[m_num_ranges++] = *src_range;
+}
+
+/* Add or overwrite the range given by IDX.  It must either
+   overwrite an existing range, or add one *exactly* on the end of
+   the array.
+
+   This is primarily for use by gcc when implementing diagnostic
+   format decoders e.g. the "+" in the C/C++ frontends, for handling
+   format codes like "%q+D" (which writes the source location of a
+   tree back into range 0 of the rich_location).
+
+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+			  bool show_caret_p, bool overwrite_loc_p)
+{
+  linemap_assert (idx < MAX_RANGES);
+
+  /* We can either overwrite an existing range, or add one exactly
+     on the end of the array.  */
+  linemap_assert (idx <= m_num_ranges);
+
+  location_range *locrange = &m_ranges[idx];
+  locrange->m_start
+    = linemap_client_expand_location_to_spelling_point (src_range.m_start);
+  locrange->m_finish
+    = linemap_client_expand_location_to_spelling_point (src_range.m_finish);
+
+  locrange->m_show_caret_p = show_caret_p;
+  if (overwrite_loc_p)
+    locrange->m_caret = locrange->m_start;
+
+  /* Are we adding a range onto the end?  */
+  if (idx == m_num_ranges)
+    m_num_ranges = idx + 1;
+
+  if (idx == 0 && overwrite_loc_p)
+    {
+      m_loc = src_range.m_start;
+      /* Mark any cached value here as dirty.  */
+      m_have_expanded_location = false;
+    }
+}
-- 
1.8.5.3
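
As an illustration of the closed-range convention used by the test
plugin above (this is not part of the patch): the sketch below builds
the annotation line for a single range with a caret, using 0-based
columns as in the plugin's get_loc calls.  The helper name
render_underline is hypothetical, and the real diagnostic_show_locus
output additionally has a leading space and optional color codes.

```cpp
#include <string>

// Hypothetical helper, not part of the patch: build the annotation line
// for one closed range [START, FINISH] with a caret at CARET.
// All columns are 0-based, matching the plugin's get_loc arguments.
static std::string
render_underline (unsigned start, unsigned finish, unsigned caret)
{
  // Pad with spaces up to the first column of the range.
  std::string line (start, ' ');

  // The range is closed, so FINISH itself is underlined too.
  for (unsigned col = start; col <= finish; col++)
    line += (col == caret) ? '^' : '~';

  return line;
}
```

With start 12, finish 20 and caret 16 (the values used for
test_caret_within_proper_range) this gives 12 spaces followed by
"~~~~^~~~~"; with start 12, finish 16 and caret 12 (as in
test_richloc_from_proper_range) it gives 12 spaces followed by "^~~~~".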

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 4/5] Implement tree expression tracking in C FE (v2)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
                   ` (3 preceding siblings ...)
  2015-09-22 21:33 ` [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2) David Malcolm
@ 2015-09-22 22:23 ` David Malcolm
  2015-09-25 14:22   ` Dodji Seketeli
  2015-09-23 13:36 ` [PATCH 0/5] RFC: Overhaul of diagnostics (v2) Michael Matz
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
  6 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-22 22:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is a combination of various patches from v1 of the kit,
including:
  12/22: Add source-ranges for trees
  13/22: gcc-rich-location.[ch]: add methods for working with tree ranges
  14/22: C: capture tree ranges for various expressions

The implementation of how ranges are stored has completely changed
since v1 of the kit.  Rather than introducing a SOURCE_RANGE tree node
and adding fields to decl and expr, the patch now captures ranges
for all C expressions during parsing within a new field of c_expr,
and for all tree nodes with a location_t, it stores them in
ad-hoc locations for later use.
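
To make the ad-hoc storage idea concrete, here is a much-simplified
sketch; it is not the libcpp implementation.  It assumes that an ad-hoc
location is a table index with the top bit set, and that a location
without a stored range is treated as a single-point range.  The class
adhoc_table and its methods are hypothetical names for illustration.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

typedef std::uint32_t source_location;

struct source_range
{
  source_location m_start;
  source_location m_finish;
};

// Assumption for this sketch: the top bit marks an ad-hoc location.
static const source_location ADHOC_BIT = 0x80000000u;

// Hypothetical miniature of an ad-hoc location table.
class adhoc_table
{
  typedef std::tuple<source_location, source_location,
		     source_location> key_t;

public:
  // Return an ad-hoc location for (LOCUS, RANGE), reusing the existing
  // entry when the same combination has been interned before.
  source_location intern (source_location locus, source_range range)
  {
    key_t key = std::make_tuple (locus, range.m_start, range.m_finish);
    std::map<key_t, source_location>::const_iterator it
      = m_index.find (key);
    if (it != m_index.end ())
      return it->second;

    source_location result
      = ADHOC_BIT | static_cast<source_location> (m_ranges.size ());
    m_loci.push_back (locus);
    m_ranges.push_back (range);
    m_index[key] = result;
    return result;
  }

  static bool is_adhoc (source_location loc)
  {
    return (loc & ADHOC_BIT) != 0;
  }

  // Recover the plain locus that was interned.
  source_location get_locus (source_location loc) const
  {
    return is_adhoc (loc) ? m_loci[loc & ~ADHOC_BIT] : loc;
  }

  // Recover the range; a plain location yields a single-point range.
  source_range get_range (source_location loc) const
  {
    if (is_adhoc (loc))
      return m_ranges[loc & ~ADHOC_BIT];
    source_range result = { loc, loc };
    return result;
  }

private:
  std::vector<source_location> m_loci;
  std::vector<source_range> m_ranges;
  std::map<key_t, source_location> m_index;
};
```

The point of the sketch is that the range travels with the location_t
itself, so nothing needs to be added to the tree node that holds it.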

Hence compound expressions get ranges; see:
  https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html

and for this example:

  int test (int foo)
  {
    return foo * 100;
           ^^^   ^^^
  }

we have access to the ranges of "foo" and "100" during C parsing via
the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
INTEGER_CST (the former's location is at the top of the
function, and the latter has no location).

This restriction means that I had to remove various expressions from
diagnostic-test-expressions-1.c; specifically:
  test_global
  test_param
  test_local
  test_integer_constants
  test_character_constants
  test_floating_constants
  test_enumeration_constant
  test_string_literal
  test_unary_plus
  test_sizeof
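
For reference, the way a compound expression gets its range from its
operands (start of the LHS through to the end of the RHS, as in the
ChangeLog entry for parser_build_binary_op below) can be sketched as
follows.  This is a simplified model: mini_expr stands in for c_expr,
combine_binary is a hypothetical helper, and locations are plain
integers rather than real source_location values.

```cpp
#include <cstdint>

// Simplified stand-in; in GCC a source_location is an opaque index
// into the line maps, not a plain column number.
typedef std::uint32_t source_location;

struct source_range
{
  source_location m_start;   // first location within the range
  source_location m_finish;  // last location within the range (closed)
};

// Hypothetical miniature of c_expr: only the range, no tree value.
struct mini_expr
{
  source_range src_range;
};

// Hypothetical helper: the range of a binary expression runs from the
// start of its left operand to the finish of its right operand.
static mini_expr
combine_binary (const mini_expr &lhs, const mini_expr &rhs)
{
  mini_expr ret;
  ret.src_range.m_start = lhs.src_range.m_start;
  ret.src_range.m_finish = rhs.src_range.m_finish;
  return ret;
}
```

With illustrative positions of 11..13 for "foo" and 17..19 for "100",
the combined range for "foo * 100" is 11..19, even though neither
operand has a usable range of its own once it becomes a VAR_DECL or an
INTEGER_CST.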

There are still some FIXMEs in here that probably need addressing.

gcc/ChangeLog:
	* Makefile.in (OBJS): Add gcc-rich-location.o.
	* gcc-rich-location.c: New file.
	* gcc-rich-location.h: New file.
	* gimple.h (gimple_set_block): Use "set_block".
	* print-tree.c (print_node): Print any source range information.
	* tree-cfg.c (move_block_to_fn): Use "set_block".
	(move_block_to_fn): Likewise.
	* tree-inline.c (copy_phis_for_bb): Likewise.
	* tree.c (tree_set_block): Use "set_block".
	(set_source_range): New functions.
	(set_block): New function.
	* tree.h (CAN_HAVE_RANGE_P): New.
	(EXPR_LOCATION_RANGE): New.
	(EXPR_HAS_RANGE): New.
	(get_expr_source_range): New inline function.
	(DECL_LOCATION_RANGE): New.
	(set_source_range): New decls.
	(set_block): New decl.
	(get_decl_source_range): New inline function.

gcc/c-family/ChangeLog:
	* c-common.c (c_fully_fold_internal): Capture existing source_range,
	and store it on the result.

gcc/c/ChangeLog:
	* c-parser.c (set_c_expr_source_range): New functions.
	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
	based on the range from the start of the LHS to the end of the
	RHS.
	(c_parser_conditional_expression): Likewise, based on the range
	from the start of the cond.value to the end of exp2.value.
	(c_parser_binary_expression): Call set_c_expr_source_range on
	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
	(c_parser_cast_expression): Call set_c_expr_source_range on ret
	based on the cast_loc through to the end of the expr.
	(c_parser_unary_expression): Likewise, based on the
	op_loc through to the end of op.
	(c_parser_sizeof_expression): Likewise, based on the start of the
	sizeof token through to either the closing paren or the end of
	expr.
	(c_parser_postfix_expression): Likewise, using the token range,
	or from the open paren through to the close paren for
	parenthesized expressions.
	(c_parser_postfix_expression_after_primary): Likewise, for
	various kinds of expression.
	* c-tree.h (struct c_expr): Add field "src_range".
	(set_c_expr_source_range): New decls.
	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
	on ret for prefix unary ops.
	(parser_build_binary_op): Likewise, running from the start of
	arg1.value through to the end of arg2.value.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
	New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic_plugin_test_tree_expression_range.c and
	diagnostic-test-expressions-1.c.

libcpp/ChangeLog:
	* include/line-map.h (location_adhoc_data): Add field "src_range".
	(get_combined_adhoc_loc): Add source_range param.
	(get_range_from_adhoc_loc): New decl.
	(COMBINE_LOCATION_DATA): Add source_range param.
	* line-map.c (location_adhoc_data_hash): Contribute the src_range
	start and finish to the hash value.
	(location_adhoc_data_eq): Require that the src_range values be
	equal.
	(get_combined_adhoc_loc): Add source_range param and store it.
	Remove the requirement that "data" be non-NULL.
	(get_range_from_adhoc_loc): New function.
	(linemap_expand_location): Move the update of "loc" until after
	extracting "data".
---
 gcc/Makefile.in                                    |   1 +
 gcc/c-family/c-common.c                            |  10 +-
 gcc/c/c-parser.c                                   |  89 ++++-
 gcc/c/c-tree.h                                     |  11 +
 gcc/c/c-typeck.c                                   |  10 +
 gcc/gcc-rich-location.c                            |  86 +++++
 gcc/gcc-rich-location.h                            |  47 +++
 gcc/gimple.h                                       |   6 +-
 gcc/print-tree.c                                   |  21 +
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 422 +++++++++++++++++++++
 .../diagnostic_plugin_test_tree_expression_range.c | 159 ++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   2 +
 gcc/tree-cfg.c                                     |   9 +-
 gcc/tree-inline.c                                  |   5 +-
 gcc/tree.c                                         |  40 +-
 gcc/tree.h                                         |  40 ++
 libcpp/include/line-map.h                          |  13 +-
 libcpp/line-map.c                                  |  26 +-
 18 files changed, 964 insertions(+), 33 deletions(-)
 create mode 100644 gcc/gcc-rich-location.c
 create mode 100644 gcc/gcc-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 009c745..8cd446d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1255,6 +1255,7 @@ OBJS = \
 	fold-const.o \
 	function.o \
 	fwprop.o \
+	gcc-rich-location.o \
 	gcse.o \
 	gcse-common.o \
 	ggc-common.o \
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index ded23d3..4505db7 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -1178,6 +1178,7 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
   bool nowarning = TREE_NO_WARNING (expr);
   bool unused_p;
+  source_range old_range;
 
   /* This function is not relevant to C++ because C++ folds while
      parsing, and may need changes to be correct for C++ when C++
@@ -1193,6 +1194,9 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       || code == SAVE_EXPR)
     return expr;
 
+  if (IS_EXPR_CODE_CLASS (kind))
+    old_range = EXPR_LOCATION_RANGE (expr);
+
   /* Operands of variable-length expressions (function calls) have
      already been folded, as have __builtin_* function calls, and such
      expressions cannot occur in constant expressions.  */
@@ -1617,7 +1621,11 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       TREE_NO_WARNING (ret) = 1;
     }
   if (ret != expr)
-    protected_set_expr_location (ret, loc);
+    {
+      protected_set_expr_location (ret, loc);
+      if (IS_EXPR_CODE_CLASS (kind))
+	set_source_range (&ret, old_range.m_start, old_range.m_finish);
+    }
   return ret;
 }
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 5edf563..f0f39d4 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -67,6 +67,24 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "c-family/c-indentation.h"
 
+void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish)
+{
+  expr->src_range.m_start = start;
+  expr->src_range.m_finish = finish;
+  set_source_range (&expr->value,
+		    start, finish);
+}
+
+void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range)
+{
+  expr->src_range = src_range;
+  set_source_range (&expr->value, src_range);
+}
+
 \f
 /* Initialization routine for this file.  */
 
@@ -6053,6 +6071,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
   ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
 				 code, exp_location, rhs.value,
 				 rhs.original_type);
+  set_c_expr_source_range (&ret,
+			   lhs.src_range.m_start,
+			   rhs.src_range.m_finish);
   if (code == NOP_EXPR)
     ret.original_code = MODIFY_EXPR;
   else
@@ -6083,7 +6104,7 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 				 tree omp_atomic_lhs)
 {
   struct c_expr cond, exp1, exp2, ret;
-  location_t cond_loc, colon_loc, middle_loc;
+  location_t start, cond_loc, colon_loc, middle_loc;
 
   gcc_assert (!after || c_dialect_objc ());
 
@@ -6091,6 +6112,10 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 
   if (c_parser_next_token_is_not (parser, CPP_QUERY))
     return cond;
+  if (cond.value != error_mark_node)
+    start = cond.src_range.m_start;
+  else
+    start = UNKNOWN_LOCATION;
   cond_loc = c_parser_peek_token (parser)->location;
   cond = convert_lvalue_to_rvalue (cond_loc, cond, true, true);
   c_parser_consume_token (parser);
@@ -6166,6 +6191,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 			   ? t1
 			   : NULL);
     }
+  set_c_expr_source_range (&ret,
+			   start,
+			   exp2.src_range.m_finish);
   return ret;
 }
 
@@ -6318,6 +6346,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
     {
       enum c_parser_prec oprec;
       enum tree_code ocode;
+      source_range src_range;
       if (parser->error)
 	goto out;
       switch (c_parser_peek_token (parser)->type)
@@ -6406,6 +6435,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
       switch (ocode)
 	{
 	case TRUTH_ANDIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6413,8 +6443,10 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_false_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	case TRUTH_ORIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6422,6 +6454,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_true_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	default:
 	  break;
@@ -6490,6 +6523,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
 	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
       }
       ret.value = c_cast_expr (cast_loc, type_name, expr.value);
+      if (ret.value && expr.value)
+	set_c_expr_source_range (&ret,
+				 cast_loc,
+				 expr.src_range.m_finish);
       ret.original_code = ERROR_MARK;
       ret.original_type = NULL;
       return ret;
@@ -6539,6 +6576,7 @@ c_parser_unary_expression (c_parser *parser)
   struct c_expr ret, op;
   location_t op_loc = c_parser_peek_token (parser)->location;
   location_t exp_loc;
+  location_t finish;
   ret.original_code = ERROR_MARK;
   ret.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
@@ -6578,8 +6616,10 @@ c_parser_unary_expression (c_parser *parser)
       c_parser_consume_token (parser);
       exp_loc = c_parser_peek_token (parser)->location;
       op = c_parser_cast_expression (parser, NULL);
+      finish = op.src_range.m_finish;
       op = convert_lvalue_to_rvalue (exp_loc, op, true, true);
       ret.value = build_indirect_ref (op_loc, op.value, RO_UNARY_STAR);
+      set_c_expr_source_range (&ret, op_loc, finish);
       return ret;
     case CPP_PLUS:
       if (!c_dialect_objc () && !in_system_header_at (input_location))
@@ -6667,8 +6707,15 @@ static struct c_expr
 c_parser_sizeof_expression (c_parser *parser)
 {
   struct c_expr expr;
+  struct c_expr result;
   location_t expr_loc;
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_SIZEOF));
+
+  location_t start;
+  location_t finish = UNKNOWN_LOCATION;
+
+  start = c_parser_peek_token (parser)->location;
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   in_sizeof++;
@@ -6682,6 +6729,7 @@ c_parser_sizeof_expression (c_parser *parser)
       expr_loc = c_parser_peek_token (parser)->location;
       type_name = c_parser_type_name (parser);
       c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+      finish = parser->tokens_buf[0].range.m_finish; // FIXME: better access API to last token
       if (type_name == NULL)
 	{
 	  struct c_expr ret;
@@ -6697,17 +6745,19 @@ c_parser_sizeof_expression (c_parser *parser)
 	  expr = c_parser_postfix_expression_after_paren_type (parser,
 							       type_name,
 							       expr_loc);
+	  finish = expr.src_range.m_finish;
 	  goto sizeof_expr;
 	}
       /* sizeof ( type-name ).  */
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
-      return c_expr_sizeof_type (expr_loc, type_name);
+      result = c_expr_sizeof_type (expr_loc, type_name);
     }
   else
     {
       expr_loc = c_parser_peek_token (parser)->location;
       expr = c_parser_unary_expression (parser);
+      finish = expr.src_range.m_finish;
     sizeof_expr:
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
@@ -6715,8 +6765,11 @@ c_parser_sizeof_expression (c_parser *parser)
       if (TREE_CODE (expr.value) == COMPONENT_REF
 	  && DECL_C_BIT_FIELD (TREE_OPERAND (expr.value, 1)))
 	error_at (expr_loc, "%<sizeof%> applied to a bit-field");
-      return c_expr_sizeof_expr (expr_loc, expr);
+      result = c_expr_sizeof_expr (expr_loc, expr);
     }
+  if (finish != UNKNOWN_LOCATION)
+    set_c_expr_source_range (&result, start, finish);
+  return result;
 }
 
 /* Parse an alignof expression.  */
@@ -7136,12 +7189,14 @@ c_parser_postfix_expression (c_parser *parser)
   struct c_expr expr, e1;
   struct c_type_name *t1, *t2;
   location_t loc = c_parser_peek_token (parser)->location;;
+  source_range tok_range = c_parser_peek_token (parser)->range;
   expr.original_code = ERROR_MARK;
   expr.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
     {
     case CPP_NUMBER:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       loc = c_parser_peek_token (parser)->location;
       c_parser_consume_token (parser);
       if (TREE_CODE (expr.value) == FIXED_CST
@@ -7156,6 +7211,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_CHAR32:
     case CPP_WCHAR:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_STRING:
@@ -7164,6 +7220,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_WSTRING:
     case CPP_UTF8STRING:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       expr.original_code = STRING_CST;
       c_parser_consume_token (parser);
       break;
@@ -7171,6 +7228,7 @@ c_parser_postfix_expression (c_parser *parser)
       gcc_assert (c_dialect_objc ());
       expr.value
 	= objc_build_string_object (c_parser_peek_token (parser)->value);
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_NAME:
@@ -7184,6 +7242,7 @@ c_parser_postfix_expression (c_parser *parser)
 					     (c_parser_peek_token (parser)->type
 					      == CPP_OPEN_PAREN),
 					     &expr.original_type);
+	    set_c_expr_source_range (&expr, tok_range);
 	    break;
 	  }
 	case C_ID_CLASSNAME:
@@ -7272,6 +7331,7 @@ c_parser_postfix_expression (c_parser *parser)
       else
 	{
 	  /* A parenthesized expression.  */
+	  location_t loc_open_paren = c_parser_peek_token (parser)->location;
 	  c_parser_consume_token (parser);
 	  expr = c_parser_expression (parser);
 	  if (TREE_CODE (expr.value) == MODIFY_EXPR)
@@ -7279,6 +7339,8 @@ c_parser_postfix_expression (c_parser *parser)
 	  if (expr.original_code != C_MAYBE_CONST_EXPR)
 	    expr.original_code = ERROR_MARK;
 	  /* Don't change EXPR.ORIGINAL_TYPE.  */
+	  location_t loc_close_paren = c_parser_peek_token (parser)->location;
+	  set_c_expr_source_range (&expr, loc_open_paren, loc_close_paren);
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
 				     "expected %<)%>");
 	}
@@ -7869,6 +7931,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
   vec<tree, va_gc> *exprlist;
   vec<tree, va_gc> *origtypes = NULL;
   vec<location_t> arg_loc = vNULL;
+  location_t start;
+  location_t finish;
 
   while (true)
     {
@@ -7905,7 +7969,10 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 		{
 		  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
 					     "expected %<]%>");
+		  start = expr.src_range.m_start;
+		  finish = parser->tokens_buf[0].range.m_finish; // FIXME: better access API to last token
 		  expr.value = build_array_ref (op_loc, expr.value, idx);
+		  set_c_expr_source_range (&expr, start, finish);
 		}
 	    }
 	  expr.original_code = ERROR_MARK;
@@ -7948,9 +8015,13 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 			"%<memset%> used with constant zero length parameter; "
 			"this could be due to transposed parameters");
 
+	  start = expr.src_range.m_start;
+	  finish = parser->tokens_buf[0].range.m_finish; // FIXME: better access API to last token
 	  expr.value
 	    = c_build_function_call_vec (expr_loc, arg_loc, expr.value,
 					 exprlist, origtypes);
+	  set_c_expr_source_range (&expr, start, finish);
+
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) == INTEGER_CST
 	      && TREE_CODE (orig_expr.value) == FUNCTION_DECL
@@ -7979,8 +8050,11 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
               expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.src_range.m_start;
+	  finish = c_parser_peek_token (parser)->range.m_finish;
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc, expr.value, ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8008,12 +8082,15 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.src_range.m_start;
+	  finish = c_parser_peek_token (parser)->range.m_finish;
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc,
 					    build_indirect_ref (op_loc,
 								expr.value,
 								RO_ARROW),
 					    ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8029,6 +8106,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	  break;
 	case CPP_PLUS_PLUS:
 	  /* Postincrement.  */
+	  start = expr.src_range.m_start;
+	  finish = c_parser_peek_token (parser)->range.m_finish;
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8040,11 +8119,14 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTINCREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
 	case CPP_MINUS_MINUS:
 	  /* Postdecrement.  */
+	  start = expr.src_range.m_start;
+	  finish = c_parser_peek_token (parser)->range.m_finish;
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8056,6 +8138,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTDECREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 667529a..9453caf 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -132,6 +132,9 @@ struct c_expr
      The type of an enum constant is a plain integer type, but this
      field will be the enum type.  */
   tree original_type;
+
+  /* FIXME.  */
+  source_range src_range;
 };
 
 /* Type alias for struct c_expr. This allows to use the structure
@@ -709,4 +712,12 @@ extern void pedwarn_c90 (location_t, int opt, const char *, ...)
 extern bool pedwarn_c99 (location_t, int opt, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
 
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish);
+
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range);
+
 #endif /* ! GCC_C_TREE_H */
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3b26231..8f3e0a8 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -3395,6 +3395,12 @@ parser_build_unary_op (location_t loc, enum tree_code code, struct c_expr arg)
     overflow_warning (loc, result.value);
     }
 
+  /* We are typically called when parsing a prefix token at LOC acting on
+     ARG.  Reflect this by updating the source range of the result to
+     start at LOC and end at the end of ARG.  */
+  set_c_expr_source_range (&result,
+			   loc, arg.src_range.m_finish);
+
   return result;
 }
 
@@ -3432,6 +3438,10 @@ parser_build_binary_op (location_t location, enum tree_code code,
   if (location != UNKNOWN_LOCATION)
     protected_set_expr_location (result.value, location);
 
+  set_c_expr_source_range (&result,
+			   arg1.src_range.m_start,
+			   arg2.src_range.m_finish);
+
   /* Check for cases such as x+y<<z which users are likely
      to misinterpret.  */
   if (warn_parentheses)
diff --git a/gcc/gcc-rich-location.c b/gcc/gcc-rich-location.c
new file mode 100644
index 0000000..b0ec47b
--- /dev/null
+++ b/gcc/gcc-rich-location.c
@@ -0,0 +1,86 @@
+/* Implementation of gcc_rich_location class
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree-core.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+#include "pretty-print.h"
+#include "intl.h"
+#include "cpplib.h"
+#include "diagnostic.h"
+
+/* Extract any source range information from EXPR and write it
+   to *R.  */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR.  */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* If T is an expression, add a range for it to the rich_location.  */
+
+void
+gcc_rich_location::maybe_add_expr (tree t)
+{
+  if (EXPR_P (t))
+    add_expr (t);
+}
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
new file mode 100644
index 0000000..c82cbf1
--- /dev/null
+++ b/gcc/gcc-rich-location.h
@@ -0,0 +1,47 @@
+/* Declarations relating to class gcc_rich_location
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_RICH_LOCATION_H
+#define GCC_RICH_LOCATION_H
+
+/* A gcc_rich_location is libcpp's rich_location with additional
+   helper methods for working with gcc's types.  */
+class gcc_rich_location : public rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  gcc_rich_location (source_location loc) :
+    rich_location (loc) {}
+
+  /* Constructing from a source_range.  */
+  gcc_rich_location (source_range src_range) :
+    rich_location (src_range) {}
+
+
+  /* Methods for adding ranges via gcc entities.  */
+  void
+  add_expr (tree expr);
+
+  void
+  maybe_add_expr (tree t);
+};
+
+#endif /* GCC_RICH_LOCATION_H */
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 91c26b6..ba8f410 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1709,11 +1709,7 @@ gimple_block (const gimple *g)
 static inline void
 gimple_set_block (gimple *g, tree block)
 {
-  if (block)
-    g->location =
-	COMBINE_LOCATION_DATA (line_table, g->location, block);
-  else
-    g->location = LOCATION_LOCUS (g->location);
+  g->location = set_block (g->location, block);
 }
 
 
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index ea50056..8b3794a 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -936,6 +936,27 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
       expanded_location xloc = expand_location (EXPR_LOCATION (node));
       indent_to (file, indent+4);
       fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column);
+
+      /* Print the range, if any.  */
+      source_range r = EXPR_LOCATION_RANGE (node);
+      if (r.m_start)
+	{
+	  xloc = expand_location (r.m_start);
+	  fprintf (file, " start: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " start: unknown");
+	}
+      if (r.m_finish)
+	{
+	  xloc = expand_location (r.m_finish);
+	  fprintf (file, " finish: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " finish: unknown");
+	}
     }
 
   fprintf (file, ">");
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
new file mode 100644
index 0000000..5485aaf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -0,0 +1,422 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unit tests to verify that we're correctly
+   capturing the source code ranges of various kinds of expression.
+
+   It uses the various "diagnostic_test_*_expression_range_plugin"
+   plugins, which handle "__emit_expression_range" by generating a warning
+   at the given source range of the input argument.  Each of the
+   different plugins does this at a different phase of the internal
+   representation (tree, gimple, etc.), so we can verify that the
+   source code range information is valid at each phase.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __emit_expression_range (int dummy, ...);
+
+int global;
+
+void test_parentheses (int a, int b)
+{
+  __emit_expression_range (0, (a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) );
+                               ~~~^~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, (a + b) * (a - b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) * (a - b) );
+                               ~~~~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, !(a && b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !(a && b) );
+                               ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Postfix expressions.  ************************************************/
+
+void test_array_reference (int *arr)
+{
+  __emit_expression_range (0, arr[100] ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, arr[100] );
+                               ~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_function_call (int p, int q, int r)
+{
+  __emit_expression_range (0, test_function_call (p, q, r) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, test_function_call (p, q, r) );
+                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+  return 0;
+}
+
+struct test_struct
+{
+  int field;
+};
+
+int test_structure_references (struct test_struct *ptr)
+{
+  struct test_struct local;
+  local.field = 42;
+
+  __emit_expression_range (0, local.field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, local.field );
+                               ~~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, ptr->field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ptr->field );
+                               ~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_postfix_incdec (int i)
+{
+  __emit_expression_range (0, i++ ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i++ );
+                               ~^~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, i-- ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i-- );
+                               ~^~
+   { dg-end-multiline-output "" } */
+}
+
+/* Unary operators.  ****************************************************/
+
+int test_prefix_incdec (int i)
+{
+  __emit_expression_range (0, ++i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ++i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, --i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, --i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_address_operator (void)
+{
+  __emit_expression_range (0, &global ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, &global );
+                               ^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_indirection (int *ptr)
+{
+  __emit_expression_range (0, *ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, *ptr );
+                               ^~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_unary_minus (int i)
+{
+  __emit_expression_range (0, -i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, -i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_ones_complement (int i)
+{
+  __emit_expression_range (0, ~i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ~i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_negation (int flag)
+{
+  __emit_expression_range (0, !flag ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !flag );
+                               ^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Casts.  ****************************************************/
+
+void test_cast (void *ptr)
+{
+  __emit_expression_range (0, (int *)ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (int *)ptr );
+                               ^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
+
+/* Binary operators.  *******************************************/
+
+void test_multiplicative_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs * rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs * rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs / rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs / rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs % rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs % rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_additive_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs + rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs + rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs - rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs - rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_shift_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs << rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs << rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >> rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >> rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_relational_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs < rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs < rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs > rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs > rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs <= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs <= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_equality_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs == rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs == rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs != rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs != rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_bitwise_binary_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs & rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs & rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs ^ rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs ^ rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs | rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs | rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs && rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs && rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs || rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs || rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Conditional operator.  *******************************************/
+
+void test_conditional_operators (int flag, int on_true, int on_false)
+{
+  __emit_expression_range (0, flag ? on_true : on_false ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, flag ? on_true : on_false );
+                               ~~~~~~~~~~~~~~~^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Assignment expressions.  *******************************************/
+
+void test_assignment_expressions (int dest, int other)
+{
+  __emit_expression_range (0, dest = other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest = other );
+                               ~~~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest *= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest *= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest /= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest /= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest %= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest %= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest += other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest += other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest -= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest -= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest <<= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest <<= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest >>= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest >>= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest &= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest &= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest ^= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest ^= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest |= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest |= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Comma operator.  *******************************************/
+
+void test_comma_operator (int a, int b)
+{
+  __emit_expression_range (0, (a++, a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a++, a + b) );
+                               ~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Examples of non-trivial expressions.  ****************************/
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __emit_expression_range (0, b * b - 4 * a * c ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, b * b - 4 * a * c );
+                               ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a)); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
new file mode 100644
index 0000000..ef7d13f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -0,0 +1,159 @@
+/* This plugin verifies the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+
+/*
+  Hack: fails with linker error:
+./diagnostic_plugin_test_tree_expression_range.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
+  since nothing in the tree is using gcc_rich_location::add_expr yet.
+
+  I've tried various workarounds (adding DEBUG_FUNCTION to the
+  method, taking its address), but can't seem to fix it that way.
+  So as a nasty workaround, the following material is copied&pasted
+  from gcc-rich-location.c: */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR. */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* FIXME: end of material taken from gcc-rich-location.c */
+
+
+int plugin_is_GPL_compatible;
+
+static void
+emit_warning (rich_location *richloc)
+{
+  if (richloc->get_num_locations () < 2)
+    {
+      error_at_rich_loc (richloc, "range not found");
+      return;
+    }
+
+  location_range *range = richloc->get_range (1);
+  warning_at_rich_loc (richloc, 0,
+		       "tree range %i:%i-%i:%i",
+		       range->m_start.line,
+		       range->m_start.column,
+		       range->m_finish.line,
+		       range->m_finish.column);
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__emit_expression_range"))
+    return NULL_TREE;
+
+  /* Get arg 1; print it! */
+  //debug_tree (call_expr);
+
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+  //debug_tree (arg);
+
+  gcc_rich_location richloc (EXPR_LOCATION (arg));
+  richloc.add_expr (arg);
+  emit_warning (&richloc);
+
+  return NULL_TREE; //  should we be setting *walk_subtrees?
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  //fprintf (stdout, "callback called!\n");
+  tree fndecl = (tree)gcc_data;
+
+  /* FIXME: is this actually going to be valid on all frontends
+     before genericize? */
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 941bccc..b7efcf5 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -66,6 +66,8 @@ set plugin_test_list [list \
     { diagnostic_plugin_test_show_locus.c \
 	  diagnostic-test-show-locus-bw.c \
 	  diagnostic-test-show-locus-color.c } \
+    { diagnostic_plugin_test_tree_expression_range.c \
+	  diagnostic-test-expressions-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 807d96f..e605b0b 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6738,10 +6738,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	    continue;
 	  if (d->orig_block == NULL_TREE || block == d->orig_block)
 	    {
-	      if (d->new_block == NULL_TREE)
-		locus = LOCATION_LOCUS (locus);
-	      else
-		locus = COMBINE_LOCATION_DATA (line_table, locus, d->new_block);
+	      locus = set_block (locus, d->new_block);
 	      gimple_phi_arg_set_location (phi, i, locus);
 	    }
 	}
@@ -6801,9 +6798,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	tree block = LOCATION_BLOCK (e->goto_locus);
 	if (d->orig_block == NULL_TREE
 	    || block == d->orig_block)
-	  e->goto_locus = d->new_block ?
-	      COMBINE_LOCATION_DATA (line_table, e->goto_locus, d->new_block) :
-	      LOCATION_LOCUS (e->goto_locus);
+	  e->goto_locus = set_block (e->goto_locus, d->new_block);
       }
 }
 
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index abaea3f..d98b7d2 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2352,10 +2352,7 @@ copy_phis_for_bb (basic_block bb, copy_body_data *id)
 		  tree *n;
 		  n = id->decl_map->get (LOCATION_BLOCK (locus));
 		  gcc_assert (n);
-		  if (*n)
-		    locus = COMBINE_LOCATION_DATA (line_table, locus, *n);
-		  else
-		    locus = LOCATION_LOCUS (locus);
+		  locus = set_block (locus, *n);
 		}
 	      else
 		locus = LOCATION_LOCUS (locus);
diff --git a/gcc/tree.c b/gcc/tree.c
index 84fd34d..9e91d2c 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -11659,10 +11659,7 @@ tree_set_block (tree t, tree b)
 
   if (IS_EXPR_CODE_CLASS (c))
     {
-      if (b)
-	t->exp.locus = COMBINE_LOCATION_DATA (line_table, t->exp.locus, b);
-      else
-	t->exp.locus = LOCATION_LOCUS (t->exp.locus);
+      t->exp.locus = set_block (t->exp.locus, b);
     }
   else
     gcc_unreachable ();
@@ -13646,5 +13643,40 @@ nonnull_arg_p (const_tree arg)
   return false;
 }
 
+void
+set_source_range (tree *expr, location_t start, location_t finish)
+{
+  source_range src_range;
+  src_range.m_start = start;
+  src_range.m_finish = finish;
+  set_source_range (expr, src_range);
+}
+
+void
+set_source_range (tree *expr, source_range src_range)
+{
+  if (!EXPR_P (*expr))
+    return;
+
+  location_t adhoc = COMBINE_LOCATION_DATA (line_table,
+					    EXPR_LOCATION (*expr),
+					    src_range,
+					    NULL /* FIXME */);
+  SET_EXPR_LOCATION (*expr, adhoc);
+}
+
+location_t
+set_block (location_t loc, tree block)
+{
+  source_range src_range;
+  if (IS_ADHOC_LOC (loc))
+    /* FIXME: can we update in-place?  */
+    src_range = get_range_from_adhoc_loc (line_table, loc);
+  else
+    src_range = source_range::from_location (loc);
+
+  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
+}
+
 
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index e500151..2cfbf30 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1066,10 +1066,28 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define EXPR_FILENAME(NODE) LOCATION_FILE (EXPR_CHECK ((NODE))->exp.locus)
 #define EXPR_LINENO(NODE) LOCATION_LINE (EXPR_CHECK (NODE)->exp.locus)
 
+#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))
+#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK ((NODE))))
+
+#define EXPR_HAS_RANGE(NODE) \
+    (CAN_HAVE_RANGE_P (NODE) \
+     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
+     : false)
+
 /* True if a tree is an expression or statement that can have a
    location.  */
 #define CAN_HAVE_LOCATION_P(NODE) ((NODE) && EXPR_P (NODE))
 
+static inline source_range
+get_expr_source_range (tree expr)
+{
+  location_t loc = EXPR_LOCATION (expr);
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (line_table, loc);
+  else
+    return source_range::from_location (loc);
+}
+
 extern void protected_set_expr_location (tree, location_t);
 
 /* In a TARGET_EXPR node.  */
@@ -2092,6 +2110,9 @@ extern machine_mode element_mode (const_tree t);
 #define DECL_IS_BUILTIN(DECL) \
   (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)
 
+#define DECL_LOCATION_RANGE(NODE) \
+  (get_decl_source_range (DECL_MINIMAL_CHECK (NODE)))
+
 /*  For FIELD_DECLs, this is the RECORD_TYPE, UNION_TYPE, or
     QUAL_UNION_TYPE node that the field is a member of.  For VAR_DECL,
     PARM_DECL, FUNCTION_DECL, LABEL_DECL, RESULT_DECL, and CONST_DECL
@@ -5133,10 +5154,29 @@ type_with_alias_set_p (const_tree t)
   return false;
 }
 
+extern void
+set_source_range (tree *expr, location_t start, location_t finish);
+
+extern void
+set_source_range (tree *expr, source_range src_range);
+
+extern location_t
+set_block (location_t loc, tree block);
+
 extern void gt_ggc_mx (tree &);
 extern void gt_pch_nx (tree &);
 extern void gt_pch_nx (tree &, gt_pointer_operator, void *);
 
 extern bool nonnull_arg_p (const_tree);
 
+static inline source_range
+get_decl_source_range (tree decl)
+{
+  location_t loc = DECL_SOURCE_LOCATION (decl);
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (line_table, loc);
+  else
+    return source_range::from_location (loc);
+}
+
 #endif  /* GCC_TREE_H  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index bd73780..8deb798 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -499,9 +499,11 @@ struct GTY(()) maps_info_macro {
   unsigned int cache;
 };
 
-/* Data structure to associate an arbitrary data to a source location.  */
+/* Data structure to associate a source_range together with an arbitrary
+   data pointer with a source location.  */
 struct GTY(()) location_adhoc_data {
   source_location locus;
+  source_range src_range;
   void * GTY((skip)) data;
 };
 
@@ -800,10 +802,14 @@ LINEMAPS_LAST_ALLOCATED_MACRO_MAP (const line_maps *set)
 
 extern void location_adhoc_data_fini (struct line_maps *);
 extern source_location get_combined_adhoc_loc (struct line_maps *,
-					       source_location, void *);
+					       source_location,
+					       source_range,
+					       void *);
 extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
 extern source_location get_location_from_adhoc_loc (struct line_maps *,
 						    source_location);
+extern source_range get_range_from_adhoc_loc (struct line_maps *,
+					      source_location);
 
 /* Get whether location LOC is an ad-hoc location.  */
 
@@ -818,9 +824,10 @@ IS_ADHOC_LOC (source_location loc)
 inline source_location
 COMBINE_LOCATION_DATA (struct line_maps *set,
 		       source_location loc,
+		       source_range src_range,
 		       void *block)
 {
-  return get_combined_adhoc_loc (set, loc, block);
+  return get_combined_adhoc_loc (set, loc, src_range, block);
 }
 
 extern void rebuild_location_adhoc_htab (struct line_maps *);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index a6fa782..439157e 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -69,7 +69,10 @@ location_adhoc_data_hash (const void *l)
 {
   const struct location_adhoc_data *lb =
       (const struct location_adhoc_data *) l;
-  return (hashval_t) lb->locus + (size_t) lb->data;
+  return ((hashval_t) lb->locus
+	  + (hashval_t) lb->src_range.m_start
+	  + (hashval_t) lb->src_range.m_finish
+	  + (size_t) lb->data);
 }
 
 /* Compare function for location_adhoc_data hashtable.  */
@@ -81,7 +84,10 @@ location_adhoc_data_eq (const void *l1, const void *l2)
       (const struct location_adhoc_data *) l1;
   const struct location_adhoc_data *lb2 =
       (const struct location_adhoc_data *) l2;
-  return lb1->locus == lb2->locus && lb1->data == lb2->data;
+  return (lb1->locus == lb2->locus
+	  && lb1->src_range.m_start == lb2->src_range.m_start
+	  && lb1->src_range.m_finish == lb2->src_range.m_finish
+	  && lb1->data == lb2->data);
 }
 
 /* Update the hashtable when location_adhoc_data is reallocated.  */
@@ -110,19 +116,20 @@ rebuild_location_adhoc_htab (struct line_maps *set)
 
 source_location
 get_combined_adhoc_loc (struct line_maps *set,
-			source_location locus, void *data)
+			source_location locus,
+			source_range src_range,
+			void *data)
 {
   struct location_adhoc_data lb;
   struct location_adhoc_data **slot;
 
-  linemap_assert (data);
-
   if (IS_ADHOC_LOC (locus))
     locus
       = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
     return 0;
   lb.locus = locus;
+  lb.src_range = src_range;
   lb.data = data;
   slot = (struct location_adhoc_data **)
       htab_find_slot (set->location_adhoc_data_map.htab, &lb, INSERT);
@@ -177,6 +184,13 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 }
 
+source_range
+get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
+{
+  linemap_assert (IS_ADHOC_LOC (loc));
+  return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
+}
+
 /* Finalize the location_adhoc_data structure.  */
 void
 location_adhoc_data_fini (struct line_maps *set)
@@ -1478,9 +1492,9 @@ linemap_expand_location (struct line_maps *set,
   memset (&xloc, 0, sizeof (xloc));
   if (IS_ADHOC_LOC (loc))
     {
-      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
       xloc.data
 	= set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].data;
+      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
     }
 
   if (loc < RESERVED_LOCATION_COUNT)
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
                   ` (4 preceding siblings ...)
  2015-09-22 22:23 ` [PATCH 4/5] Implement tree expression tracking in C FE (v2) David Malcolm
@ 2015-09-23 13:36 ` Michael Matz
  2015-09-23 13:43   ` Richard Biener
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
  6 siblings, 1 reply; 83+ messages in thread
From: Michael Matz @ 2015-09-23 13:36 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

Hi,

On Tue, 22 Sep 2015, David Malcolm wrote:

> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> table ever get smaller, or does it only ever get inserted into?

It only ever grows.

> An idea I had is that we could stash short ranges directly into the 32 
> bits of location_t, by offsetting the per-column-bits somewhat.

It's certainly worth an experiment: let's say you restrict yourself to 
tokens less than 8 characters, you need an additional 3 bits (using one 
value, e.g. zero, as the escape value).  That leaves 20 bits for the line 
numbers (for the normal 8 bit columns), which might be enough for most 
single-file compilations.  For LTO compilation this often won't be enough.
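
To make the bit budget above concrete, here is a rough sketch of such a packing.  It is purely illustrative: the field order, the helper names (pack_short_range, packed_line, and so on) and the assumption that the top bit stays reserved as the ad-hoc flag are all hypothetical, not what libcpp does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout for illustration only; this is not libcpp's encoding.
//   bit  31     : assumed reserved for the ad-hoc flag (always 0 here)
//   bits 30..11 : line number (20 bits)
//   bits 10..3  : column (8 bits)
//   bits 2..0   : token length in characters; 0 is the escape value
const unsigned RANGE_BITS = 3;
const unsigned COLUMN_BITS = 8;
const unsigned LINE_BITS = 32 - 1 - COLUMN_BITS - RANGE_BITS;

// Try to pack LINE, COLUMN and a token length LEN into 32 bits.
// Returns false when something does not fit, in which case the caller
// would have to fall back to some other representation.
static bool
pack_short_range (uint32_t line, uint32_t column, uint32_t len, uint32_t *out)
{
  if (line >= (1u << LINE_BITS) || column >= (1u << COLUMN_BITS))
    return false;
  if (len == 0 || len >= (1u << RANGE_BITS))
    return false;
  *out = (line << (COLUMN_BITS + RANGE_BITS)) | (column << RANGE_BITS) | len;
  return true;
}

// Accessors that undo the packing above.
static uint32_t
packed_line (uint32_t loc)
{
  return loc >> (COLUMN_BITS + RANGE_BITS);
}

static uint32_t
packed_column (uint32_t loc)
{
  return (loc >> RANGE_BITS) & ((1u << COLUMN_BITS) - 1);
}

static uint32_t
packed_length (uint32_t loc)
{
  return loc & ((1u << RANGE_BITS) - 1);
}
```

With these widths a token of up to 7 characters fits inline, and anything longer (or any line number at or beyond 2^20) has to take the fallback path.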

> My plan is to investigate the impact these patches have on the time and 
> memory consumption of the compiler,

When you do so, make sure you're also measuring an LTO compilation with 
debug info of something big (firefox).  I know that we already had issues 
with the size of the linemap data in the past for these cases (probably 
when we added columns).


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-23 13:36 ` [PATCH 0/5] RFC: Overhaul of diagnostics (v2) Michael Matz
@ 2015-09-23 13:43   ` Richard Biener
  2015-09-23 13:53     ` Michael Matz
  2015-09-24  2:39     ` David Malcolm
  0 siblings, 2 replies; 83+ messages in thread
From: Richard Biener @ 2015-09-23 13:43 UTC (permalink / raw)
  To: Michael Matz; +Cc: David Malcolm, GCC Patches

On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Tue, 22 Sep 2015, David Malcolm wrote:
>
>> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
>> table ever get smaller, or does it only ever get inserted into?
>
> It only ever grows.
>
>> An idea I had is that we could stash short ranges directly into the 32
>> bits of location_t, by offsetting the per-column-bits somewhat.
>
> It's certainly worth an experiment: let's say you restrict yourself to
> tokens less than 8 characters, you need an additional 3 bits (using one
> value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> numbers (for the normal 8 bit columns), which might be enough for most
> single-file compilations.  For LTO compilation this often won't be enough.
>
>> My plan is to investigate the impact these patches have on the time and
>> memory consumption of the compiler,
>
> When you do so, make sure you're also measuring an LTO compilation with
> debug info of something big (firefox).  I know that we already had issues
> with the size of the linemap data in the past for these cases (probably
> when we added columns).

The issue we have with LTO is that the linemap gets populated in quite
random order and thus we repeatedly switch files (we've mitigated this
somewhat for GCC 5).  We also considered dropping column info
(and would drop range info) as diagnostics are from optimizers only
with LTO and we keep locations merely for debug info.

Richard.

>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-23 13:43   ` Richard Biener
@ 2015-09-23 13:53     ` Michael Matz
  2015-09-23 15:51       ` Jeff Law
  2015-09-24  2:39     ` David Malcolm
  1 sibling, 1 reply; 83+ messages in thread
From: Michael Matz @ 2015-09-23 13:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: David Malcolm, GCC Patches

Hi,

On Wed, 23 Sep 2015, Richard Biener wrote:

> The issue we have with LTO is that the linemap gets populated in quite 
> random order and thus we repeatedly switch files (we've mitigated this 
> somewhat for GCC 5).

Yes.

> We also considered dropping column info (and would drop range info) as 
> diagnostics are from optimizers only with LTO and we keep locations 
> merely for debug info.

Those would be the obvious mitigations, yes.  I do like the fact that we'd 
be able to do all this without enlarging location_t.


Ciao,
Micha.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-23 13:53     ` Michael Matz
@ 2015-09-23 15:51       ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-09-23 15:51 UTC (permalink / raw)
  To: Michael Matz, Richard Biener; +Cc: David Malcolm, GCC Patches

On 09/23/2015 07:47 AM, Michael Matz wrote:
> Hi,
>
> On Wed, 23 Sep 2015, Richard Biener wrote:
>
>> The issue we have with LTO is that the linemap gets populated in quite
>> random order and thus we repeatedly switch files (we've mitigated this
>> somewhat for GCC 5).
>
> Yes.
>
>> We also considered dropping column info (and would drop range info) as
>> diagnostics are from optimizers only with LTO and we keep locations
>> merely for debug info.
>
> Those would be the obvious mitigations, yes.  I do like the fact that we'd
> be able to do all this without enlarging location_t.
That's the hope.

However, I did ask David to ponder the effects if ultimately we did need 
to extend location_t to 64 bits.

Jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-23 13:43   ` Richard Biener
  2015-09-23 13:53     ` Michael Matz
@ 2015-09-24  2:39     ` David Malcolm
  2015-09-24  9:03       ` Richard Biener
  1 sibling, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-24  2:39 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, GCC Patches

On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
> > Hi,
> >
> > On Tue, 22 Sep 2015, David Malcolm wrote:
> >
> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> >> table ever get smaller, or does it only ever get inserted into?
> >
> > It only ever grows.
> >
> >> An idea I had is that we could stash short ranges directly into the 32
> >> bits of location_t, by offsetting the per-column-bits somewhat.
> >
> > It's certainly worth an experiment: let's say you restrict yourself to
> > tokens less than 8 characters, you need an additional 3 bits (using one
> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> > numbers (for the normal 8 bit columns), which might be enough for most
> > single-file compilations.  For LTO compilation this often won't be enough.
> >
> >> My plan is to investigate the impact these patches have on the time and
> >> memory consumption of the compiler,
> >
> > When you do so, make sure you're also measuring an LTO compilation with
> > debug info of something big (firefox).  I know that we already had issues
> > with the size of the linemap data in the past for these cases (probably
> > when we added columns).
> 
> The issue we have with LTO is that the linemap gets populated in quite
> random order and thus we repeatedly switch files (we've mitigated this
> somewhat for GCC 5).  We also considered dropping column info
> (and would drop range info) as diagnostics are from optimizers only
> with LTO and we keep locations merely for debug info.

Thanks.  Presumably the mitigation you're referring to is the
lto_location_cache class in lto-streamer-in.c?

Am I right in thinking that, right now, the LTO code doesn't support
ad-hoc locations? (presumably the block pointers only need to exist
during optimization, which happens after the serialization)

The obvious simplification would be, as you suggest, to not bother
storing range information with LTO, falling back to just the existing
representation.  Then there's no need to extend LTO to serialize ad-hoc
data; simply store the underlying locus into the bit stream.  I think
that this happens already: lto-streamer-out.c calls expand_location and
stores the result, so presumably any ad-hoc location_t values made by
the v2 patches would have dropped their range data there when I ran the
test suite.

If it's acceptable to not bother with ranges for LTO, one way to do the
"stashing short ranges into the location_t" idea might be for the
bits-per-range of location_t values to be a property of the line_table
(or possibly the line map), set up when the struct line_maps is created.
For non-LTO it could be some tuned value (maybe from a param?); for LTO
it could be zero, so that we have as many bits as before for line/column
data.
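
As a sketch of what "bits-per-range as a property of the line table" could mean in practice, something like the following.  The range_encoding struct and both helpers are hypothetical stand-ins (not existing libcpp fields), and it assumes 31 usable bits with the top bit marking ad-hoc locations and the value zero reserved as an escape.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-line_maps setting; not a real libcpp field.
struct range_encoding
{
  unsigned range_bits;   // 0 means "never store ranges inline" (e.g. for LTO)
};

// Assumption: the top bit of the 32 marks ad-hoc locations.
const unsigned LOCATION_BITS = 31;

// Number of bits left over for line/column data.
static unsigned
bits_for_line_and_column (const range_encoding &enc)
{
  return LOCATION_BITS - enc.range_bits;
}

// Longest range (in characters) that fits inline, assuming the value 0
// is reserved as the escape.  With range_bits == 0 nothing fits inline.
static unsigned
max_inline_range (const range_encoding &enc)
{
  return enc.range_bits ? (1u << enc.range_bits) - 1 : 0;
}
```

With range_bits set to 0 the line/column budget is unchanged from today; a tuned value of 3 would give up three of those bits in exchange for inline ranges of up to 7 characters.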

Hope this sounds sane
Dave

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-24  2:39     ` David Malcolm
@ 2015-09-24  9:03       ` Richard Biener
  2015-09-25 16:50         ` Jeff Law
  2015-10-13 15:33         ` Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)) David Malcolm
  0 siblings, 2 replies; 83+ messages in thread
From: Richard Biener @ 2015-09-24  9:03 UTC (permalink / raw)
  To: David Malcolm; +Cc: Michael Matz, GCC Patches

On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
>> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
>> > Hi,
>> >
>> > On Tue, 22 Sep 2015, David Malcolm wrote:
>> >
>> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
>> >> table ever get smaller, or does it only ever get inserted into?
>> >
>> > It only ever grows.
>> >
>> >> An idea I had is that we could stash short ranges directly into the 32
>> >> bits of location_t, by offsetting the per-column-bits somewhat.
>> >
>> > It's certainly worth an experiment: let's say you restrict yourself to
>> > tokens less than 8 characters, you need an additional 3 bits (using one
>> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
>> > numbers (for the normal 8 bit columns), which might be enough for most
>> > single-file compilations.  For LTO compilation this often won't be enough.
>> >
>> >> My plan is to investigate the impact these patches have on the time and
>> >> memory consumption of the compiler,
>> >
>> > When you do so, make sure you're also measuring an LTO compilation with
>> > debug info of something big (firefox).  I know that we already had issues
>> > with the size of the linemap data in the past for these cases (probably
>> > when we added columns).
>>
>> The issue we have with LTO is that the linemap gets populated in quite
>> random order and thus we repeatedly switch files (we've mitigated this
>> somewhat for GCC 5).  We also considered dropping column info
>> (and would drop range info) as diagnostics are from optimizers only
>> with LTO and we keep locations merely for debug info.
>
> Thanks.  Presumably the mitigation you're referring to is the
> lto_location_cache class in lto-streamer-in.c?
>
> Am I right in thinking that, right now, the LTO code doesn't support
> ad-hoc locations? (presumably the block pointers only need to exist
> during optimization, which happens after the serialization)

LTO code does support ad-hoc locations but they are "restored" only
when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).

> The obvious simplification would be, as you suggest, to not bother
> storing range information with LTO, falling back to just the existing
> representation.  Then there's no need to extend LTO to serialize ad-hoc
> data; simply store the underlying locus into the bit stream.  I think
> that this happens already: lto-streamer-out.c calls expand_location and
> stores the result, so presumably any ad-hoc location_t values made by
> the v2 patches would have dropped their range data there when I ran the
> test suite.

Yep.  We only preserve BLOCKs, so if you don't add extra code to
preserve ranges they'll be "dropped".

> If it's acceptable to not bother with ranges for LTO, one way to do the
> "stashing short ranges into the location_t" idea might be for the
> bits-per-range of location_t values to be a property of the line_table
> (or possibly the line map), set up when the struct line_maps is created.
> For non-LTO it could be some tuned value (maybe from a param?); for LTO
> it could be zero, so that we have as many bits as before for line/column
> data.

That could be a possibility (likewise for column info?)

Richard.

> Hope this sounds sane
> Dave
>

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)
  2015-09-22 21:33 ` [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2) David Malcolm
@ 2015-09-25  9:49   ` Dodji Seketeli
  2015-09-25 12:34     ` Manuel López-Ibáñez
  2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
  0 siblings, 2 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25  9:49 UTC (permalink / raw)
  To: David Malcolm
  Cc: gcc-patches, Jason Merrill, Tobias Burnus, Joseph S. Myers,
	Manuel López-Ibáñez, Mike Stump, Rainer Orth

Hello David,

I like this!  Thank you very much for working on this.

I think this patch is in great shape; once we agree on the nits I have
commented on below, I think it should go in.  It also depends on the
first patch (1/5, the DejaGnu changes), which needs to go in before this
one.  I defer to the DejaGnu maintainers for that one.

The line-map parts look OK to me too, but I have no approval authority
over those, so I defer to the FE maintainers.  The diagnostics parts of
the Fortran, C++ and C FEs look good to me too; these are just
well-contained mechanical adjustments, but again I defer to the FE
maintainers for the final word.

Please find my comments below.

[...]

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c

[...]

+/* A class to inject colorization codes when printing the diagnostic locus,
+   tracking state as it goes.  */
+
+class colorizer
+{

[...]

    +  void set_state (int state);
    +  void begin_state (int state);
    +  void finish_state (int state);

I think the concept of state could use a little bit of explanation, at
least to say that there are as many states as there are ranges, and
that the 'state' argument to these functions really is the range
index.

Also, I am thinking that there should maybe be a layout::state type,
which would have two notional properties (for now): range_index and
draw_caret_p. So that this function:

+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    int *out_range_idx,
+			    bool *out_draw_caret_p)

would take just one output parameter, e.g., a reference to
layout::state.
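
As a rough illustration of the single-output-parameter idea (a sketch
only: point_state and toy_range are hypothetical names, and the
single-row column model is a simplification, not what the patch does):

```cpp
#include <cassert>

// Hypothetical stand-in for the suggested layout::state type.
struct point_state
{
  int range_idx;      // index of the range covering the point
  bool draw_caret_p;  // true if the caret should be drawn at the point
};

// Toy single-row range: columns [start_col, finish_col] are closed.
struct toy_range
{
  int start_col;
  int finish_col;
  int caret_col;
  bool show_caret_p;
};

// Return true and fill OUT if COLUMN lies within one of the ranges;
// return false (leaving OUT untouched) otherwise.
bool
get_state_at_point (const toy_range *ranges, int num_ranges, int column,
                    point_state &out)
{
  assert (num_ranges >= 0);
  for (int i = 0; i < num_ranges; i++)
    {
      const toy_range &r = ranges[i];
      if (column >= r.start_col && column <= r.finish_col)
        {
          out.range_idx = i;
          out.draw_caret_p = r.show_caret_p && column == r.caret_col;
          return true;
        }
    }
  return false;
}
```

The caller then passes one object around instead of two loose output
pointers, and further per-point properties can be added to the struct
later without touching every signature.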


+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)

[...]

+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_finish.file != m_exploc.file)
+	  continue;

I think the second if clause is redundant.

+  if (0)
+    show_ruler (context, line_width, m_x_offset);

This should probably be removed from the final code to be committed.

[...]

+/* Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws)

Please describe what the parameters mean here, especially last_non_ws.
I had to read its code to know that last_non_ws was the *column* of
the last non white space character.

[...]

+void
+layout::print_line (int row)

This function is neat.  I like it! :-)

[...]

 void
 diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
@@ -75,16 +710,25 @@ diagnostic_show_locus (diagnostic_context * context,
     return;

+      /* The GCC 5 routine. */

I'd say the GCC <= 5 routine ;-)

+  else
+    /* The GCC 6 routine.  */

And here, the GCC > 5 routine.

I would be surprised to see this patch in particular incur any
noticeable increase in time and space consumption, but, have you noticed
anything related to that during bootstrap?

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)
  2015-09-22 21:10 ` [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2) David Malcolm
@ 2015-09-25  9:58   ` Dodji Seketeli
  2015-09-25 14:53     ` David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25  9:58 UTC (permalink / raw)
  To: David Malcolm
  Cc: gcc-patches, Manuel López-Ibáñez, Jason Merrill,
	Joseph S. Myers

Hello,

David Malcolm <dmalcolm@redhat.com> a écrit:

> This is an updated version of:
>   https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00736.html
>
> Changes in V2 of the patch:
>   * c_lex_with_flags: don't write through the range ptr if it's NULL
>   * don't add any fields to the C++ frontend's cp_token for now
>   * libcpp/lex.c: prevent usage of stale/uninitialized data in
>     _cpp_temp_token and _cpp_lex_direct.
>
> This patch adds source *range* information to libcpp's cpp_token, and to
> c_token in the C frontend.
>
> As noted before, to minimize churn, I kept the existing location_t
> fields, though in theory these are always just equal to the start of
> the source range.

Funny; I first overlooked this comment of yours, and then when reading the
patch, I asked myself "why keep the existing location_t" ?  I mean, in
here:

     /* A preprocessing token.  This has been carefully packed and should
    -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
    +   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.
    +   FIXME: the above comment is no longer true with this patch.  */
     struct GTY(()) cpp_token {
       source_location src_loc;	/* Location of first char of token.  */
    +  source_range src_range;	/* Source range covered by the token.  */
       ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
       unsigned short flags;		/* flags - see above */
 
You could just change the type of src_loc and make it be a source_range.

Source range could take a converting constructor, that takes a
source_location, so that the existing code that does "cpp_token.src_loc
= a_source_location;" keeps working.

But then, in the previous patch of the series, I see:

+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+

But what if we define a default constructor for that one (as well as one
that takes a source_location)?  Wouldn't that work for the embedding
case that you talk about in that comment?
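
To make the question concrete, here is a minimal sketch of what I have
in mind.  It is an assumption-laden toy, not the patch: toy_token and
set_token_location are hypothetical, and the GTY marker and the debug
method are left out.

```cpp
#include <cassert>

typedef unsigned int source_location;  // stand-in for libcpp's typedef

// source_range with a default and a converting constructor.
struct source_range
{
  source_location m_start;
  source_location m_finish;

  // Default: both ends are 0, i.e. UNKNOWN_LOCATION.
  source_range () : m_start (0), m_finish (0) {}

  // Converting constructor: one location becomes a one-point range.
  source_range (source_location loc) : m_start (loc), m_finish (loc) {}
};

// Hypothetical cut-down token whose src_loc is now a source_range.
struct toy_token
{
  source_range src_loc;
  unsigned short flags;
};

// Existing-style code assigning a plain source_location keeps
// compiling, thanks to the implicit conversion.
inline void
set_token_location (toy_token *tok, source_location loc)
{
  assert (tok != 0);
  tok->src_loc = loc;
}
```

Whether this is workable for structs that embed source_range without
having constructors of their own is exactly the open question.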

The reason why I am asking all this is, memory consumption.  Maybe
you've measured it and it's not relevant, but otherwise, if we could do
away with duplicating the start location and still miminize the churn,
maybe we should try.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)
  2015-09-25  9:49   ` Dodji Seketeli
@ 2015-09-25 12:34     ` Manuel López-Ibáñez
  2015-09-25 16:21       ` Dodji Seketeli
  2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
  1 sibling, 1 reply; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-25 12:34 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: David Malcolm, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On 25 September 2015 at 10:51, Dodji Seketeli <dodji@seketeli.org> wrote:
> The line-map parts are OK to me too, but I have no power on those.  So I

You are listed as "line map" maintainer in MAINTAINERS. I rooted for you! :)

Cheers,

Manuel.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4/5] Implement tree expression tracking in C FE (v2)
  2015-09-22 22:23 ` [PATCH 4/5] Implement tree expression tracking in C FE (v2) David Malcolm
@ 2015-09-25 14:22   ` Dodji Seketeli
  2015-09-25 15:04     ` David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25 14:22 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

Hello,

Similarly to a comment I made on the previous patch of the series,

+++ b/libcpp/include/line-map.h

[...]

     struct GTY(()) location_adhoc_data {
       source_location locus;
    +  source_range src_range;
       void * GTY((skip)) data;
     };

Could we just change the type of locus and make it be a source_range
instead?  With the right converting constructor (in the source_range
class) that takes a source_location, the amount of churn should
hopefully be minimized, or maybe I am missing something ...

[...]

diff --git a/libcpp/line-map.c b/libcpp/line-map.c

[...]

+source_range
+get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
+{

Please add a comment for this function.

[...]

diff --git a/gcc/tree.c b/gcc/tree.c

+void
+set_source_range (tree *expr, location_t start, location_t finish)

Please add a comment for this function and its overloads.

[...]

+void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish)
+{

Likewise.

+location_t
+set_block (location_t loc, tree block)
+{

Likewise.  I am wondering if we shouldn't even change the name of this
function to better suit what it does: associate a tree to a location.

+  source_range src_range;
+  if (IS_ADHOC_LOC (loc))
+    /* FIXME: can we update in-place?  */
+    src_range = get_range_from_adhoc_loc (line_table, loc);

Hmmh, at the moment, I don't think we can update an entry of the adhoc
map that associates {location, range, tree} as all three components
contribute to the hash value of the entry.  A new triplet means a new
entry.

My understanding is that initially the two elements of the pair
{location, tree} were contributing to the hash value because the same
location could very well belong to different blocks.
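
The "new triplet means a new entry" behaviour can be sketched with a
toy table.  This is an illustration only: adhoc_key and
toy_adhoc_table are hypothetical, and an ordered std::map stands in
for the hash table that libcpp actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

typedef unsigned int source_location;  // stand-in for libcpp's typedef

// All four fields take part in the lookup key.
struct adhoc_key
{
  source_location locus;
  source_location start;
  source_location finish;
  const void *data;  // e.g. the BLOCK

  bool operator< (const adhoc_key &o) const
  {
    if (locus != o.locus) return locus < o.locus;
    if (start != o.start) return start < o.start;
    if (finish != o.finish) return finish < o.finish;
    return std::less<const void *> () (data, o.data);
  }
};

class toy_adhoc_table
{
 public:
  // Return the index of the entry for KEY, appending one if needed.
  unsigned get_or_add (const adhoc_key &key)
  {
    assert (m_index.size () == m_entries.size ());
    std::map<adhoc_key, unsigned>::iterator it = m_index.find (key);
    if (it != m_index.end ())
      return it->second;
    unsigned idx = static_cast<unsigned> (m_entries.size ());
    m_entries.push_back (key);
    m_index[key] = idx;
    return idx;
  }

  std::size_t size () const { return m_entries.size (); }

 private:
  std::map<adhoc_key, unsigned> m_index;
  std::vector<adhoc_key> m_entries;
};
```

Changing only the range of an existing {location, range, tree} triplet
yields a different key, so the toy table appends a second entry rather
than updating the first one in place.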

+  else
+    src_range = source_range::from_location (loc);
+
+  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
+}

[...]

@@ -6091,6 +6112,10 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 
   if (c_parser_next_token_is_not (parser, CPP_QUERY))
     return cond;
+  if (cond.value != error_mark_node)
+    start = cond.src_range.m_start;
+  else
+    start = UNKNOWN_LOCATION;

I think that "getting the start range of a c_expr" is an operation
that is generally useful; and it's also generally useful for that
operation to handle cases where the tree carried by the c_expr can be
an error_mark_node.  So maybe it would be appropriate to have a getter
function for that operation and use it here instead.

You would then use that operation in the other places of this patch
that are getting c_expr::src_range.m_start -- by the way, those other
places are not handling the error_mark_node case like the above.
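
Something along these lines, as a sketch only: the getter name
c_expr_get_start is hypothetical, and tree, error_mark_node and c_expr
are reduced to toy stand-ins so that the snippet is self-contained.

```cpp
#include <cassert>

typedef unsigned int location_t;
const location_t UNKNOWN_LOCATION = 0;

struct source_range
{
  location_t m_start;
  location_t m_finish;
};

// Toy stand-ins for GCC's tree and error_mark_node.
struct tree_node { int code; };
typedef tree_node *tree;
static tree_node error_mark_storage = { -1 };
static const tree error_mark_node = &error_mark_storage;

// Cut-down c_expr with only the fields needed here.
struct c_expr
{
  tree value;
  source_range src_range;
};

// Hypothetical getter: the start of EXPR's range, or UNKNOWN_LOCATION
// if EXPR carries an error_mark_node.
inline location_t
c_expr_get_start (const c_expr &expr)
{
  if (expr.value == error_mark_node)
    return UNKNOWN_LOCATION;
  return expr.src_range.m_start;
}

// Usage: an erroneous expression never leaks its (possibly stale)
// range data.
inline void
demo_getter ()
{
  c_expr bad = { error_mark_node, { 7, 9 } };
  assert (c_expr_get_start (bad) == UNKNOWN_LOCATION);
}
```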

[...]

+++ b/gcc/c/c-tree.h
@@ -132,6 +132,9 @@ struct c_expr
      The type of an enum constant is a plain integer type, but this
      field will be the enum type.  */
   tree original_type;
+
+  /* FIXME.  */
+  source_range src_range;
 };

Why a FIXME here?

+#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))

Please add a comment for this.

+#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK ((NODE))))

Likewise.

+#define EXPR_HAS_RANGE(NODE) \
+    (CAN_HAVE_RANGE_P (NODE) \
+     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
+     : false)
+

Likewise.

[...]
 
+static inline source_range
+get_expr_source_range (tree expr)

Likewise.

+#define DECL_LOCATION_RANGE(NODE) \
+  (get_decl_source_range (DECL_MINIMAL_CHECK (NODE)))
+

Likewise.

+static inline source_range
+get_decl_source_range (tree decl)
+{

Likewise.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)
  2015-09-25  9:58   ` Dodji Seketeli
@ 2015-09-25 14:53     ` David Malcolm
  2015-09-25 16:15       ` Dodji Seketeli
  0 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-25 14:53 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, Manuel López-Ibáñez, Jason Merrill,
	Joseph S. Myers

On Fri, 2015-09-25 at 11:13 +0200, Dodji Seketeli wrote:
> Hello,
> 
> David Malcolm <dmalcolm@redhat.com> a écrit:
> 
> > This is an updated version of:
> >   https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00736.html
> >
> > Changes in V2 of the patch:
> >   * c_lex_with_flags: don't write through the range ptr if it's NULL
> >   * don't add any fields to the C++ frontend's cp_token for now
> >   * libcpp/lex.c: prevent usage of stale/uninitialized data in
> >     _cpp_temp_token and _cpp_lex_direct.
> >
> > This patch adds source *range* information to libcpp's cpp_token, and to
> > c_token in the C frontend.
> >
> > As noted before, to minimize churn, I kept the existing location_t
> > fields, though in theory these are always just equal to the start of
> > the source range.
> 
> Funny; I first overlooked this comment of yours, and then when reading the
> patch, I asked myself "why keep the existing location_t" ?  I mean, in
> here:
> 
>      /* A preprocessing token.  This has been carefully packed and should
>     -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
>     +   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.
>     +   FIXME: the above comment is no longer true with this patch.  */
>      struct GTY(()) cpp_token {
>        source_location src_loc;	/* Location of first char of token.  */
>     +  source_range src_range;	/* Source range covered by the token.  */
>        ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
>        unsigned short flags;		/* flags - see above */
>  
> You could just change the type of src_loc and make it be a source_range.

Interesting idea.

For the general case of expressions, I want a location to mean a
caret/point plus a range that contains it, but for tokens, the
caret/point is always at the start of the range.  So maybe a src_range
would do here.

That said, in patches 3 and 4 of this kit I'm experimenting with
representation; as I said in the blurb for patch 3: "See the
cover-letter for this patch kit which describes how we might go back to
using just a location_t, and stashing the range inside the location_t.
I'm doing it this way for now to allow for more flexibility as I
benchmark and explore implementation options."

So patches 3 and 4 are rather more experimental than patches 1,2 and 5,
as I find out what the different representations do to the time&memory
consumption of the compiler.

I like the idea of "source_location" and "location_t" becoming a
representation of "(point/caret + range)" rather than just a
point/caret, since this means that we can pass location_t around as
before, but then we can extract ranges from them, and all of the
existing diagnostics ought to "automagically" gain underlines "for
free".  I'm experimenting with ways to try to do that efficiently, with
strategies for packing things into the 32-bits compactly; see e.g.:
 https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html

If so, then cpp_token's src_loc would remain a source_location;
source_location itself becomes richer.

> Source range could take a converting constructor, that takes a
> source_location, so that the existing code that does "cpp_token.src_loc
> = a_source_location;" keeps working.
> 
> But then, in the previous patch of the series, I see:
> 
> +/* A range of source locations.
> +
> +   Ranges are closed:
> +   m_start is the first location within the range,
> +   m_finish is the last location within the range.
> +
> +   We may need a more compact way to store these, but for now,
> +   let's do it the simple way, as a pair.  */
> +struct GTY(()) source_range
> +{
> +  source_location m_start;
> +  source_location m_finish;
> +
> +  void debug (const char *msg) const;
> +
> +  /* We avoid using constructors, since various structs that
> +     don't yet have constructors will embed instances of
> +     source_range.  */
> +
> 
> But what if we define a default constructor for that one (as well as one
> that takes a source_location)?  Wouldn't that work for the embedding
> case that you talk about in that comment?

Perhaps, but I worry that it would lead to a cascade, where suddenly
we'd need constructors for various other types.  I can try it, I guess.

> The reason why I am asking all this is, memory consumption.  Maybe
> you've measured it and it's not relevant, but otherwise, if we could do
> away with duplicating the start location and still minimize the churn,
> maybe we should try.

(nods)   I'm experimenting with some different approaches here.

Thanks
Dave

[BTW, I'm about to disappear on a vacation from tomorrow until October
6th, and away from computers]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4/5] Implement tree expression tracking in C FE (v2)
  2015-09-25 14:22   ` Dodji Seketeli
@ 2015-09-25 15:04     ` David Malcolm
  2015-09-25 16:36       ` Dodji Seketeli
  0 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-25 15:04 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: gcc-patches

On Fri, 2015-09-25 at 16:06 +0200, Dodji Seketeli wrote:
> Hello,
> 
> Similarly to a comment I made on the previous patch of the series,
> 
> +++ b/libcpp/include/line-map.h
> 
> [...]
> 
>      struct GTY(()) location_adhoc_data {
>        source_location locus;
>     +  source_range src_range;
>        void * GTY((skip)) data;
>      };
> 
> Could we just change the type of locus and make it be a source_range
> instead?  With the right converting constructor (in the source_range
> class) that takes a source_location, the amount of churn should
> hopefully be minimized, or maybe I am missing something ...

Thanks.

I think that the above is one place where we *would* want both locus and
src_range.

One key idea within this patch is for source_location/location_t to be
able to track both a point/caret/locus and a range containing it.   The
idea is to stash the range information within the ad-hoc table.

For the most simple expressions involving just one token, locus ==
src_range.m_start, but in the most general case, locus,
src_range.m_start and src_range.m_finish are all different from each
other; consider this expression:

  foo && bar
  ~~~~^~~~~~

(i.e. where the caret/locus is at the first '&' and the range starts at
the 'f' of "foo" and finishes at the 'r' of "bar").
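
The relationship between the three values can be sketched as follows.
This is a toy model, assuming 1-based columns on a single line;
toy_rich_loc and render_underline are hypothetical names, not code
from the patch kit.

```cpp
#include <cassert>
#include <string>

// Toy model of "caret plus a range that contains it".
struct toy_rich_loc
{
  int caret;   // column of the caret/locus
  int start;   // first column of the range
  int finish;  // last column of the range (ranges are closed)
};

// Build the annotation line printed under the source line.
std::string
render_underline (const toy_rich_loc &loc)
{
  assert (loc.start >= 1);
  assert (loc.start <= loc.caret && loc.caret <= loc.finish);
  std::string out (static_cast<std::string::size_type> (loc.start - 1), ' ');
  for (int col = loc.start; col <= loc.finish; col++)
    out += (col == loc.caret) ? '^' : '~';
  return out;
}
```

With caret = 5, start = 1 and finish = 10 this gives the underline
shown above for "foo && bar"; for a single token, caret == start and
the caret is the first character of the underline.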

As noted elsewhere, we might try to pack short ranges directly into the
32 bits of source_location:
 https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html
which would avoid having to use ad-hoc for such short ranges; ideally
most tokens.  I'm experimenting with that (I don't have it fully working
yet).
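
Purely to illustrate the kind of packing involved, here is a sketch
under assumptions of my own: the 5-bit field, the layout and every
name below are hypothetical, and this is not the scheme described in
the message linked above.

```cpp
#include <cassert>

typedef unsigned int source_location;  // assumed to be 32 bits wide

// Hypothetical split: low RANGE_BITS bits hold (finish - start), the
// remaining high bits hold the start.
const unsigned RANGE_BITS = 5;
const source_location RANGE_MASK = (1u << RANGE_BITS) - 1;

// A range can be packed if it is short enough and the start still fits
// in the remaining bits.
inline bool
can_pack_range (source_location start, source_location finish)
{
  return finish >= start
         && finish - start <= RANGE_MASK
         && start <= (0xFFFFFFFFu >> RANGE_BITS);
}

inline source_location
pack_range (source_location start, source_location finish)
{
  assert (can_pack_range (start, finish));
  return (start << RANGE_BITS) | (finish - start);
}

inline source_location
packed_start (source_location packed)
{
  return packed >> RANGE_BITS;
}

inline source_location
packed_finish (source_location packed)
{
  return packed_start (packed) + (packed & RANGE_MASK);
}
```

Ranges that fail can_pack_range would have to fall back to the ad-hoc
table.  The cost is visible in the sketch: every bit given to the range
is a bit taken away from line/column data.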

> [...]
> 
> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> 
> [...]
> 
> +source_range
> +get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
> +{
> 
> Please add a comment for this function.

(nods)

> [...]
> 
> diff --git a/gcc/tree.c b/gcc/tree.c
> 
> +void
> +set_source_range (tree *expr, location_t start, location_t finish)
> 
> Please add a comment for this function and its overloads.

(nods)

> [...]
> 
> +void
> +set_c_expr_source_range (c_expr *expr,
> +			 location_t start, location_t finish)
> +{
> 
> Likewise.

(nods)

> +location_t
> +set_block (location_t loc, tree block)
> +{
> 
> Likewise.  I am wondering if we shouldn't even change the name of this
> function to better suit what it does: associate a tree to a location.

"associate_tree_with_location" ?

> +  source_range src_range;
> +  if (IS_ADHOC_LOC (loc))
> +    /* FIXME: can we update in-place?  */
> +    src_range = get_range_from_adhoc_loc (line_table, loc);
> 
> Hmmh, at the moment, I don't think we can update an entry of the adhoc
> map that associates {location, range, tree} as all three components
> contribute to the hash value of the entry.  A new triplet means a new
> entry.
> 
> My understanding is that initially the two elements of the pair
> {location, tree} were contributing to the hash value because the same
> location could very well belong to different blocks.

Ah; thanks.

> +  else
> +    src_range = source_range::from_location (loc);
> +
> +  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
> +}
> 
> [...]
> 
> @@ -6091,6 +6112,10 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
>  
>    if (c_parser_next_token_is_not (parser, CPP_QUERY))
>      return cond;
> +  if (cond.value != error_mark_node)
> +    start = cond.src_range.m_start;
> +  else
> +    start = UNKNOWN_LOCATION;
> 
> I think that "getting the start range of a c_expr" is an operation
> that is generally useful; and it's also generally useful for that
> operation to handle cases where the tree carried by the c_expr can be
> an error_mark_node.  So maybe it would be appropriate to have a getter
> function for that operation and use it here instead.

Good idea; thanks.  That will be helpful as I try other representations.

> You would then use that operation in the other places of this patch
> that are getting c_expr::src_range.m_start -- by the way, those other
> places are not handling the error_mark_node case like the above.
> 
> [...]
> 
> +++ b/gcc/c/c-tree.h
> @@ -132,6 +132,9 @@ struct c_expr
>       The type of an enum constant is a plain integer type, but this
>       field will be the enum type.  */
>    tree original_type;
> +
> +  /* FIXME.  */
> +  source_range src_range;
>  };
> 
> Why a FIXME here?

To remind myself before posting the patches that I need to give the
field a descriptive comment, explaining what purpose it serves.

Oops :)

It probably should read something like this:

  /* The source range of this C expression.  This might
     be thought of as redundant, since ranges for
     expressions can be stored in a location_t, but
     not all tree nodes in expressions have a location_t.

     Consider this source code:  

	int test (int foo)
	{
	  return foo * 100;
                ^^^   ^^^
       }

    During C parsing, the ranges for "foo" and "100" are
    stored within this field of c_expr, but after parsing
    to GENERIC, all we have is a VAR_DECL and an
    INTEGER_CST (the former's location is at the top of the
    function, and the latter has no location).

    For consistency, we store ranges for all expressions
    in this field, not just those that don't have a
    location_t. */
  source_range src_range;


> +#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))
> 
> Please add a comment for this.

> +#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK ((NODE))))
> 
> Likewise.

> +#define EXPR_HAS_RANGE(NODE) \
> +    (CAN_HAVE_RANGE_P (NODE) \
> +     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
> +     : false)
> +
> 
> Likewise.
> [...]
>  
> +static inline source_range
> +get_expr_source_range (tree expr)
> 
> Likewise.
> 
> +#define DECL_LOCATION_RANGE(NODE) \
> +  (get_decl_source_range (DECL_MINIMAL_CHECK (NODE)))
> +
> 
> Likewise.
> 
> +static inline source_range
> +get_decl_source_range (tree decl)
> +{
> 
> Likewise.

Thanks
Dave

[BTW, I'm about to disappear on a vacation from tomorrow until October
6th, and will be away from email and computers]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2)
  2015-09-25 14:53     ` David Malcolm
@ 2015-09-25 16:15       ` Dodji Seketeli
  0 siblings, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25 16:15 UTC (permalink / raw)
  To: David Malcolm
  Cc: gcc-patches, Manuel López-Ibáñez, Jason Merrill,
	Joseph S. Myers

David Malcolm <dmalcolm@redhat.com> a écrit:

> On Fri, 2015-09-25 at 11:13 +0200, Dodji Seketeli wrote:
[...]

>> Funny; I first overlooked this comment of yours, and then when reading the
>> patch, I asked myself "why keep the existing location_t" ?  I mean, in
>> here:
>> 
>>      /* A preprocessing token.  This has been carefully packed and should
>>     -   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
>>     +   occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.
>>     +   FIXME: the above comment is no longer true with this patch.  */
>>      struct GTY(()) cpp_token {
>>        source_location src_loc;	/* Location of first char of token.  */
>>     +  source_range src_range;	/* Source range covered by the token.  */
>>        ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
>>        unsigned short flags;		/* flags - see above */
>>  
>> You could just change the type of src_loc and make it be a source_range.
>
> Interesting idea.
>
> For the general case of expressions, I want a location to mean a
> caret/point plus a range that contains it, but for tokens, the
> caret/point is always at the start of the range.  So maybe a src_range
> would do here.
>
> That said, in patches 3 and 4 of this kit I'm experimenting with
> representation; as I said in the blurb for patch 3: "See the
> cover-letter for this patch kit which describes how we might go back to
> using just a location_t, and stashing the range inside the location_t.
> I'm doing it this way for now to allow for more flexibility as I
> benchmark and explore implementation options."

Right.

> So patches 3 and 4 are rather more experimental than patches 1,2 and 5,
> as I find out what the different representations do to the time&memory
> consumption of the compiler.

Understood.

> I like the idea of "source_location" and "location_t" becoming a
> representation of "(point/caret + range)" rather than just a
> point/caret, since this means that we can pass location_t around as
> before, but then we can extract ranges from them, and all of the
> existing diagnostics ought to "automagically" gain underlines "for
> free".

Me too.

>  I'm experimenting with ways to try to do that efficiently, with
> strategies for packing things into the 32-bits compactly; see e.g.:
>  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html
>
> If so, then cpp_token's src_loc would remain a source_location;
> source_location itself becomes richer.
>
>> Source range could take a converting constructor, that takes a
>> source_location, so that the existing code that does "cpp_token.src_loc
>> = a_source_location;" keeps working.
>> 
>> But then, in the previous patch of the series, I see:
>> 
>> +/* A range of source locations.
>> +
>> +   Ranges are closed:
>> +   m_start is the first location within the range,
>> +   m_finish is the last location within the range.
>> +
>> +   We may need a more compact way to store these, but for now,
>> +   let's do it the simple way, as a pair.  */
>> +struct GTY(()) source_range
>> +{
>> +  source_location m_start;
>> +  source_location m_finish;
>> +
>> +  void debug (const char *msg) const;
>> +
>> +  /* We avoid using constructors, since various structs that
>> +     don't yet have constructors will embed instances of
>> +     source_range.  */
>> +
>> 
>> But what if we define a default constructor for that one (as well as one
>> that takes a source_location)?  Wouldn't that work for the embedding
>> case that you talk about in that comment?
>
> Perhaps, but I worry that it would lead to a cascade, where suddenly
> we'd need constructors for various other types.  I can try it, I
> guess.

If it leads to a cascade, then don't bother.  My point was precisely to
try to avoid the churn, while limiting the amount of data size increase
for cpp_token in general.

As you implied, if we can just stay with a source_location that carries
the information of a point plus a range, even better.

> [BTW, I'm about to disappear on a vacation from tomorrow until October
> 6th, and away from computers]

Thanks for the heads-up.

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)
  2015-09-25 12:34     ` Manuel López-Ibáñez
@ 2015-09-25 16:21       ` Dodji Seketeli
  0 siblings, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25 16:21 UTC (permalink / raw)
  To: Manuel López-Ibáñez
  Cc: David Malcolm, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

Manuel López-Ibáñez <lopezibanez@gmail.com> a écrit:

> On 25 September 2015 at 10:51, Dodji Seketeli <dodji@seketeli.org> wrote:
>> The line-map parts are OK to me too, but I have no power on those.  So I
>
> You are listed as "line map" maintainer in MAINTAINERS. I rooted for
> you! :)

Right, I meant the libcpp parts (which are not line-map.h), sorry :-)

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4/5] Implement tree expression tracking in C FE (v2)
  2015-09-25 15:04     ` David Malcolm
@ 2015-09-25 16:36       ` Dodji Seketeli
  0 siblings, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-25 16:36 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

David Malcolm <dmalcolm@redhat.com> a écrit:

> On Fri, 2015-09-25 at 16:06 +0200, Dodji Seketeli wrote:
>> Hello,
>> 
>> Similarly to a comment I made on the previous patch of the series,
>> 
>> +++ b/libcpp/include/line-map.h
>> 
>> [...]
>> 
>>      struct GTY(()) location_adhoc_data {
>>        source_location locus;
>>     +  source_range src_range;
>>        void * GTY((skip)) data;
>>      };
>> 
>> Could we just change the type of locus and make it be a source_range
>> instead?  With the right converting constructor (in the source_range
>> class) that takes a source_location, the amount of churn should
>> hopefully be minimized, or maybe I am missing something ...
>
> Thanks.
>
> I think that the above is one place where we *would* want both locus and
> src_range.

Right, as opposed to the token case where conceptually, all we need is
the beginning and the end of the token.  My confusion.

[...]

> As noted elsewhere, we might try to pack short ranges directly into the
> 32 bits of source_location:
>  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01826.html
> which would avoid having to use ad-hoc for such short ranges; ideally
> most tokens.  I'm experimenting with that (I don't have it fully working
> yet).

Right.  My personal inclination would be to make the general case of
storing all ranges in this adhoc data structure, or, even, into another
on-the-side data structure, similar to the adhoc one, but dedicated to
ranges associated with points, as you see fit.

Then when that works, consider the optimization of stuffing short ranges
into the 32 bits of source_location, depending on the memory profiles we
get.  But it's your call :-)

[...]

>> +location_t
>> +set_block (location_t loc, tree block)
>> +{
>> 
>> Likewise.  I am wondering if we shouldn't even change the name of this
>> function to better suit what it does: associate a tree to a location.
>
> "associate_tree_with_location" ?

If you find it cool, I am cool :-)

[...]

>> +++ b/gcc/c/c-tree.h
>> @@ -132,6 +132,9 @@ struct c_expr
>>       The type of an enum constant is a plain integer type, but this
>>       field will be the enum type.  */
>>    tree original_type;
>> +
>> +  /* FIXME.  */
>> +  source_range src_range;
>>  };
>> 
>> Why a FIXME here?
>
> To remind myself before posting the patches that I need to give the
> field a descriptive comment, explaining what purpose it serves.
>
> Oops :)
>
> It probably should read something like this:
>
>   /* The source range of this C expression.  This might
>      be thought of as redundant, since ranges for
>      expressions can be stored in a location_t, but
>      not all tree nodes in expressions have a location_t.
>
>      Consider this source code:  
>
> 	int test (int foo)
> 	{
> 	  return foo * 100;
>                 ^^^   ^^^
>        }
>
>     During C parsing, the ranges for "foo" and "100" are
>     stored within this field of c_expr, but after parsing
>     to GENERIC, all we have is a VAR_DECL and an
>     INTEGER_CST (the former's location is at the top of the
>     function, and the latter has no location).
>
>     For consistency, we store ranges for all expressions
>     in this field, not just those that don't have a
>     location_t. */
>   source_range src_range;

Great, thanks.

[...]

> [BTW, I'm about to disappear on a vacation from tomorrow until October
> 6th, and will be away from email and computers]

Thanks for the heads-up.

Cheers,

-- 
		Dodji


* Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)
  2015-09-24  9:03       ` Richard Biener
@ 2015-09-25 16:50         ` Jeff Law
  2015-10-13 15:33         ` Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)) David Malcolm
  1 sibling, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-09-25 16:50 UTC (permalink / raw)
  To: Richard Biener, David Malcolm; +Cc: Michael Matz, GCC Patches

On 09/24/2015 02:15 AM, Richard Biener wrote:
> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
>
> LTO code does support ad-hoc locations but they are "restored" only
> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
>
>> The obvious simplification would be, as you suggest, to not bother
>> storing range information with LTO, falling back to just the existing
>> representation.  Then there's no need to extend LTO to serialize ad-hoc
>> data; simply store the underlying locus into the bit stream.  I think
>> that this happens already: lto-streamer-out.c calls expand_location and
>> stores the result, so presumably any ad-hoc location_t values made by
>> the v2 patches would have dropped their range data there when I ran the
>> test suite.
>
> Yep.  We only preserve BLOCKs, so if you don't add extra code to
> preserve ranges they'll be "dropped".
Right, but as David pointed out, most of the interesting uses for ranges 
(at least right now) are in the front-end diagnostics.  So losing them 
at this point isn't a major loss IMHO.

jeff



* Re: [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands
  2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
@ 2015-09-25 17:22   ` Jeff Law
  2015-09-27  1:29     ` Bernhard Reutner-Fischer
  0 siblings, 1 reply; 83+ messages in thread
From: Jeff Law @ 2015-09-25 17:22 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 09/22/2015 03:26 PM, David Malcolm wrote:
> This patch is essentially identical to v1 here:
>    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00729.html
> The only change is in the ChangeLog, moving the libgo.exp
> ChangeLog entry into gcc/testsuite/ChangeLog, analogous to
> where Ian put it when introducing the file in r167407.
>
> OK for trunk?
>
> Blurb from v1 follows:
>
> This patch adds an easy way to write tests for expected multiline
> output.  For example we can test carets and underlines for
> a particular diagnostic with:
>
> /* { dg-begin-multiline-output "" }
>   typedef struct _GMutex GMutex;
>                  ^~~~~~~
>     { dg-end-multiline-output "" } */
>
> It is used extensively by the rest of the patch kit.
>
> multiline.exp is used by prune.exp; hence we need to load it before
> prune.exp via *load_gcc_lib* for the testsuites of the various
> non-"gcc" support libraries (e.g. boehm-gc).
>
> gcc/testsuite/ChangeLog:
> 	* lib/multiline.exp: New file.
> 	* lib/prune.exp: Load multiline.exp.
> 	(prune_gcc_output): Call into multiline.exp to handle any
> 	multiline output directives.
> 	* lib/libgo.exp: Load multiline.exp before prune.exp, using
> 	load_gcc_lib.
>
> boehm-gc/ChangeLog:
> 	* testsuite/lib/boehm-gc.exp: Load multiline.exp before
> 	prune.exp, using load_gcc_lib.
>
> libatomic/ChangeLog:
> 	* testsuite/lib/libatomic.exp: Load multiline.exp before
> 	prune.exp, using load_gcc_lib.
>
> libgomp/ChangeLog:
> 	* testsuite/lib/libgomp.exp: Load multiline.exp before prune.exp,
> 	using load_gcc_lib.
>
> libitm/ChangeLog:
> 	* testsuite/lib/libitm.exp: Load multiline.exp before prune.exp,
> 	using load_gcc_lib.
>
> libvtv/ChangeLog:
> 	* testsuite/lib/libvtv.exp: Load multiline.exp before prune.exp,
> 	using load_gcc_lib.
This stalled due to the dejagnu version discussion, which itself has 
stalled :(

I think the only issue was the loading of prune.exp and until we've 
jumped to the latest dejagnu, using load_gcc_lib is the approved way to 
deal with that problem.

Soooo.

Approved.  Hopefully we'll be able to clean up the load_gcc_lib mess in 
the near future, but I don't see a good reason to continue to hold up 
this patch.

jeff



* [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25  9:49   ` Dodji Seketeli
  2015-09-25 12:34     ` Manuel López-Ibáñez
@ 2015-09-25 20:39     ` David Malcolm
  2015-09-25 20:42       ` Manuel López-Ibáñez
                         ` (2 more replies)
  1 sibling, 3 replies; 83+ messages in thread
From: David Malcolm @ 2015-09-25 20:39 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, Jason Merrill, Tobias Burnus, Joseph S. Myers,
	Manuel López-Ibáñez, Mike Stump, Rainer Orth

[-- Attachment #1: Type: text/plain, Size: 7449 bytes --]

On Fri, 2015-09-25 at 10:51 +0200, Dodji Seketeli wrote:
> Hello David,
> 
> I like this!  Thank you very much for working on this.

Thanks for the review.

> I think this patch is in great shape, and once we agree on some of the
> nits I have commented on below, I think it should go in. Oh, it also
> needs the first patch (1/5, dejagnu first) to go in first, as this one
> depends on it.  I defer to the dejagnu maintainers for that one.

Indeed.  Jeff just approved it, fwiw.

> The line-map parts are OK to me too, but I have no power on those.  So I
> defer to the FE maintainers for that.  The diagnostics parts of the
> Fortran, C++ and C FE look good to me too; these are just well contained
> mechanical adjustments, but I defer to FE maintainers for the final
> word.
> 
> Please find my comments below.

Updated patch attached.

> [...]
> 
> diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
> 
> [...]
> 
> +/* A class to inject colorization codes when printing the diagnostic locus,
> +   tracking state as it goes.  */
> +
> +class colorizer
> +{
> 
> [...]
> 
>     +  void set_state (int state);
>     +  void begin_state (int state);
>     +  void finish_state (int state);
> 
> I think the concept of state could use a little bit of explanation, at
> least to say that there are the same number of states, as the number
> of ranges.  And that the 'state' argument to these functions really is
> the range index.

Here's the revised comment I put in the attached patch:

+/* A class to inject colorization codes when printing the diagnostic locus.
+
+   It has one kind of colorization for each of:
+     - normal text
+     - range 0 (the "primary location")
+     - range 1
+     - range 2
+
+   The class caches the lookup of the color codes for the above.
+
+   The class also has responsibility for tracking which of the above is
+   active, filtering out unnecessary changes.  This allows layout::print_line
+   to simply request a colorization code for *every* character it prints
+   through this class, and have the filtering be done for it here.  */

Hopefully that comment explains the possible states the colorizer can
have.
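
A minimal sketch of the filtering idea described in that comment, not the
code from the patch: toy_colorizer and render are hypothetical names, and
the "<r0>"/"</r0>" markers stand in for the real SGR escape sequences so
that the result is easy to read.  The caller requests a state for every
character, and only actual state changes produce output.

```cpp
#include <string>

// Hypothetical stand-in for the patch's colorizer: it records output in a
// std::string instead of writing to a pretty_printer, and uses visible
// markers instead of terminal color codes.
class toy_colorizer
{
 public:
  static const int STATE_NORMAL_TEXT = -1;

  toy_colorizer () : m_current_state (STATE_NORMAL_TEXT) {}

  void set_range (int range_idx) { set_state (range_idx); }
  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }

  // Emit one character in whatever state is currently active.
  void put (char ch) { m_out += ch; }

  const std::string &str () const { return m_out; }

 private:
  void set_state (int new_state)
  {
    if (new_state == m_current_state)
      return;			// Filter out redundant requests.
    finish_state (m_current_state);
    m_current_state = new_state;
    begin_state (new_state);
  }

  void begin_state (int state)
  {
    if (state != STATE_NORMAL_TEXT)
      m_out += "<r" + std::to_string (state) + ">";
  }

  void finish_state (int state)
  {
    if (state != STATE_NORMAL_TEXT)
      m_out += "</r" + std::to_string (state) + ">";
  }

  int m_current_state;
  std::string m_out;
};

// Print TEXT, requesting a state for *every* character: 0-based columns
// in [FIRST, LAST] belong to range 0, everything else is normal text.
std::string
render (const std::string &text, int first, int last)
{
  toy_colorizer c;
  for (int col = 0; col < (int) text.size (); col++)
    {
      if (col >= first && col <= last)
	c.set_range (0);
      else
	c.set_normal_text ();
      c.put (text[col]);
    }
  c.set_normal_text ();		// Close any range still open.
  return c.str ();
}
```

Calling render ("x = y + z", 4, 8) yields "x = <r0>y + z</r0>": one begin
marker and one end marker, even though a state was requested nine times.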

FWIW I have a follow-up patch to add support for fix-it hints, so they
might be another kind of colorization state.
(see https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00732.html for the
earlier version of said patch, in v1 of the kit).

> Also, I am thinking that there should maybe be a layout::state type,
> which would have two notional properties (for now): range_index and
> draw_caret_p. So that this function:
> 
> +bool
> +layout::get_state_at_point (/* Inputs.  */
> +			    int row, int column,
> +			    int first_non_ws, int last_non_ws,
> +			    /* Outputs.  */
> +			    int *out_range_idx,
> +			    bool *out_draw_caret_p)
> 
> Would take just one output parameter, e.g, a reference to
> layout::state.

Fixed, though I called it "struct point_state", given that it's coming
from get_state_at_point.  I passed it by pointer, since AFAIK our coding
standards don't yet approve of the use of references in the codebase
(outside of places where we need them e.g. container classes).
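
As a rough illustration of the single out-parameter shape, here is a
sketch under simplifying assumptions: toy_range is a hypothetical type
restricted to single-line ranges, and the signature is shortened compared
with the real layout::get_state_at_point.  Only the point_state fields
match the patch.

```cpp
#include <cstddef>
#include <vector>

// Same two fields as the struct point_state added in the patch.
struct point_state
{
  int range_idx;
  bool draw_caret_p;
};

// Hypothetical, simplified range: confined to one source line.
struct toy_range
{
  int line;
  int start_column;
  int finish_column;
  bool show_caret_p;
  int caret_column;
};

// Return true and fill in *OUT_STATE if (ROW, COLUMN) lies within one of
// RANGES; return false, leaving *OUT_STATE untouched, otherwise.
bool
get_state_at_point (const std::vector<toy_range> &ranges,
		    int row, int column,
		    point_state *out_state)
{
  for (size_t i = 0; i < ranges.size (); i++)
    {
      const toy_range &r = ranges[i];
      if (row != r.line
	  || column < r.start_column
	  || column > r.finish_column)
	continue;
      out_state->range_idx = (int) i;
      out_state->draw_caret_p = r.show_caret_p && column == r.caret_column;
      return true;
    }
  return false;
}
```

Bundling the outputs means a later addition (a fix-it hint state, say)
would only touch the struct, not every caller's argument list.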

I also added a unit test for a rich_location with two caret locations
(mimicking one of the Fortran examples), to give us coverage for this
case:

+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = x + y
+        A   B
+   { dg-end-multiline-output "" } */
+#endif
+}

where the "A" and "B" as caret chars are coming from new code in the
show_locus unittest plugin.

> +layout::layout (diagnostic_context * context,
> +		const diagnostic_info *diagnostic)
> 
> [...]
> 
> +      if (loc_range->m_finish.file != m_exploc.file)
> +	continue;
> +      if (loc_range->m_show_caret_p)
> +	if (loc_range->m_finish.file != m_exploc.file)
> +	  continue;
> 
> I think the second if clause is redundant.

Good catch; thanks.  The second if clause was meant to be testing
m_caret.file.  Fixed in the updated patch.
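
A sketch of the intended filtering with the corrected caret test.  The
toy_* types and printable_range_indices are hypothetical, file names are
compared as std::string for simplicity, and the check on m_start is an
assumption, since that part of the constructor is elided in the quote
above.

```cpp
#include <string>
#include <vector>

// Hypothetical simplification of expanded_location.
struct toy_exploc
{
  std::string file;
  int line;
  int column;
};

// Hypothetical simplification of location_range.
struct toy_location_range
{
  toy_exploc m_start;
  toy_exploc m_finish;
  toy_exploc m_caret;
  bool m_show_caret_p;
};

// Return the indices of the ranges that lie entirely in PRIMARY_FILE.
// The caret only matters for ranges that actually display one.
std::vector<int>
printable_range_indices (const std::vector<toy_location_range> &ranges,
			 const std::string &primary_file)
{
  std::vector<int> kept;
  for (int i = 0; i < (int) ranges.size (); i++)
    {
      const toy_location_range &r = ranges[i];
      if (r.m_start.file != primary_file)	/* Assumed check.  */
	continue;
      if (r.m_finish.file != primary_file)
	continue;
      if (r.m_show_caret_p)
	if (r.m_caret.file != primary_file)	/* m_caret, not m_finish.  */
	  continue;
      kept.push_back (i);
    }
  return kept;
}
```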

> 
> +  if (0)
> +    show_ruler (context, line_width, m_x_offset);
> 
> This should probably be removed from the final code to be committed.

FWIW, the ruler is very helpful to me when debugging the locus-printing
(e.g. when adding fix-it-hints), and if we remove that if (0) call, we
get:

warning: ‘void show_ruler(diagnostic_context*, int, int)’ defined but
not used [-Wunused-function]

which will break bootstrap, so perhaps it instead should be an option?
"-fdiagnostics-show-ruler" or somesuch?

I don't know that it would be helpful to end-users though.

I'd prefer to just keep it in the code with the
  if (0)
as-is, since it's useful "scaffolding" for hacking on the code.
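
For illustration, the pattern looks roughly like this outside of GCC.
make_ruler and print_source_line are hypothetical names: the call under
"if (0)" never runs, but it still counts as a use of the static helper,
so -Wunused-function has nothing to complain about.

```cpp
#include <string>

// Debugging aid: a row of column digits, "1234567890123...".
static std::string
make_ruler (int width)
{
  std::string ruler;
  for (int col = 1; col <= width; col++)
    ruler += (char) ('0' + col % 10);
  return ruler;
}

// Return LINE as it would be printed, optionally preceded by a ruler.
std::string
print_source_line (const std::string &line)
{
  std::string out;
  /* Flip to 1 when debugging column arithmetic.  */
  if (0)
    out += make_ruler ((int) line.size ()) + "\n";
  out += line + "\n";
  return out;
}
```

With the condition left at 0, print_source_line ("abc") returns just
"abc\n"; the ruler code is compiled and type-checked but never executed.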

> [...]
> 
> +/* Get the column beyond the rightmost one that could contain a caret or
> +   range marker, given that we stop rendering at trailing whitespace.  */
> +
> +int
> +layout::get_x_bound_for_row (int row, int caret_column,
> +			     int last_non_ws)
> 
> Please describe what the parameters mean here, especially last_non_ws.
> I had to read its code to know that last_non_ws was the *column* of
> the last non white space character.

I renamed it to "last_non_ws_column", and fleshed out the comment:

-/* Get the column beyond the rightmost one that could contain a caret or
-   range marker, given that we stop rendering at trailing whitespace.  */
+/* Helper function for use by layout::print_line when printing the
+   annotation line under the source line.
+   Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.
+   ROW is the source line within the given file.
+   CARET_COLUMN is the column of range 0's caret.
+   LAST_NON_WS_COLUMN is the last column containing a non-whitespace
+   character of source (as determined when printing the source line).  */
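
One way to read that contract, sketched as a free function.  This is an
interpretation of the comment rather than the body from the patch:
toy_layout_range is a hypothetical type, and the handling of multiline
ranges (extend to the last non-whitespace column on every line but the
final one) is an assumption.

```cpp
#include <vector>

// Hypothetical simplification of layout_range: start and finish points.
struct toy_layout_range
{
  int start_line;
  int start_column;
  int finish_line;
  int finish_column;
};

// Return the column just beyond the rightmost one that could hold a
// caret or range marker on ROW.
int
get_x_bound_for_row (const std::vector<toy_layout_range> &ranges,
		     int row, int caret_column,
		     int last_non_ws_column)
{
  /* Always leave room for range 0's caret.  */
  int result = caret_column + 1;

  for (const toy_layout_range &range : ranges)
    {
      if (row < range.start_line || row > range.finish_line)
	continue;
      if (row == range.finish_line)
	{
	  /* Final line of the range: render up to its end.  */
	  if (result <= range.finish_column)
	    result = range.finish_column + 1;
	}
      else
	{
	  /* Range continues on a later line: render up to the last
	     non-whitespace column of this line.  */
	  if (result <= last_non_ws_column)
	    result = last_non_ws_column + 1;
	}
    }
  return result;
}
```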

> [...]
> 
> +void
> +layout::print_line (int row)
> 
> This function is neat.  I like it! :-)

:)

> [...]
> 
>  void
>  diagnostic_show_locus (diagnostic_context * context,
>  		       const diagnostic_info *diagnostic)
> @@ -75,16 +710,25 @@ diagnostic_show_locus (diagnostic_context * context,
>      return;
> 
> +      /* The GCC 5 routine. */
> 
> I'd say the GCC <= 5 routine ;-)

> +  else
> +    /* The GCC 6 routine.  */
> 
> And here, the GCC > 5 routine.

Changed to "GCC < 6" and "GCC >= 6", on the pedantic grounds that e.g.
5.1 > 5

> I would be surprised to see this patch in particular incur any
> noticeable increase in time and space consumption, but, have you noticed
> anything related to that during bootstrap?

I hadn't noticed it, but I wasn't timing.  I'll have a look.

One possible nit here is that the patch expands locations when
constructing rich_location instances, and it does that for warnings
before the logic to ignore them.  So there may be some extra calls there
that aren't present in trunk, for discarded warnings.  I don't expect
that to affect the speed of the compiler though (I expect it to be lost
in the noise).

Updated patch attached.  It compiles; a bootstrap/regrtest is in
progress, but may not be done before I disappear on vacation.  I believe
it addresses all of the points you raised apart from the show_ruler one.

OK for trunk if it passes bootstrap/regrtest?
(see https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01700.html for the
supporting blurb for v2).

Dave

[-- Attachment #2: diagnostic-show-locus-v3.patch --]
[-- Type: text/x-patch, Size: 107759 bytes --]

commit 980725625dc77a478ef725d9e28d907352c9cbd4
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Mon Aug 31 21:32:20 2015 -0400

    Reimplement diagnostic_show_locus, introducing rich_location classes (v3)
    
    gcc/ChangeLog:
    	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
    	and "range2".
    	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
    	* diagnostic-core.h (warning_at_rich_loc): New declaration.
    	(error_at_rich_loc): New declaration.
    	(permerror_at_rich_loc): New declaration.
    	(inform_at_rich_loc): New declaration.
    	* diagnostic-show-locus.c (struct point_state): New struct.
    	(class colorizer): New class.
    	(class layout_point): New class.
    	(class layout_range): New class.
    	(class layout): New class.
    	(colorizer::colorizer): New ctor.
    	(colorizer::~colorizer): New dtor.
    	(colorizer::set_state): New method.
    	(colorizer::begin_state): New method.
    	(colorizer::finish_state): New method.
    	(layout_range::layout_range): New ctor.
    	(layout_range::contains_point): New method.
    	(get_line_width_without_trailing_whitespace): New function.
    	(layout::layout): New ctor.
    	(layout::print_line): New method.
    	(layout::get_state_at_point): New method.
    	(layout::get_x_bound_for_row): New method.
    	(show_ruler): New function.
    	(diagnostic_show_locus): Call new function diagnostic_print_ranges,
    	falling back to diagnostic_print_caret_line if the frontend has
    	set frontend_calls_diagnostic_print_caret_line_p on the
    	diagnostic_context.
    	(diagnostic_print_ranges): New function.
    	* diagnostic.c (diagnostic_initialize): Replace
    	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
    	(diagnostic_set_info_translated): Convert param from location_t
    	to rich_location *.  Eliminate calls to set_location on the
    	message in favor of storing the rich_location ptr there.
    	(diagnostic_set_info): Convert param from location_t to
    	rich_location *.
    	(diagnostic_build_prefix): Break out array into...
    	(diagnostic_kind_color): New variable.
    	(diagnostic_get_color_for_kind): New function.
    	(diagnostic_report_diagnostic): Colorize the option_text
    	using the color for the severity.
    	(diagnostic_append_note): Update for change in signature of
    	diagnostic_set_info.
    	(diagnostic_append_note_at_rich_loc): New function.
    	(emit_diagnostic): Update for change in signature of
    	diagnostic_set_info.
    	(inform): Likewise.
    	(inform_at_rich_loc): New function.
    	(inform_n): Update for change in signature of diagnostic_set_info.
    	(warning): Likewise.
    	(warning_at): Likewise.
    	(warning_at_rich_loc): New function.
    	(warning_n): Update for change in signature of diagnostic_set_info.
    	(pedwarn): Likewise.
    	(permerror): Likewise.
    	(permerror_at_rich_loc): New function.
    	(error): Update for change in signature of diagnostic_set_info.
    	(error_n): Likewise.
    	(error_at): Likewise.
    	(error_at_rich_loc): New function.
    	(sorry): Update for change in signature of diagnostic_set_info.
    	(fatal_error): Likewise.
    	(internal_error): Likewise.
    	(internal_error_no_backtrace): Likewise.
    	(source_range::debug): New function.
    	* diagnostic.h (struct diagnostic_info): Eliminate field
    	"override_column".  Add field "richloc".
    	(struct diagnostic_context): Convert MAX_LOCATIONS_PER_MESSAGE to
    	rich_location::MAX_RANGES.  Add field
    	"frontend_calls_diagnostic_print_caret_line_p".
    	(diagnostic_override_column): Eliminate this macro.
    	(diagnostic_set_info): Convert param from location_t to
    	rich_location *.
    	(diagnostic_set_info_translated): Likewise.
    	(diagnostic_append_note_at_rich_loc): New function.
    	(diagnostic_num_locations): New function.
    	(diagnostic_expand_location): Get the location from the
    	rich_location.
    	(diagnostic_get_color_for_kind): New declaration.
    	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
    	(error_cb): Update for change in signature of "error" callback.
    	(fatal_at): Likewise.
    	(warning_at): Likewise.
    	* input.c (linemap_client_expand_location_to_spelling_point): New.
    	* pretty-print.c (text_info::set_range): New method.
    	(text_info::get_location): New method.
    	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
    	(struct text_info): Eliminate "locations" array in favor of
    	"m_richloc", a rich_location *.
    	(textinfo::set_location): Add a "caret_p" param, and reimplement
    	in terms of a call to set_range.
    	(textinfo::get_location): Eliminate inline implementation in favor of
    	an out-of-line reimplementation.
    	(textinfo::set_range): New method.
    	* rtl-error.c (diagnostic_for_asm): Update for change in signature
    	of diagnostic_set_info.
    	* tree-diagnostic.c (default_tree_printer): Update for new
    	"caret_p" param for textinfo::set_location.
    	* tree-pretty-print.c (percent_K_format): Likewise.
    
    gcc/c-family/ChangeLog:
    	* c-common.c (c_cpp_error): Convert parameter from location_t to
    	rich_location *.  Eliminate the "column_override" parameter and
    	the call to diagnostic_override_column.
    	Update the "done_lexing" clause to set range 0
    	on the rich_location, rather than overwriting a location_t.
    	* c-common.h (c_cpp_error): Convert parameter from location_t to
    	rich_location *.  Eliminate the "column_override" parameter.
    
    gcc/c/ChangeLog:
    	* c-decl.c (warn_defaults_to): Update for change in signature
    	of diagnostic_set_info.
    	* c-errors.c (pedwarn_c99): Likewise.
    	(pedwarn_c90): Likewise.
    	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
    	for textinfo::set_location.
    
    gcc/cp/ChangeLog:
    	* error.c (cp_printer): Update for new "caret_p" param for
    	textinfo::set_location.
    	(pedwarn_cxx98): Update for change in signature of
    	diagnostic_set_info.
    
    gcc/fortran/ChangeLog:
    	* cpp.c (cb_cpp_error): Convert parameter from location_t to
    	rich_location *.  Eliminate the "column_override" parameter.
    	* error.c (gfc_warning): Update for change in signature of
    	diagnostic_set_info.
    	(gfc_format_decoder): Update handling of %C/%L for changes
    	to struct text_info.
    	(gfc_diagnostic_starter): Use richloc when determining whether to
    	print one locus or two.
    	(gfc_warning_now_at): Update for change in signature of
    	diagnostic_set_info.
    	(gfc_warning_now): Likewise.
    	(gfc_error_now): Likewise.
    	(gfc_fatal_error): Likewise.
    	(gfc_error): Likewise.
    	(gfc_internal_error): Likewise.
    	(gfc_diagnostics_init): Set
    	frontend_calls_diagnostic_print_caret_line_p.
    
    gcc/testsuite/ChangeLog:
    	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
    	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
    	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
    	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
    	* lib/gcc-dg.exp: Load multiline.exp.
    
    libcpp/ChangeLog:
    	* errors.c (cpp_diagnostic): Update for change in signature
    	of "error" callback.
    	(cpp_diagnostic_with_line): Likewise, calling override_column
    	on the rich_location.
    	* include/cpplib.h (struct cpp_callbacks): Within "error"
    	callback, convert param from source_location to rich_location *,
    	and drop column_override param.
    	* include/line-map.h (struct source_range): New struct.
    	(struct location_range): New struct.
    	(class rich_location): New class.
    	(linemap_client_expand_location_to_spelling_point): New declaration.
    	* line-map.c (rich_location::rich_location): New ctors.
    	(rich_location::lazily_expand_location): New method.
    	(rich_location::override_column): New method.
    	(rich_location::add_range): New methods.
    	(rich_location::set_range): New method.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b922bf..ded23d3 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10451,15 +10451,14 @@ c_option_controlling_cpp_error (int reason)
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION unless this is after lexing and the compiler's location
-   should be used instead, with column number possibly overridden by
-   COLUMN_OVERRIDE if not zero; MSG is the translated message and AP
+   RICHLOC unless this is after lexing and the compiler's location
+   should be used instead; MSG is the translated message and AP
    the arguments.  Returns true if a diagnostic was emitted, false
    otherwise.  */
 
 bool
 c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	     location_t location, unsigned int column_override,
+	     rich_location *richloc,
 	     const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -10500,11 +10499,11 @@ c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   if (done_lexing)
-    location = input_location;
+    richloc->set_range (0,
+			source_range::from_location (input_location),
+			true, true);
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   diagnostic_override_option_index (&diagnostic,
                                     c_option_controlling_cpp_error (reason));
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 74d1bc1..bb17fcc 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -981,9 +981,9 @@ extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);
 extern void c_cpp_builtins_optimize_pragma (cpp_reader *, tree, tree);
-extern bool c_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+extern bool c_cpp_error (cpp_reader *, int, int, rich_location *,
 			 const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 extern int c_common_has_attribute (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index a110226..9af447c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5285,9 +5285,10 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
                        flag_isoc99 ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   report_diagnostic (&diagnostic);
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index e5fbf05..0f8b933 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,13 +42,14 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
      than -pedantic.  */
   if (warn_c99_c11_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc11)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc99_c11_compat;
@@ -60,7 +61,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C11, issue a pedwarn.  */
   else if (pedantic && !flag_isoc11)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       warned = report_diagnostic (&diagnostic);
     }
@@ -80,6 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
@@ -90,7 +92,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
         goto out;
       else if (opt_var > 0)
 	{
-	  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+	  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			       (pedantic && !flag_isoc99)
 			       ? DK_PEDWARN : DK_WARNING);
 	  diagnostic.option_index = opt;
@@ -102,7 +104,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
      specific than -pedantic.  */
   if (warn_c90_c99_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc99)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc90_c99_compat;
@@ -114,7 +116,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C99, issue a pedwarn.  */
   else if (pedantic && !flag_isoc99)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       report_diagnostic (&diagnostic);
     }
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 47fd7de..1e601f9 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -101,7 +101,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     {
       t = va_arg (*text->args_ptr, tree);
       if (set_locus)
-	text->set_location (0, DECL_SOURCE_LOCATION (t));
+	text->set_location (0, DECL_SOURCE_LOCATION (t), true);
     }
 
   switch (*spec)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index faf8744..19ca8c3 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3554,7 +3554,7 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 
   pp_string (pp, result);
   if (set_locus && t != NULL)
-    text->set_location (0, location_of (t));
+    text->set_location (0, location_of (t), true);
   return true;
 #undef next_tree
 #undef next_tcode
@@ -3668,9 +3668,10 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 		       (cxx_dialect == cxx98) ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index 3fe49b2..d848dfc 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -164,7 +164,8 @@ static struct color_cap color_dict[] =
   { "warning", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_MAGENTA),
 	       7, false },
   { "note", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_CYAN), 4, false },
-  { "caret", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "range1", SGR_SEQ (COLOR_FG_GREEN), 6, false },
+  { "range2", SGR_SEQ (COLOR_FG_BLUE), 6, false },
   { "locus", SGR_SEQ (COLOR_BOLD), 5, false },
   { "quote", SGR_SEQ (COLOR_BOLD), 5, false },
   { NULL, NULL, 0, false }
@@ -195,7 +196,7 @@ colorize_stop (bool show_color)
 }
 
 /* Parse GCC_COLORS.  The default would look like:
-   GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
+   GCC_COLORS='error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:quote=01'
    No character escaping is needed or supported.  */
 static bool
 parse_gcc_colors (void)
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 66d2e42..a8a7c37 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -63,18 +63,26 @@ extern bool warning_n (location_t, int, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
+extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+    ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void error_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void error_at_rich_loc (rich_location *, const char *, ...)
+  ATTRIBUTE_GCC_DIAG(2,3);
 extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
      ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the second parameter.  */
 extern bool pedwarn (location_t, int, const char *, ...)
      ATTRIBUTE_GCC_DIAG(3,4);
 extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern bool permerror_at_rich_loc (rich_location *, const char *,
+				   ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform_at_rich_loc (rich_location *, const char *,
+				...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 147a2b8..c3a941d 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,6 +36,13 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset);
+
+static void
+diagnostic_print_ranges (diagnostic_context * context,
+			 const diagnostic_info *diagnostic);
+
 /* If LINE is longer than MAX_WIDTH, and COLUMN is not smaller than
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
@@ -60,11 +67,663 @@ adjust_line (const char *line, int line_width,
   return line;
 }
 
-/* Print the physical source line corresponding to the location of
-   this diagnostic, and a caret indicating the precise column.  This
-   function only prints two caret characters if the two locations
-   given by DIAGNOSTIC are on the same line according to
-   diagnostic_same_line().  */
+/* Classes for rendering source code and diagnostics, within an
+   anonymous namespace.
+   The work is done by "class layout", which embeds and uses
+   "class colorizer" and "class layout_range" to get things done.  */
+
+namespace {
+
+/* The state at a given point of the source code, assuming that we're
+   in a range: which range are we in, and whether we should draw a caret at
+   this point.  */
+
+struct point_state
+{
+  int range_idx;
+  bool draw_caret_p;
+};
+
+/* A class to inject colorization codes when printing the diagnostic locus.
+
+   It has one kind of colorization for each of:
+     - normal text
+     - range 0 (the "primary location")
+     - range 1
+     - range 2
+
+   The class caches the lookup of the color codes for the above.
+
+   The class also has responsibility for tracking which of the above is
+   active, filtering out unnecessary changes.  This allows layout::print_line
+   to simply request a colorization code for *every* character it prints
+   through this class, and have the filtering be done for it here.  */
+
+class colorizer
+{
+ public:
+  colorizer (diagnostic_context *context,
+	     const diagnostic_info *diagnostic);
+  ~colorizer ();
+
+  void set_range (int range_idx) { set_state (range_idx); }
+  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }
+
+ private:
+  void set_state (int state);
+  void begin_state (int state);
+  void finish_state (int state);
+
+ private:
+  static const int STATE_NORMAL_TEXT = -1;
+
+  diagnostic_context *m_context;
+  const diagnostic_info *m_diagnostic;
+  int m_current_state;
+  const char *m_caret_cs;
+  const char *m_caret_ce;
+  const char *m_range1_cs;
+  const char *m_range2_cs;
+  const char *m_range_ce;
+};
+
+/* A point within a layout_range; similar to an expanded_location,
+   but after filtering on file.  */
+
+class layout_point
+{
+ public:
+  layout_point (const expanded_location &exploc)
+  : m_line (exploc.line),
+    m_column (exploc.column) {}
+
+  int m_line;
+  int m_column;
+};
+
+/* A class for use by "class layout" below: a filtered location_range.  */
+
+class layout_range
+{
+ public:
+  layout_range (const location_range *loc_range);
+
+  bool contains_point (int row, int column) const;
+
+  layout_point m_start;
+  layout_point m_finish;
+  bool m_show_caret_p;
+  layout_point m_caret;
+};
+
+/* A class to control the overall layout when printing a diagnostic.
+
+   The layout is determined within the constructor.
+   It is then printed by repeatedly calling the "print_line" method.
+   Each such call can print two lines: one for the source line itself,
+   and potentially an "annotation" line, containing carets/underlines.
+
+   We assume we have disjoint ranges.  */
+
+class layout
+{
+ public:
+  layout (diagnostic_context *context,
+	  const diagnostic_info *diagnostic);
+
+  int get_first_line () const { return m_first_line; }
+  int get_last_line () const { return m_last_line; }
+
+  void print_line (int row);
+
+ private:
+  bool
+  get_state_at_point (/* Inputs.  */
+		      int row, int column,
+		      int first_non_ws, int last_non_ws,
+		      /* Outputs.  */
+		      point_state *out_state);
+
+  int
+  get_x_bound_for_row (int row, int caret_column,
+		       int last_non_ws);
+
+ private:
+  diagnostic_context *m_context;
+  pretty_printer *m_pp;
+  diagnostic_t m_diagnostic_kind;
+  expanded_location m_exploc;
+  colorizer m_colorizer;
+  auto_vec <layout_range> m_layout_ranges;
+  int m_first_line;
+  int m_last_line;
+  int m_x_offset;
+};
+
+/* Implementation of "class colorizer".  */
+
+/* The constructor for "colorizer".  Look up and store color codes for the
+   different kinds of things we might need to print.  */
+
+colorizer::colorizer (diagnostic_context *context,
+		      const diagnostic_info *diagnostic) :
+  m_context (context),
+  m_diagnostic (diagnostic),
+  m_current_state (STATE_NORMAL_TEXT)
+{
+  m_caret_ce = colorize_stop (pp_show_color (context->printer));
+  m_range1_cs = colorize_start (pp_show_color (context->printer), "range1");
+  m_range2_cs = colorize_start (pp_show_color (context->printer), "range2");
+  m_range_ce = colorize_stop (pp_show_color (context->printer));
+}
+
+/* The destructor for "colorizer".  If colorization is on, print a code to
+   turn it off.  */
+
+colorizer::~colorizer ()
+{
+  finish_state (m_current_state);
+}
+
+/* Update state, printing color codes only if the state actually
+   changes.  */
+
+void
+colorizer::set_state (int new_state)
+{
+  if (m_current_state != new_state)
+    {
+      finish_state (m_current_state);
+      m_current_state = new_state;
+      begin_state (new_state);
+    }
+}
+
+/* Turn on any colorization for STATE.  */
+
+void
+colorizer::begin_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      /* Make range 0 be the same color as the "kind" text
+	 (error vs warning vs note).  */
+      pp_string
+	(m_context->printer,
+	 colorize_start (pp_show_color (m_context->printer),
+			 diagnostic_get_color_for_kind (m_diagnostic->kind)));
+      break;
+
+    case 1:
+      pp_string (m_context->printer, m_range1_cs);
+      break;
+
+    case 2:
+      pp_string (m_context->printer, m_range2_cs);
+      break;
+
+    default:
+      /* We don't expect more than 3 ranges per diagnostic.  */
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Turn off any colorization for STATE.  */
+
+void
+colorizer::finish_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      pp_string (m_context->printer, m_caret_ce);
+      break;
+
+    default:
+      /* Within a range.  */
+      gcc_assert (state > 0);
+      pp_string (m_context->printer, m_range_ce);
+      break;
+    }
+}
+
+/* Implementation of class layout_range.  */
+
+/* The constructor for class layout_range.
+   Initialize various layout_point fields from expanded_location
+   equivalents; we've already filtered on file.  */
+
+layout_range::layout_range (const location_range *loc_range)
+: m_start (loc_range->m_start),
+  m_finish (loc_range->m_finish),
+  m_show_caret_p (loc_range->m_show_caret_p),
+  m_caret (loc_range->m_caret)
+{
+}
+
+/* Is (column, row) within the given range?
+   We've already filtered on the file.
+
+   Ranges are closed (both limits are within the range).
+
+   Example A: a single-line range:
+     start:  (col=22, line=2)
+     finish: (col=38, line=2)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbSwwwwwwwwwwwwwwwFaaaaaaaaaaa
+03|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+   Example B: a multiline range with
+     start:  (col=14, line=3)
+     finish: (col=08, line=5)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+03|bbbbbbbbbbbSwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+04|wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+05|wwwwwFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+06|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+--+-----------------------------------------------
+
+   Legend:
+   - 'b' indicates a point *before* the range
+   - 'S' indicates the start of the range
+   - 'w' indicates a point within the range
+   - 'F' indicates the finish of the range (which is
+	 within it).
+   - 'a' indicates a subsequent point *after* the range.  */
+
+bool
+layout_range::contains_point (int row, int column) const
+{
+  gcc_assert (m_start.m_line <= m_finish.m_line);
+  /* ...but the equivalent isn't true for the columns;
+     consider example B in the comment above.  */
+
+  if (row < m_start.m_line)
+    /* Points before the first line of the range are
+       outside it (corresponding to line 01 in example A
+       and lines 01 and 02 in example B above).  */
+    return false;
+
+  if (row == m_start.m_line)
+    /* On same line as start of range (corresponding
+       to line 02 in example A and line 03 in example B).  */
+    {
+      if (column < m_start.m_column)
+	/* Points on the starting line of the range, but
+	   before the column in which it begins.  */
+	return false;
+
+      if (row < m_finish.m_line)
+	/* This is a multiline range; the point
+	   is within it (corresponds to line 03 in example B
+	   from column 14 onwards).  */
+	return true;
+      else
+	{
+	  /* This is a single-line range.  */
+	  gcc_assert (row == m_finish.m_line);
+	  return column <= m_finish.m_column;
+	}
+    }
+
+  /* The point is in a line beyond that containing the
+     start of the range: lines 03 onwards in example A,
+     and lines 04 onwards in example B.  */
+  gcc_assert (row > m_start.m_line);
+
+  if (row > m_finish.m_line)
+    /* The point is beyond the final line of the range
+       (lines 03 onwards in example A, and lines 06 onwards
+       in example B).  */
+    return false;
+
+  if (row < m_finish.m_line)
+    {
+      /* The point is in a line that's fully within a multiline
+	 range (e.g. line 04 in example B).  */
+      gcc_assert (m_start.m_line < m_finish.m_line);
+      return true;
+    }
+
+  gcc_assert (row == m_finish.m_line);
+
+  return column <= m_finish.m_column;
+}
+
+/* Given a source line LINE of length LINE_WIDTH, determine the width
+   without any trailing whitespace.  */
+
+static int
+get_line_width_without_trailing_whitespace (const char *line, int line_width)
+{
+  int result = line_width;
+  while (result > 0)
+    {
+      char ch = line[result - 1];
+      if (ch == ' ' || ch == '\t')
+	result--;
+      else
+	break;
+    }
+  gcc_assert (result >= 0);
+  gcc_assert (result <= line_width);
+  gcc_assert (result == 0
+	      || (line[result - 1] != ' '
+		  && line[result - 1] != '\t'));
+  return result;
+}
+
+/* Implementation of class layout.  */
+
+/* Constructor for class layout.
+
+   Filter the ranges from the rich_location to those that we can
+   sanely print, populating m_layout_ranges.
+   Determine the range of lines that we will print.
+   Determine m_x_offset, to ensure that the primary caret
+   will fit within the max_width provided by the diagnostic_context.  */
+
+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)
+: m_context (context),
+  m_pp (context->printer),
+  m_diagnostic_kind (diagnostic->kind),
+  m_exploc (diagnostic->richloc->lazily_expand_location ()),
+  m_colorizer (context, diagnostic),
+  m_layout_ranges (rich_location::MAX_RANGES),
+  m_first_line (m_exploc.line),
+  m_last_line (m_exploc.line),
+  m_x_offset (0)
+{
+  rich_location *richloc = diagnostic->richloc;
+  for (unsigned int idx = 0; idx < richloc->get_num_locations (); idx++)
+    {
+      /* This diagnostic printer can only cope with "sufficiently sane" ranges.
+	 Ignore any ranges that are awkward to handle.  */
+      location_range *loc_range = richloc->get_range (idx);
+
+      /* If any part of the range isn't in the same file as the primary
+	 location of this diagnostic, ignore the range.  */
+      if (loc_range->m_start.file != m_exploc.file)
+	continue;
+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_caret.file != m_exploc.file)
+	  continue;
+
+      /* Passed all the tests; add the range to m_layout_ranges so that
+	 it will be printed.  */
+      layout_range ri (loc_range);
+      m_layout_ranges.safe_push (ri);
+
+      /* Update m_first_line/m_last_line if necessary.  */
+      if (loc_range->m_start.line < m_first_line)
+	m_first_line = loc_range->m_start.line;
+      if (loc_range->m_finish.line > m_last_line)
+	m_last_line = loc_range->m_finish.line;
+    }
+
+  /* Adjust m_x_offset.
+     Center the primary caret to fit in max_width; all columns
+     will be adjusted accordingly.  */
+  int max_width = m_context->caret_max_width;
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, m_exploc.line,
+					       &line_width);
+  if (line && m_exploc.column <= line_width)
+    {
+      int right_margin = CARET_LINE_MARGIN;
+      int column = m_exploc.column;
+      right_margin = MIN (line_width - column, right_margin);
+      right_margin = max_width - right_margin;
+      if (line_width >= max_width && column > right_margin)
+	m_x_offset = column - right_margin;
+      gcc_assert (m_x_offset >= 0);
+    }
+
+  if (0)
+    show_ruler (context, line_width, m_x_offset);
+}
+
+/* Print text describing a line of source code.
+   This typically prints two lines:
+
+   (1) the source code itself, colorized at any ranges, and
+   (2) an annotation line containing any carets/underlines
+   describing the ranges.  */
+
+void
+layout::print_line (int row)
+{
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, row,
+					       &line_width);
+  if (!line)
+    return;
+
+  m_colorizer.set_normal_text ();
+
+  /* Step 1: print the source code line.  */
+
+  /* We will stop printing at any trailing whitespace.  Measure the full
+     line before skipping the first m_x_offset characters.  */
+  line_width
+    = get_line_width_without_trailing_whitespace (line,
+						  line_width);
+  line += m_x_offset;
+  pp_space (m_pp);
+  int first_non_ws = INT_MAX;
+  int last_non_ws = 0;
+  int column;
+  for (column = 1 + m_x_offset; column <= line_width; column++)
+    {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       0, INT_MAX,
+				       &state);
+      if (in_range_p)
+	m_colorizer.set_range (state.range_idx);
+      else
+	m_colorizer.set_normal_text ();
+      char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
+      if (c != ' ')
+	{
+	  last_non_ws = column;
+	  if (first_non_ws == INT_MAX)
+	    first_non_ws = column;
+	}
+      pp_character (m_pp, c);
+      line++;
+    }
+  pp_newline (m_pp);
+
+  /* Step 2: print a line consisting of the caret/underlines for the
+     given source line.  */
+  int x_bound = get_x_bound_for_row (row, m_exploc.column,
+				     last_non_ws);
+
+  pp_space (m_pp);
+  for (int column = 1 + m_x_offset; column < x_bound; column++)
+    {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       first_non_ws, last_non_ws,
+				       &state);
+      if (in_range_p)
+	{
+	  /* Within a range.  Draw either the caret or an underline.  */
+	  m_colorizer.set_range (state.range_idx);
+	  if (state.draw_caret_p)
+	    /* Draw the caret.  */
+	    pp_character (m_pp, m_context->caret_chars[state.range_idx]);
+	  else
+	    pp_character (m_pp, '~');
+	}
+      else
+	{
+	  /* Not in a range.  */
+	  m_colorizer.set_normal_text ();
+	  pp_character (m_pp, ' ');
+	}
+    }
+  pp_newline (m_pp);
+}
+
+/* Return true if (ROW/COLUMN) is within a range of the layout.
+   If it returns true, OUT_STATE is written to, with the
+   range index, and whether we should draw the caret at
+   (ROW/COLUMN) (as opposed to an underline).  */
+
+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    point_state *out_state)
+{
+  /* Within a multiline range, don't display any underline or caret
+     in any leading or trailing whitespace on a line.  */
+  if (column < first_non_ws || column > last_non_ws)
+    return false;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (0)
+	fprintf (stderr,
+		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
+		 range->m_start.m_line,
+		 range->m_start.m_column,
+		 range->m_finish.m_line,
+		 range->m_finish.m_column,
+		 row,
+		 column,
+		 range->contains_point (row, column) ? "true" : "false");
+
+      if (range->contains_point (row, column))
+	{
+	  out_state->range_idx = i;
+
+	  /* Are we at the range's caret?  Is it visible?  */
+	  out_state->draw_caret_p = false;
+	  if (row == range->m_caret.m_line
+	      && column == range->m_caret.m_column)
+	    out_state->draw_caret_p = range->m_show_caret_p;
+
+	  /* We are within a range.  */
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Helper function for use by layout::print_line when printing the
+   annotation line under the source line.
+   Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.
+   ROW is the source line within the given file.
+   CARET_COLUMN is the column of range 0's caret.
+   LAST_NON_WS_COLUMN is the last column containing a non-whitespace
+   character of source (as determined when printing the source line).  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws_column)
+{
+  int result = caret_column + 1;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (row >= range->m_start.m_line)
+	{
+	  if (range->m_finish.m_line == row)
+	    {
+	      /* On the final line within a range; ensure that
+		 we render up to the end of the range.  */
+	      if (result <= range->m_finish.m_column)
+		result = range->m_finish.m_column + 1;
+	    }
+	  else if (row < range->m_finish.m_line)
+	    {
+	      /* Within a multiline range; ensure that we render up to the
+		 last non-whitespace column.  */
+	      if (result <= last_non_ws_column)
+		result = last_non_ws_column + 1;
+	    }
+	}
+    }
+
+  return result;
+}
+
+} /* End of anonymous namespace.  */
+
+/* For debugging layout issues in diagnostic_show_locus and friends,
+   render a ruler giving column numbers (after the 1-column indent).  */
+
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset)
+{
+  /* Hundreds.  */
+  if (max_width > 99)
+    {
+      pp_space (context->printer);
+      for (int column = 1 + x_offset; column < max_width; column++)
+	if (0 == column % 10)
+	  pp_character (context->printer, '0' + (column / 100) % 10);
+	else
+	  pp_space (context->printer);
+      pp_newline (context->printer);
+    }
+
+  /* Tens.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    if (0 == column % 10)
+      pp_character (context->printer, '0' + (column / 10) % 10);
+    else
+      pp_space (context->printer);
+  pp_newline (context->printer);
+
+  /* Units.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    pp_character (context->printer, '0' + (column % 10));
+  pp_newline (context->printer);
+}
+
+/* Print the physical source code corresponding to the location of
+   this diagnostic, with additional annotations.
+   If CONTEXT has set frontend_calls_diagnostic_print_caret_line_p,
+   the code is printed using diagnostic_print_caret_line; otherwise
+   it is printed using diagnostic_print_ranges.  */
+
 void
 diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
@@ -75,16 +734,25 @@ diagnostic_show_locus (diagnostic_context * context,
     return;
 
   context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by diagnostic_print_caret_line.  */
 
-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+  if (context->frontend_calls_diagnostic_print_caret_line_p)
+    {
+      /* The GCC < 6 routine.  */
+      expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
+      expanded_location s1 = { };
+      /* Zero-initialized.  This is checked later by
+	 diagnostic_print_caret_line.  */
+
+      if (diagnostic_num_locations (diagnostic) >= 2)
+	s1 = diagnostic->message.m_richloc->get_range (1)->m_start;
 
-  diagnostic_print_caret_line (context, s0, s1,
-			       context->caret_chars[0],
-			       context->caret_chars[1]);
+      diagnostic_print_caret_line (context, s0, s1,
+				   context->caret_chars[0],
+				   context->caret_chars[1]);
+    }
+  else
+    /* The GCC >= 6 routine.  */
+    diagnostic_print_ranges (context, diagnostic);
 }
 
 /* Print (part) of the source line given by xloc1 with caret1 pointing
@@ -164,3 +832,33 @@ diagnostic_print_caret_line (diagnostic_context * context,
   pp_set_prefix (context->printer, saved_prefix);
   pp_needs_newline (context->printer) = true;
 }
+
+/* Print all source lines covered by the locations and any ranges
+   within DIAGNOSTIC, displaying one or more carets and zero or more
+   underlines as appropriate.  */
+
+static void
+diagnostic_print_ranges (diagnostic_context * context,
+			 const diagnostic_info *diagnostic)
+{
+  pp_newline (context->printer);
+
+  const char *saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer, NULL);
+
+  {
+    layout layout (context, diagnostic);
+    int last_line = layout.get_last_line ();
+    for (int row = layout.get_first_line ();
+	 row <= last_line;
+	 row++)
+      layout.print_line (row);
+
+    /* The closing scope here leads to the dtor for layout and thus
+       colorizer being called here, which affects the precise
+       place where colorization is turned off in the unittest
+       for colorized output.  */
+  }
+
+  pp_set_prefix (context->printer, saved_prefix);
+}
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 831859a..5fe6627 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -144,7 +144,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
     context->classify_diagnostic[i] = DK_UNSPECIFIED;
   context->show_caret = false;
   diagnostic_set_caret_max_width (context, pp_line_cutoff (context->printer));
-  for (i = 0; i < MAX_LOCATIONS_PER_MESSAGE; i++)
+  for (i = 0; i < rich_location::MAX_RANGES; i++)
     context->caret_chars[i] = '^';
   context->show_option_requested = false;
   context->abort_on_error = false;
@@ -234,16 +234,15 @@ diagnostic_finish (diagnostic_context *context)
    translated.  */
 void
 diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
-				va_list *args, location_t location,
+				va_list *args, rich_location *richloc,
 				diagnostic_t kind)
 {
+  gcc_assert (richloc);
   diagnostic->message.err_no = errno;
   diagnostic->message.args_ptr = args;
   diagnostic->message.format_spec = msg;
-  diagnostic->message.set_location (0, location);
-  for (int i = 1; i < MAX_LOCATIONS_PER_MESSAGE; i++)
-    diagnostic->message.set_location (i, UNKNOWN_LOCATION);
-  diagnostic->override_column = 0;
+  diagnostic->message.m_richloc = richloc;
+  diagnostic->richloc = richloc;
   diagnostic->kind = kind;
   diagnostic->option_index = 0;
 }
@@ -252,10 +251,27 @@ diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
    translated.  */
 void
 diagnostic_set_info (diagnostic_info *diagnostic, const char *gmsgid,
-		     va_list *args, location_t location,
+		     va_list *args, rich_location *richloc,
 		     diagnostic_t kind)
 {
-  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, location, kind);
+  gcc_assert (richloc);
+  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, richloc, kind);
+}
+
+static const char *const diagnostic_kind_color[] = {
+#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
+#include "diagnostic.def"
+#undef DEFINE_DIAGNOSTIC_KIND
+  NULL
+};
+
+/* Get a color name for diagnostics of type KIND.
+   The result could be NULL.  */
+
+const char *
+diagnostic_get_color_for_kind (diagnostic_t kind)
+{
+  return diagnostic_kind_color[kind];
 }
 
 /* Return a malloc'd string describing a location.  The caller is
@@ -270,12 +286,6 @@ diagnostic_build_prefix (diagnostic_context *context,
 #undef DEFINE_DIAGNOSTIC_KIND
     "must-not-happen"
   };
-  static const char *const diagnostic_kind_color[] = {
-#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
-#include "diagnostic.def"
-#undef DEFINE_DIAGNOSTIC_KIND
-    NULL
-  };
   gcc_assert (diagnostic->kind < DK_LAST_DIAGNOSTIC_KIND);
 
   const char *text = _(diagnostic_kind_text[diagnostic->kind]);
@@ -771,10 +781,14 @@ diagnostic_report_diagnostic (diagnostic_context *context,
 
       if (option_text)
 	{
+	  const char *cs
+	    = colorize_start (pp_show_color (context->printer),
+			      diagnostic_kind_color[diagnostic->kind]);
+	  const char *ce = colorize_stop (pp_show_color (context->printer));
 	  diagnostic->message.format_spec
 	    = ACONCAT ((diagnostic->message.format_spec,
 			" ", 
-			"[", option_text, "]",
+			"[", cs, option_text, ce, "]",
 			NULL));
 	  free (option_text);
 	}
@@ -854,9 +868,40 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  if (context->inhibit_notes_p)
+    {
+      va_end (ap);
+      return;
+    }
+  saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer,
+                 diagnostic_build_prefix (context, &diagnostic));
+  pp_newline (context->printer);
+  pp_format (context->printer, &diagnostic.message);
+  pp_output_formatted_text (context->printer);
+  pp_destroy_prefix (context->printer);
+  pp_set_prefix (context->printer, saved_prefix);
+  diagnostic_show_locus (context, &diagnostic);
+  va_end (ap);
+}
+
+/* Same as diagnostic_append_note, but at RICHLOC.  */
+
+void
+diagnostic_append_note_at_rich_loc (diagnostic_context *context,
+				    rich_location *richloc,
+				    const char * gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  const char *saved_prefix;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   if (context->inhibit_notes_p)
     {
       va_end (ap);
@@ -881,16 +926,17 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   permissive_error_kind (global_dc));
       diagnostic.option_index = permissive_error_option (global_dc);
     }
   else {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, kind);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, kind);
       if (kind == DK_WARNING || kind == DK_PEDWARN)
 	diagnostic.option_index = opt;
   }
@@ -907,9 +953,23 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as "inform", but at RICHLOC.  */
+void
+inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -922,11 +982,12 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_NOTE);
+                                  &ap, &richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -940,9 +1001,10 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
   diagnostic.option_index = opt;
 
   ret = report_diagnostic (&diagnostic);
@@ -960,9 +1022,27 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as warning_at, but using RICHLOC.  */
+
+bool
+warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -980,11 +1060,13 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_WARNING);
+                                  &ap, &richloc, DK_WARNING
+);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1010,9 +1092,10 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,  DK_PEDWARN);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1032,9 +1115,28 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
+                       permissive_error_kind (global_dc));
+  diagnostic.option_index = permissive_error_option (global_dc);
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "permerror", but at RICHLOC.  */
+
+bool
+permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc,
                        permissive_error_kind (global_dc));
   diagnostic.option_index = permissive_error_option (global_dc);
   ret = report_diagnostic (&diagnostic);
@@ -1049,9 +1151,10 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1064,11 +1167,12 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_ERROR);
+                                  &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1079,9 +1183,25 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as above, but use RICH_LOC.  */
+
+void
+error_at_rich_loc (rich_location *rich_loc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, rich_loc,
+		       DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1094,9 +1214,10 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_SORRY);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1117,9 +1238,10 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1135,9 +1257,10 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1152,9 +1275,10 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE_NOBT);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1218,3 +1342,11 @@ real_abort (void)
 {
   abort ();
 }
+
+void
+source_range::debug (const char *msg) const
+{
+  rich_location richloc (m_start);
+  richloc.add_range (m_start, m_finish);
+  inform_at_rich_loc (&richloc, "%s", msg);
+}
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 7fcb6a8..66a867c 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
    list in diagnostic.def.  */
 struct diagnostic_info
 {
-  /* Text to be formatted. It also contains the location(s) for this
-     diagnostic.  */
+  /* Text to be formatted.  */
   text_info message;
-  unsigned int override_column;
+
+  /* The location at which the diagnostic is to be reported.  */
+  rich_location *richloc;
+
   /* Auxiliary data for client.  */
   void *x_data;
   /* The kind of diagnostic it is about.  */
@@ -102,8 +104,8 @@ struct diagnostic_context
   /* Maximum width of the source line printed.  */
   int caret_max_width;
 
-  /* Characters used for caret diagnostics.  */
-  char caret_chars[MAX_LOCATIONS_PER_MESSAGE];
+  /* Characters used for caret diagnostics, one per range.  */
+  char caret_chars[rich_location::MAX_RANGES];
 
   /* True if we should print the command line option which controls
      each diagnostic, if known.  */
@@ -181,6 +183,11 @@ struct diagnostic_context
   int lock;
 
   bool inhibit_notes_p;
+
+  /* Does the frontend make calls to diagnostic_print_caret_line?
+     If so, we fall back to the old implementation of
+     diagnostic_show_locus.  */
+  bool frontend_calls_diagnostic_print_caret_line_p;
 };
 
 static inline void
@@ -252,10 +259,6 @@ extern diagnostic_context *global_dc;
 
 #define report_diagnostic(D) diagnostic_report_diagnostic (global_dc, D)
 
-/* Override the column number to be used for reporting a
-   diagnostic.  */
-#define diagnostic_override_column(DI, COL) (DI)->override_column = (COL)
-
 /* Override the option index to be used for reporting a
    diagnostic.  */
 #define diagnostic_override_option_index(DI, OPTIDX) \
@@ -279,13 +282,17 @@ extern bool diagnostic_report_diagnostic (diagnostic_context *,
 					  diagnostic_info *);
 #ifdef ATTRIBUTE_GCC_DIAG
 extern void diagnostic_set_info (diagnostic_info *, const char *, va_list *,
-				 location_t, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
+				 rich_location *, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_set_info_translated (diagnostic_info *, const char *,
-					    va_list *, location_t,
+					    va_list *, rich_location *,
 					    diagnostic_t)
      ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_append_note (diagnostic_context *, location_t,
                                     const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern void diagnostic_append_note_at_rich_loc (diagnostic_context *,
+						rich_location *,
+						const char *, ...)
+  ATTRIBUTE_GCC_DIAG(3,4);
 #endif
 extern char *diagnostic_build_prefix (diagnostic_context *, const diagnostic_info *);
 void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
@@ -306,6 +313,14 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
   return diagnostic->message.get_location (which);
 }
 
+/* Return the number of locations to be printed in DIAGNOSTIC.  */
+
+static inline unsigned int
+diagnostic_num_locations (const diagnostic_info * diagnostic)
+{
+  return diagnostic->message.m_richloc->get_num_locations ();
+}
+
 /* Expand the location of this diagnostic. Use this function for
    consistency.  Parameter WHICH specifies which location. By default,
    expand the first one.  */
@@ -313,12 +328,7 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
 static inline expanded_location
 diagnostic_expand_location (const diagnostic_info * diagnostic, int which = 0)
 {
-  expanded_location s
-    = expand_location_to_spelling_point (diagnostic_location (diagnostic,
-							      which));
-  if (which == 0 && diagnostic->override_column)
-    s.column = diagnostic->override_column;
-  return s;
+  return diagnostic->richloc->get_range (which)->m_caret;
 }
 
 /* This is somehow the right-side margin of a caret line, that is, we
@@ -344,6 +354,10 @@ diagnostic_print_caret_line (diagnostic_context * context,
 			     expanded_location xloc2,
 			     char caret1, char caret2);
 
+
+extern const char *
+diagnostic_get_color_for_kind (diagnostic_t kind);
+
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
 
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index daffc20..92dc584 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -149,9 +149,9 @@ static void cb_include (cpp_reader *, source_location, const unsigned char *,
 static void cb_ident (cpp_reader *, source_location, const cpp_string *);
 static void cb_used_define (cpp_reader *, source_location, cpp_hashnode *);
 static void cb_used_undef (cpp_reader *, source_location, cpp_hashnode *);
-static bool cb_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+static bool cb_cpp_error (cpp_reader *, int, int, rich_location *,
 			  const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 void pp_dir_change (cpp_reader *, const char *);
 
 static int dump_macro (cpp_reader *, cpp_hashnode *, void *);
@@ -1026,13 +1026,12 @@ cb_used_define (cpp_reader *pfile, source_location line ATTRIBUTE_UNUSED,
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION, with column number possibly overridden by COLUMN_OVERRIDE
-   if not zero; MSG is the translated message and AP the arguments.
+   RICHLOC; MSG is the translated message and AP the arguments.
    Returns true if a diagnostic was emitted, false otherwise.  */
 
 static bool
 cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	      location_t location, unsigned int column_override,
+	      rich_location *richloc,
 	      const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -1067,9 +1066,7 @@ cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   if (reason == CPP_W_WARNING_DIRECTIVE)
     diagnostic_override_option_index (&diagnostic, OPT_Wcpp);
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 3825751..3d9deb0 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,6 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -787,7 +788,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
       --werrorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   bool ret = report_diagnostic (&diagnostic);
@@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
 	/* If location[0] != UNKNOWN_LOCATION means that we already
 	   processed one of %C/%L.  */
 	int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
-	text->set_location (loc_num,
-			    linemap_position_for_loc_and_offset (line_table,
-								 loc->lb->location,
-								 offset));
+	source_range range
+	  = source_range::from_location (
+	      linemap_position_for_loc_and_offset (line_table,
+						   loc->lb->location,
+						   offset));
+	text->set_range (loc_num, range, true);
 	pp_string (pp, result[loc_num]);
 	return true;
       }
@@ -1075,7 +1078,7 @@ gfc_diagnostic_starter (diagnostic_context *context,
 
   expanded_location s1 = diagnostic_expand_location (diagnostic);
   expanded_location s2;
-  bool one_locus = diagnostic_location (diagnostic, 1) == UNKNOWN_LOCATION;
+  bool one_locus = diagnostic->richloc->get_num_locations () < 2;
   bool same_locus = false;
 
   if (!one_locus) 
@@ -1173,10 +1176,11 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (loc);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, loc, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (argp);
@@ -1190,10 +1194,11 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
@@ -1209,11 +1214,12 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (argp);
 }
@@ -1226,9 +1232,10 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1291,6 +1298,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
+  rich_location richloc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1306,7 +1314,7 @@ gfc_error (const char *gmsgid, va_list ap)
       --errorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
 
   if (buffered_p)
@@ -1336,9 +1344,10 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1472,6 +1481,7 @@ gfc_diagnostics_init (void)
   diagnostic_format_decoder (global_dc) = gfc_format_decoder;
   global_dc->caret_chars[0] = '1';
   global_dc->caret_chars[1] = '2';
+  global_dc->frontend_calls_diagnostic_print_caret_line_p = true;
   pp_warning_buffer = new (XNEW (output_buffer)) output_buffer ();
   pp_warning_buffer->flush_p = false;
   /* pp_error_buffer is statically allocated.  This simplifies memory
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 102a635..6bfde06 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -53,14 +53,23 @@ unsigned verbose;
 
 static struct line_maps *line_table;
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  const struct line_map_ordinary *map;
+  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
+  return linemap_expand_location (line_table, map, loc);
+}
+
 static bool
 #if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
+__attribute__((format (printf, 5, 0)))
 #endif
-error_cb (cpp_reader *, int errtype, int, source_location location,
-	  unsigned int, const char *msg, va_list *ap)
+error_cb (cpp_reader *, int errtype, int, rich_location *richloc,
+	  const char *msg, va_list *ap)
 {
   const line_map_ordinary *map;
+  source_location location = richloc->get_loc ();
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
   fprintf (stderr, "%s:%d:%d %s: ", loc.file, loc.line, loc.column,
@@ -102,9 +111,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -114,9 +124,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -126,9 +137,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -138,9 +150,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
diff --git a/gcc/input.c b/gcc/input.c
index e7302a4..bdba20f 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
   return expand_location_1 (loc, /*expansion_point_p=*/false);
 }
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_to_spelling_point (loc);
+}
+
+
 /* If LOCATION is in a system header and if it is a virtual location for
    a token coming from the expansion of a macro, unwind it to the
    location of the expansion point of the macro.  Otherwise, just return
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index fdc7b4d..fe50df8 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -31,6 +31,27 @@ along with GCC; see the file COPYING3.  If not see
 #include <iconv.h>
 #endif
 
+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */
+
+void
+text_info::set_range (unsigned int idx, source_range range, bool caret_p)
+{
+  gcc_checking_assert (m_richloc);
+  m_richloc->set_range (idx, range, caret_p, true);
+}
+
+location_t
+text_info::get_location (unsigned int index_of_location) const
+{
+  gcc_checking_assert (m_richloc);
+
+  if (index_of_location == 0)
+    return m_richloc->get_loc ();
+  else
+    return UNKNOWN_LOCATION;
+}
+
 // Default construct an output buffer.
 
 output_buffer::output_buffer ()
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 36d4e37..d10272c 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -27,11 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Maximum number of format string arguments.  */
 #define PP_NL_ARGMAX   30
 
-/* Maximum number of locations associated to each message.  If
-   location 'i' is UNKNOWN_LOCATION, then location 'i+1' is not
-   valid.  */
-#define MAX_LOCATIONS_PER_MESSAGE 2
-
 /* The type of a text to be formatted according a format specification
    along with a list of things.  */
 struct text_info
@@ -40,21 +35,17 @@ struct text_info
   va_list *args_ptr;
   int err_no;  /* for %m */
   void **x_data;
+  rich_location *m_richloc;
 
-  inline void set_location (unsigned int index_of_location, location_t loc)
+  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
   {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    this->locations[index_of_location] = loc;
+    source_range src_range;
+    src_range.m_start = loc;
+    src_range.m_finish = loc;
+    set_range (idx, src_range, caret_p);
   }
-
-  inline location_t get_location (unsigned int index_of_location) const
-  {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    return this->locations[index_of_location];
-  }
-
-private:
-  location_t locations[MAX_LOCATIONS_PER_MESSAGE];
+  void set_range (unsigned int idx, source_range range, bool caret_p);
+  location_t get_location (unsigned int index_of_location) const;
 };
 
 /* How often diagnostics are prefixed by their locations:
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 8b9b391..d28be1d 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,9 +69,10 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
+  rich_location richloc (location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
-		       location_for_asm (insn), kind);
+		       &richloc, kind);
   report_diagnostic (&diagnostic);
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
new file mode 100644
index 0000000..ab74a11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
@@ -0,0 +1,135 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = myvar.x;
+           ~~~~~^~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = first_function () + second_function ();
+       ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function ()
+        ~~~~~~~~~~~~~~~~~
+        + second_function ());
+        ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             consectetur, adipiscing, elit,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             sed, eiusmod, tempor,
+                                             ~~~~~~~~~~~~~~~~~~~~~
+                                             incididunt, ut, labore, et,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             dolore, magna, aliqua)
+                                             ~~~~~~~~~~~~~~~~~~~~~~
+        + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 amet, consectetur,
+                                                 ~~~~~~~~~~~~~~~~~~
+                                                 adipiscing, elit, sed,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~
+                                                 eiusmod, tempor, incididunt,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 ut, labore, et, dolore,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~
+                                                 magna, aliqua));
+                                                 ~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = 98.6f;
+             ^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = foo * bar;
+             ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = foo * bar;
+                                                        ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = x + y
+        A   B
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
new file mode 100644
index 0000000..6789a47
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
@@ -0,0 +1,143 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret -fplugin-arg-diagnostic_plugin_test_show_locus-color" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = ^[[32m^[[Kmyvar^[[m^[[K^[[01;35m^[[K.^[[m^[[K^[[34m^[[Kx^[[m^[[K;
+           ^[[32m^[[K~~~~~^[[m^[[K^[[01;35m^[[K^^[[m^[[K^[[34m^[[K~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = ^[[32m^[[Kfirst_function ()^[[m^[[K ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K;
+       ^[[32m^[[K~~~~~~~~~~~~~~~~~^[[m^[[K ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function ()
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K);
+        ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            consectetur, adipiscing, elit,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            sed, eiusmod, tempor,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            incididunt, ut, labore, et,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            dolore, magna, aliqua)
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+ ^[[m^[[K       ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                amet, consectetur,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                adipiscing, elit, sed,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                eiusmod, tempor, incididunt,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                ut, labore, et, dolore,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                magna, aliqua)^[[m^[[K);
+                                                 ^[[34m^[[K~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[K98.6f^[[m^[[K;
+             ^[[01;35m^[[K^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+             ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+                                                        ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = ^[[01;35m^[[Kx^[[m^[[K + ^[[32m^[[Ky^[[m^[[K
+        ^[[01;35m^[[KA^[[m^[[K   ^[[32m^[[KB
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
new file mode 100644
index 0000000..4917fad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -0,0 +1,300 @@
+/* { dg-options "-O" } */
+
+/* This plugin exercises the diagnostics-printing code.
+
+   The goal is to unit-test the range-printing code without needing any
+   correct range data within the compiler's IR.  We can't use any real
+   diagnostics for this, so we have to fake it, hence this plugin.
+
+   There are two test files used with this code:
+
+     diagnostic-test-show-locus-ascii-bw.c
+     ..........................-ascii-color.c
+
+   to exercise uncolored vs colored output by supplying plugin arguments
+   to hack in the desired behavior:
+
+     -fplugin-arg-diagnostic_plugin_test_show_locus-color
+
+   The test files contain functions, but the body of each
+   function is disabled using the preprocessor.  The plugin detects
+   the functions by name, and injects diagnostics within them, using
+   hard-coded locations relative to the top of each function.
+
+   The plugin uses a function "get_loc" below to map from line/column
+   numbers to source_location, and this relies on input_location being in
+   the same ordinary line_map as the locations in question.  The plugin
+   runs after parsing, so input_location will be at the end of the file.
+
+   This need for all of the test code to be in a single ordinary line map
+   means that each test file needs to have a very long line near the top
+   (potentially to cover the extra byte-count of colorized data),
+   to ensure that further very long lines don't start a new linemap.
+   This also means that we can't use macros in the test files.  */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+const pass_data pass_data_test_show_locus =
+{
+  GIMPLE_PASS, /* type */
+  "test_show_locus", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_test_show_locus : public gimple_opt_pass
+{
+public:
+  pass_test_show_locus(gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_test_show_locus, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_test_show_locus
+
+/* Given LINE_NUM and COL_NUM, generate a source_location in the
+   current file, relative to input_location.  This relies on the
+   location being expressible in the same ordinary line_map as
+   input_location (which is typically at the end of the source file
+   when this is called).  Hence the test files we compile with this
+   plugin must have an initial very long line (to avoid long lines
+   starting a new line map), and must not use macros.
+
+   COL_NUM uses the Emacs convention of 0-based column numbers.  */
+
+static source_location
+get_loc (unsigned int line_num, unsigned int col_num)
+{
+  /* Use input_location to get the relevant line_map */
+  const struct line_map_ordinary *line_map
+    = (const line_map_ordinary *)(linemap_lookup (line_table,
+						  input_location));
+
+  /* Convert from 0-based column numbers to 1-based column numbers.  */
+  source_location loc
+    = linemap_position_for_line_and_column (line_map,
+					    line_num, col_num + 1);
+
+  return loc;
+}
+
+/* Was "color" passed in as a plugin argument?  */
+static bool force_show_locus_color = false;
+
+/* We want to verify the colorized output of diagnostic_show_locus,
+   but turning on colorization for everything confuses "dg-warning" etc.
+   Hence we special-case it within this plugin by using this modified
+   version of default_diagnostic_finalizer, which, if "color" is
+   passed in as a plugin argument turns on colorization, but just
+   for diagnostic_show_locus.  */
+
+static void
+custom_diagnostic_finalizer (diagnostic_context *context,
+			     diagnostic_info *diagnostic)
+{
+  bool old_show_color = pp_show_color (context->printer);
+  if (force_show_locus_color)
+    pp_show_color (context->printer) = true;
+  diagnostic_show_locus (context, diagnostic);
+  pp_show_color (context->printer) = old_show_color;
+
+  pp_destroy_prefix (context->printer);
+  pp_newline_and_flush (context->printer);
+}
+
+/* Exercise the diagnostic machinery to emit various warnings,
+   for use by diagnostic-test-show-locus-*.c.
+
+   We inject each warning relative to the start of a function,
+   which avoids lots of hardcoded absolute locations.  */
+
+static void
+test_show_locus (function *fun)
+{
+  tree fndecl = fun->decl;
+  tree identifier = DECL_NAME (fndecl);
+  const char *fnname = IDENTIFIER_POINTER (identifier);
+  location_t fnstart = fun->function_start_locus;
+  int fnstart_line = LOCATION_LINE (fnstart);
+
+  diagnostic_finalizer (global_dc) = custom_diagnostic_finalizer;
+
+  /* Hardcode the "terminal width", to verify the behavior of
+     very wide lines.  */
+  global_dc->caret_max_width = 70;
+
+  if (0 == strcmp (fnname, "test_simple"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 15));
+      richloc.add_range (get_loc (line, 10), get_loc (line, 14));
+      richloc.add_range (get_loc (line, 16), get_loc (line, 16));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_simple_2"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 24));
+      richloc.add_range (get_loc (line, 6),
+			 get_loc (line, 22));
+      richloc.add_range (get_loc (line, 26),
+			 get_loc (line, 43));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_multiline"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 1, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line, 23));
+      richloc.add_range (get_loc (line + 1, 9),
+			 get_loc (line + 1, 26));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_many_lines"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 5, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line + 4, 65));
+      richloc.add_range (get_loc (line + 5, 9),
+			 get_loc (line + 10, 61));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a rich_location constructed directly from a
+     source_range where the range is larger than one character.  */
+  if (0 == strcmp (fnname, "test_richloc_from_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 16);
+      rich_location richloc (src_range);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a single-range location where the range starts
+     before the caret.  */
+  if (0 == strcmp (fnname, "test_caret_within_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 16);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 20);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a very wide line, where the information of interest
+     is beyond the width of the terminal (hardcoded above).  */
+  if (0 == strcmp (fnname, "test_very_wide_line"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 94);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 90);
+      src_range.m_finish = get_loc (line, 98);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of multiple carets.  */
+  if (0 == strcmp (fnname, "test_multiple_carets"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret_a = get_loc (line, 7);
+      location_t caret_b = get_loc (line, 11);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = 'A';
+      global_dc->caret_chars[1] = 'B';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+}
+
+unsigned int
+pass_test_show_locus::execute (function *fun)
+{
+  test_show_locus (fun);
+  return 0;
+}
+
+static gimple_opt_pass *
+make_pass_test_show_locus (gcc::context *ctxt)
+{
+  return new pass_test_show_locus (ctxt);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  for (int i = 0; i < argc; i++)
+    {
+      if (0 == strcmp (argv[i].key, "color"))
+	force_show_locus_color = true;
+    }
+
+  pass_info.pass = make_pass_test_show_locus (g);
+  pass_info.reference_pass_name = "ssa";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+		     &pass_info);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 39fab6e..941bccc 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -63,6 +63,9 @@ set plugin_test_list [list \
     { start_unit_plugin.c start_unit-test-1.c } \
     { finish_unit_plugin.c finish_unit-test-1.c } \
     { wide-int_plugin.c wide-int-test-1.c } \
+    { diagnostic_plugin_test_show_locus.c \
+	  diagnostic-test-show-locus-bw.c \
+	  diagnostic-test-show-locus-color.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7c1ab85..8cc1d87 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -29,6 +29,7 @@ load_lib libgloss.exp
 load_lib target-libpath.exp
 load_lib torture-options.exp
 load_lib fortran-modules.exp
+load_lib multiline.exp
 
 # We set LC_ALL and LANG to C so that we get the same error messages as expected.
 setenv LC_ALL C
diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 135f142..02009d8 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -289,7 +289,7 @@ default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     }
 
   if (set_locus)
-    text->set_location (0, DECL_SOURCE_LOCATION (t));
+    text->set_location (0, DECL_SOURCE_LOCATION (t), true);
 
   if (DECL_P (t))
     {
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 7cd1fe7..3c34d51 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3602,7 +3602,7 @@ void
 percent_K_format (text_info *text)
 {
   tree t = va_arg (*text->args_ptr, tree), block;
-  text->set_location (0, EXPR_LOCATION (t));
+  text->set_location (0, EXPR_LOCATION (t), true);
   gcc_assert (pp_ti_abstract_origin (text) != NULL);
   block = TREE_BLOCK (t);
   *pp_ti_abstract_origin (text) = NULL;
diff --git a/libcpp/errors.c b/libcpp/errors.c
index a33196e..c351c11 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,8 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, 0, _(msgid), ap);
+  rich_location richloc (src_loc);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
@@ -139,7 +140,9 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, column, _(msgid), ap);
+  rich_location richloc (src_loc);
+  richloc.override_column (column);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5eaea6b..a2bdfa0 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -573,9 +573,9 @@ struct cpp_callbacks
 
   /* Called to emit a diagnostic.  This callback receives the
      translated message.  */
-  bool (*error) (cpp_reader *, int, int, source_location, unsigned int,
+  bool (*error) (cpp_reader *, int, int, rich_location *,
 		 const char *, va_list *)
-       ATTRIBUTE_FPTR_PRINTF(6,0);
+       ATTRIBUTE_FPTR_PRINTF(5,0);
 
   /* Callbacks for when a macro is expanded, or tested (whether
      defined or not at the time) in #ifdef, #ifndef or "defined".  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index bc747c1..bd73780 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -118,6 +118,35 @@ typedef unsigned int linenum_type;
   libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+
+  /* Make a source_range from a source_location.  */
+  static source_range from_location (source_location loc)
+  {
+    source_range result;
+    result.m_start = loc;
+    result.m_finish = loc;
+    return result;
+  }
+};
+
 /* Memory allocation function typedef.  Works like xrealloc.  */
 typedef void *(*line_map_realloc) (void *, size_t);
 
@@ -1015,6 +1044,175 @@ typedef struct
   bool sysp;
 } expanded_location;
 
+/* Both gcc and emacs number source *lines* starting at 1, but
+   they have differing conventions for *columns*.
+
+   GCC uses a 1-based convention for source columns,
+   whereas Emacs's M-x column-number-mode uses a 0-based convention.
+
+   For example, an error in the initial, left-hand
+   column of source line 3 is reported by GCC as:
+
+      some-file.c:3:1: error: ...etc...
+
+   On navigating to the location of that error in Emacs
+   (e.g. via "next-error"),
+   the locus is reported in the Mode Line
+   (assuming M-x column-number-mode) as:
+
+     some-file.c   10%   (3, 0)
+
+   i.e. "3:1:" in GCC corresponds to "(3, 0)" in Emacs.  */
+
+/* Ranges are closed:
+   m_start is the first location within the range, and
+   m_finish is the last location within the range.  */
+struct location_range
+{
+  expanded_location m_start;
+  expanded_location m_finish;
+
+  /* Should a caret be drawn for this range?  Typically this is
+     true for the 0th range, and false for subsequent ranges,
+     but the Fortran frontend overrides this for rendering things like:
+
+       x = x + y
+           1   2
+       Error: Shapes for operands at (1) and (2) are not conformable
+
+     where "1" and "2" are notionally carets.  */
+  bool m_show_caret_p;
+  expanded_location m_caret;
+};
+
+/* A "rich" source code location, for use when printing diagnostics.
+   A rich_location has one or more ranges, each optionally with
+   a caret.   Typically the zeroth range has a caret; other ranges
+   sometimes have carets.
+
+   The "primary" location of a rich_location is the caret of range 0,
+   used for determining the line/column when printing diagnostic
+   text, such as:
+
+      some-file.c:3:1: error: ...etc...
+
+   Additional ranges may be added to help the user identify other
+   pertinent clauses in a diagnostic.
+
+   rich_location instances are intended to be allocated on the stack
+   when generating diagnostics, and to be short-lived.
+
+   Examples of rich locations
+   --------------------------
+
+   Example A
+   *********
+      int i = "foo";
+              ^
+   This "rich" location is simply a single range (range 0), with
+   caret = start = finish at the given point.
+
+   Example B
+   *********
+      a = (foo && bar)
+          ~~~~~^~~~~~~
+   This rich location has a single range (range 0), with the caret
+   at the first "&", and the start/finish at the parentheses.
+   Compare with example C below.
+
+   Example C
+   *********
+      a = (foo && bar)
+           ~~~ ^~ ~~~
+   This rich location has three ranges:
+   - Range 0 has its caret and start location at the first "&" and
+     end at the second "&".
+   - Range 1 has its start and finish at the "f" and "o" of "foo";
+     the caret is not flagged for display, but is perhaps at the "f"
+     of "foo".
+   - Similarly, range 2 has its start and finish at the "b" and "r" of
+     "bar"; the caret is not flagged for display, but is perhaps at the
+     "b" of "bar".
+   Compare with example B above.
+
+   Example D (Fortran frontend)
+   ****************************
+       x = x + y
+           1   2
+   This rich location has range 0 at "1", and range 1 at "2".
+   Both are flagged for caret display.  Both ranges have start/finish
+   equal to their caret point.  The frontend overrides the diagnostic
+   context's default caret character for these ranges.
+
+   Example E
+   *********
+      printf ("arg0: %i  arg1: %s arg2: %i",
+                               ^~
+              100, 101, 102);
+                   ~~~
+   This rich location has two ranges:
+   - range 0 is at the "%s" with start = caret = "%" and finish at
+     the "s".
+   - range 1 has start/finish covering the "101" and is not flagged for
+     caret printing; it is perhaps at the start of "101".  */
+
+class rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  rich_location (source_location loc);
+
+  /* Constructing from a source_range.  */
+  rich_location (source_range src_range);
+
+  /* Accessors.  */
+  source_location get_loc () const { return m_loc; }
+
+  source_location *get_loc_addr () { return &m_loc; }
+
+  void
+  add_range (source_location start, source_location finish,
+	     bool show_caret_p = false);
+
+  void
+  add_range (source_range src_range,
+	     bool show_caret_p = false);
+
+  void
+  add_range (location_range *src_range);
+
+  void
+  set_range (unsigned int idx, source_range src_range,
+	     bool show_caret_p, bool overwrite_loc_p);
+
+  unsigned int get_num_locations () const { return m_num_ranges; }
+
+  location_range *get_range (unsigned int idx)
+  {
+    linemap_assert (idx < m_num_ranges);
+    return &m_ranges[idx];
+  }
+
+  expanded_location lazily_expand_location ();
+
+  void
+  override_column (int column);
+
+public:
+  static const int MAX_RANGES = 3;
+
+protected:
+  source_location m_loc;
+
+  unsigned int m_num_ranges;
+  location_range m_ranges[MAX_RANGES];
+
+  bool m_have_expanded_location;
+  expanded_location m_expanded_location;
+};
+
 /* This is enum is used by the function linemap_resolve_location
    below.  The meaning of the values is explained in the comment of
    that function.  */
@@ -1158,4 +1356,13 @@ void linemap_dump (FILE *, struct line_maps *, unsigned, bool);
    specifies how many macro maps to dump.  */
 void line_table_dump (FILE *, struct line_maps *, unsigned int, unsigned int);
 
+/* The rich_location class requires a way to expand source_location instances.
+   We would directly use expand_location_to_spelling_point, which is
+   implemented in gcc/input.c, but we also need to use it for rich_location
+   within genmatch.c.
+   Hence we require client code of libcpp to implement the following
+   symbol.  */
+extern expanded_location
+linemap_client_expand_location_to_spelling_point (source_location );
+
 #endif /* !LIBCPP_LINE_MAP_H  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3d82e9b..a6fa782 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1752,3 +1752,133 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
       fprintf (stream, "\n");
     }
 }
+
+/* class rich_location.  */
+
+/* Construct a rich_location with location LOC as its initial range.  */
+
+rich_location::rich_location (source_location loc) :
+  m_loc (loc),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (loc, loc, true);
+  m_ranges[0].m_caret = lazily_expand_location ();
+}
+
+/* Construct a rich_location with source_range SRC_RANGE as its
+   initial range.  */
+
+rich_location::rich_location (source_range src_range)
+: m_loc (src_range.m_start),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (src_range, true);
+}
+
+/* Get an expanded_location for this rich_location's primary
+   location.  */
+
+expanded_location
+rich_location::lazily_expand_location ()
+{
+  if (!m_have_expanded_location)
+    {
+      m_expanded_location
+	= linemap_client_expand_location_to_spelling_point (m_loc);
+      m_have_expanded_location = true;
+    }
+
+  return m_expanded_location;
+}
+
+/* Set the column of the primary location.  */
+
+void
+rich_location::override_column (int column)
+{
+  lazily_expand_location ();
+  m_expanded_location.column = column;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_location start, source_location finish,
+			  bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  location_range *range = &m_ranges[m_num_ranges++];
+  range->m_start = linemap_client_expand_location_to_spelling_point (start);
+  range->m_finish = linemap_client_expand_location_to_spelling_point (finish);
+  range->m_caret = range->m_start;
+  range->m_show_caret_p = show_caret_p;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_range src_range, bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  add_range (src_range.m_start, src_range.m_finish, show_caret_p);
+}
+
+void
+rich_location::add_range (location_range *src_range)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  m_ranges[m_num_ranges++] = *src_range;
+}
+
+/* Add or overwrite the range given by IDX.  It must either
+   overwrite an existing range, or add one *exactly* on the end of
+   the array.
+
+   This is primarily for use by gcc when implementing diagnostic
+   format decoders e.g. the "+" in the C/C++ frontends, for handling
+   format codes like "%q+D" (which writes the source location of a
+   tree back into range 0 of the rich_location).
+
+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+			  bool show_caret_p, bool overwrite_loc_p)
+{
+  linemap_assert (idx < MAX_RANGES);
+
+  /* We can either overwrite an existing range, or add one exactly
+     on the end of the array.  */
+  linemap_assert (idx <= m_num_ranges);
+
+  location_range *locrange = &m_ranges[idx];
+  locrange->m_start
+    = linemap_client_expand_location_to_spelling_point (src_range.m_start);
+  locrange->m_finish
+    = linemap_client_expand_location_to_spelling_point (src_range.m_finish);
+
+  locrange->m_show_caret_p = show_caret_p;
+  if (overwrite_loc_p)
+    locrange->m_caret = locrange->m_start;
+
+  /* Are we adding a range onto the end?  */
+  if (idx == m_num_ranges)
+    m_num_ranges = idx + 1;
+
+  if (idx == 0 && overwrite_loc_p)
+    {
+      m_loc = src_range.m_start;
+      /* Mark any cached value here as dirty.  */
+      m_have_expanded_location = false;
+    }
+}
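
A small standalone sketch (not part of the patch) of the closed-range and
column conventions described in the line-map.h comments above. The names
column_range, range_width, to_gcc_column and render_underline are made up
for illustration and do not exist in GCC; columns here are plain ints
rather than source_location values.

```cpp
#include <cstddef>
#include <string>

// Hypothetical closed range of 0-based columns on a single source line.
// As with source_range in the patch, FINISH is the last column inside
// the range, not one past the end.
struct column_range
{
  int start;
  int finish;
};

// Number of columns covered by a closed range.
static int
range_width (column_range r)
{
  return r.finish - r.start + 1;
}

// Convert a 0-based (Emacs-style) column into a 1-based (GCC-style) one,
// mirroring the "col_num + 1" adjustment in the plugin's get_loc.
static int
to_gcc_column (int emacs_column)
{
  return emacs_column + 1;
}

// Build the annotation line for one range: spaces up to START, then '~'
// for every column in the range, except '^' at the caret column.
static std::string
render_underline (column_range r, int caret)
{
  std::string line (static_cast<std::size_t> (r.start), ' ');
  for (int col = r.start; col <= r.finish; col++)
    line += (col == caret) ? '^' : '~';
  return line;
}
```

With the values used by test_caret_within_proper_range (start 12, caret 16,
finish 20, all 0-based), this yields twelve spaces followed by "~~~~^~~~~",
and the caret maps to column 17 in GCC's numbering, which matches the
"17: test" in that test's dg-warning.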


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
@ 2015-09-25 20:42       ` Manuel López-Ibáñez
  2015-09-25 21:14         ` Manuel López-Ibáñez
  2015-09-27 14:19       ` Dodji Seketeli
  2015-12-29 20:55       ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) Mike Stump
  2 siblings, 1 reply; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-25 20:42 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On 25 September 2015 at 22:11, David Malcolm <dmalcolm@redhat.com> wrote:
>>
>> +  if (0)
>> +    show_ruler (context, line_width, m_x_offset);
>>
>> This should probably be removed from the final code to be committed.
>
> FWIW, the ruler is very helpful to me when debugging the locus-printing
> (e.g. when adding fix-it-hints), and if we remove that if (0) call, we
> get:
>
> warning: ‘void show_ruler(diagnostic_context*, int, int)’ defined but
> not used [-Wunused-function]
>
> which will break bootstrap, so perhaps it instead should be an option?
> "-fdiagnostics-show-ruler" or somesuch?
>
> I don't know that it would be helpful to end-users though.

Functions that are useful only for debugging GCC usually start with
debug_* and carry a special attribute annotation (grep ^debug_), which
prevents those kinds of warnings (and stops the optimizers from being
too smart and removing them).
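
A minimal standalone sketch of that pattern, assuming a GNU-compatible
compiler. The macro SKETCH_DEBUG_FUNCTION and the helpers make_ruler and
debug_show_ruler are invented for this example; they are not the patch's
show_ruler, and they do not use diagnostic_context.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a "keep this debug helper" annotation.
// The "used" attribute asks the compiler to emit the function even if
// nothing references it, which also avoids -Wunused-function.
#if defined (__GNUC__)
#define SKETCH_DEBUG_FUNCTION __attribute__ ((__used__))
#else
#define SKETCH_DEBUG_FUNCTION
#endif

// Build a ruler such as "123456789012" for a given width; each column
// shows the last digit of its 1-based column number.
static std::string
make_ruler (int width)
{
  std::string ruler;
  for (int col = 1; col <= width; col++)
    ruler += static_cast<char> ('0' + col % 10);
  return ruler;
}

// Debug-only helper, intended to be called by hand from a debugger.
SKETCH_DEBUG_FUNCTION static void
debug_show_ruler (int width)
{
  fprintf (stderr, "%s\n", make_ruler (width).c_str ());
}
```

With something along these lines the "if (0)" call site is no longer
needed just to keep the function referenced.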

Cheers,

Manuel.


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 20:42       ` Manuel López-Ibáñez
@ 2015-09-25 21:14         ` Manuel López-Ibáñez
  2015-09-25 22:10           ` Manuel López-Ibáñez
  2015-09-25 22:40           ` David Malcolm
  0 siblings, 2 replies; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-25 21:14 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On 25 September 2015 at 22:18, Manuel López-Ibáñez
<lopezibanez@gmail.com> wrote:
> On 25 September 2015 at 22:11, David Malcolm <dmalcolm@redhat.com> wrote:


   context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by
diagnostic_print_caret_line.  */

-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+  if (context->frontend_calls_diagnostic_print_caret_line_p)
+    {
+      /* The GCC < 6 routine. */
+      expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
+      expanded_location s1 = { };
+      /* Zero-initialized. This is checked later by
+     diagnostic_print_caret_line.  */
+
+      if (diagnostic_num_locations (diagnostic) >= 2)
+    s1 = diagnostic->message.m_richloc->get_range (1)->m_start;

-  diagnostic_print_caret_line (context, s0, s1,
-                   context->caret_chars[0],
-                   context->caret_chars[1]);
+      diagnostic_print_caret_line (context, s0, s1,
+                   context->caret_chars[0],
+                   context->caret_chars[1]);
+    }
+  else
+    /* The GCC >= 6 routine.  */
+    diagnostic_print_ranges (context, diagnostic);
 }


I haven't had time to look at the patch in detail, so please excuse me
if this is answered elsewhere.

Why do you need this hack? The whole point of moving Fortran to the
common machinery is to not have this duplication.

Can't the new code ever print a single caret without any ranges? Something like:

error: expected ';'
  }
   ^

If it can, then the function responsible for doing that can be called
by Fortran and it should replace diagnostic_print_caret_line.

Or is it that the new diagnostic_print_ranges cannot print multiple
carets in the same line? Like this:

error: error at (1) and (2)
  adfadfafd asdfdaffa
   1            2

If this is the case, then this is missing functionality that
diagnostic_print_caret_line already has and that was ready to be used
in C/C++. See the example at (O) here:
https://gcc.gnu.org/wiki/Better_Diagnostics

In my mind, it should be possible for Fortran to pass to the
diagnostics machinery two locations with range width 1 (or 0,
depending on how you want to represent a range that covers exactly one
char) and get a caret line like the example above. Why is this not
possible?
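
To make the request concrete, here is a rough standalone sketch of the
behaviour in question. caret_point and render_caret_line are hypothetical
names for this example only; this is not the diagnostic_print_ranges or
diagnostic_print_caret_line code, and columns are plain 0-based ints.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical single-character location: a 0-based column plus the
// character to draw there ('^', or '1' and '2' for Fortran).
struct caret_point
{
  int column;
  char glyph;
};

// Build one annotation line holding every caret, padded with spaces.
static std::string
render_caret_line (const std::vector<caret_point> &points)
{
  std::string line;
  for (const caret_point &p : points)
    {
      if (p.column < 0)
	continue;  // Ignore points that are not on the line.
      std::size_t col = static_cast<std::size_t> (p.column);
      if (line.size () <= col)
	line.resize (col + 1, ' ');
      line[col] = p.glyph;
    }
  return line;
}
```

Passing the points (3, '1') and (16, '2') gives a line with "1" in column 3
and "2" in column 16, i.e. two width-1 locations rendered on the same line.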

Cheers,

Manuel.


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 21:14         ` Manuel López-Ibáñez
@ 2015-09-25 22:10           ` Manuel López-Ibáñez
  2015-09-26  4:51             ` David Malcolm
  2015-09-25 22:40           ` David Malcolm
  1 sibling, 1 reply; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-25 22:10 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+              bool show_caret_p, bool overwrite_loc_p)

I do not understand when this show_caret_p is used by Fortran, given
the diagnostic_show_locus code mentioned earlier.

Related to this:

inline void set_location (unsigned int idx, location_t loc, bool caret_p)

is always called with the last parameter 'true' (boolean parameters
are almost always bad API). Do you really need this parameter?
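
For illustration only, one common alternative to a bare bool is a small
enum, so that call sites say what they mean. Everything in this sketch
(caret_display, range_sketch, make_range, shows_caret) is hypothetical and
is not proposed API from the patch.

```cpp
// Hypothetical replacement for a "bool show_caret_p" argument.
enum class caret_display
{
  hide,
  show
};

// Hypothetical simplified range: closed interval of columns plus the
// caret policy.
struct range_sketch
{
  int start;
  int finish;
  caret_display caret;
};

// Construct a range; the third argument is self-describing at the call
// site, unlike a bare true/false.
static range_sketch
make_range (int start, int finish, caret_display caret)
{
  range_sketch r = { start, finish, caret };
  return r;
}

// Should a caret be drawn for this range?
static bool
shows_caret (const range_sketch &r)
{
  return r.caret == caret_display::show;
}
```

A call such as make_range (9, 26, caret_display::hide) then reads more
clearly than make_range (9, 26, false).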

+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */

If we got rid of '+' we would not need this extra work. Also '+'
breaks #pragma diagnostics. Not the fault of your patch, but it just
shows that technical debt keeps accumulating.
https://gcc.gnu.org/wiki/Partial_Transitions

Cheers,

Manuel.


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 21:14         ` Manuel López-Ibáñez
  2015-09-25 22:10           ` Manuel López-Ibáñez
@ 2015-09-25 22:40           ` David Malcolm
  2015-09-26  6:41             ` Manuel López-Ibáñez
  1 sibling, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-25 22:40 UTC (permalink / raw)
  To: Manuel López-Ibáñez
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On Fri, 2015-09-25 at 22:39 +0200, Manuel López-Ibáñez wrote:
> On 25 September 2015 at 22:18, Manuel López-Ibáñez
> <lopezibanez@gmail.com> wrote:
> > On 25 September 2015 at 22:11, David Malcolm <dmalcolm@redhat.com> wrote:
> 
> 
>    context->last_location = diagnostic_location (diagnostic, 0);
> -  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
> -  expanded_location s1 = { };
> -  /* Zero-initialized. This is checked later by
> diagnostic_print_caret_line.  */
> 
> -  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
> -    s1 = diagnostic_expand_location (diagnostic, 1);
> +  if (context->frontend_calls_diagnostic_print_caret_line_p)
> +    {
> +      /* The GCC < 6 routine. */
> +      expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
> +      expanded_location s1 = { };
> +      /* Zero-initialized. This is checked later by
> +     diagnostic_print_caret_line.  */
> +
> +      if (diagnostic_num_locations (diagnostic) >= 2)
> +    s1 = diagnostic->message.m_richloc->get_range (1)->m_start;
> 
> -  diagnostic_print_caret_line (context, s0, s1,
> -                   context->caret_chars[0],
> -                   context->caret_chars[1]);
> +      diagnostic_print_caret_line (context, s0, s1,
> +                   context->caret_chars[0],
> +                   context->caret_chars[1]);
> +    }
> +  else
> +    /* The GCC >= 6 routine.  */
> +    diagnostic_print_ranges (context, diagnostic);
>  }
> 
> 
> I haven't had time to look at the patch in detail, so please excuse me
> if this is answered elsewhere.

(nods; the discussion has gotten large).

> Why do you need this hack? The whole point of moving Fortran to the
> common machinery is to not have this duplication.

I attempted to address this in:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01700.html
where I said:

  * The Fortran frontend has its own logic for printing multiple
  locations, repeatedly calling in to diagnostic_print_caret_line.
  I hope the new printing logic is suitable for use by Fortran, but I
  wanted to keep the job of "introducing range-capable printing logic"
  separate from that of "updating Fortran diagnostics to use it",
  since I'm not very familiar with Fortran, and what is desirable
  there.  Hence to faithfully preserve the existing behavior, I
  introduced a flag into the diagnostic_context:
    "frontend_calls_diagnostic_print_caret_line_p"
  which is set by the Fortran frontend, and makes diagnostic_show_locus
  use the existing printing logic.  Hopefully that's acceptable,
  say, as a migration path.

My recollection is that I saw that the Fortran frontend has logic for
calling into diagnostic_print_caret_line, noticed that the fortran
testsuite has dg- assertions about finding specific messages, and I got
worried that they embed assumptions about how the old printer worked.
Hence I wanted to avoid touching that for the first version, and so in
this patch it's a hybrid of the old Fortran printing code with the new
representation for multiple locations.
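
To make the hybrid concrete, here is a minimal sketch of the dispatch, assuming simplified stand-in types. Only the flag name frontend_calls_diagnostic_print_caret_line_p comes from the patch; sketch_context, sketch_show_locus and the two stub printers are hypothetical and only report which path was taken.

```cpp
#include <string>

// Hypothetical stand-in for diagnostic_context; only the flag matters here.
struct sketch_context
{
  bool frontend_calls_diagnostic_print_caret_line_p = false;
};

// Stub printers (assumptions for illustration): they just name the path.
static std::string sketch_legacy_printer () { return "legacy"; }
static std::string sketch_range_printer () { return "ranges"; }

// Mirrors the shape of the branch in the quoted hunk: a frontend that
// sets the flag keeps the old printing logic, everyone else gets the
// new range-capable printer.
std::string
sketch_show_locus (const sketch_context &ctx)
{
  if (ctx.frontend_calls_diagnostic_print_caret_line_p)
    return sketch_legacy_printer ();   /* The GCC < 6 routine.  */
  return sketch_range_printer ();      /* The GCC >= 6 routine.  */
}
```

With the flag left at its default the sketch takes the new path; a frontend that sets it (Fortran, in the patch) stays on the old one.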

Maybe that's a cop-out.  Would you prefer that the patch goes all the
way, and that I attempt to eliminate all calls to
diagnostic_print_caret_line from the Fortran FE, and eliminate the old
implementation?  (either now, or as a followup patch?)  I may need
assistance with that; I suspect that some of the dg- assertions in the
Fortran test suite may need updating.

> Can't the new code print one caret without ranges ever? Something like:
> 
> error: expected ';'
>   }
>    ^

It can handle that just fine.  See the examples in line-map.h in the
patch for the kinds of things that a rich_location can represent.


> If it can, then the function responsible for doing that can be called
> by Fortran and it should replace diagnostic_print_caret_line.
> 
> Or is it that the new diagnostic_print_ranges cannot print multiple
> carets in the same line? Like this
> 
> error: error at (1) and (2)
>   adfadfafd asdfdaffa
>    1            2

It can do that too; again see the big comment in line-map.h

> If this is the case, this is a missing functionality that
> diagnostic_print_caret_line already has and that was ready to be used
> in C/C++.

We're good, I believe.

> See example at  (O) here:
> https://gcc.gnu.org/wiki/Better_Diagnostics

Pasting it here:

foo.cc: 3:17,3:22: warning: missing braces around initializer for ‘int
[2]’ [-Wmissing-braces]
int a[2][2] = { 0, 1 , 2, 3 }; // { dg-warning "" }
                ^    ^ 
                {    }


It can support printing carets at both locations.  For the braces line,
I'd prefer to do those by explicitly adding a fixit-hints API.  I have a
followup patch to do that; see
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00732.html
for an earlier version of said patch, which in fact uses that example in
the unit tests "test_fixit_insert".  Although to be fair, I did that
with a single range rather than a pair of carets:

    int a[2][2] = { 0, 1 , 2, 3 };
                    ^~~~
                    {   }

The code supports both approaches (I feel the latter is slightly more
user-friendly as it's more clearly identifying the initializer for
int[2]  ...but this is veering off into bike-shed territory).

> In my mind, it should be possible for Fortran to pass to the
> diagnostics machinery two locations with range width 1 (or 0,
> depending how you want to represent a range that covers exactly one
> char) and get a caret line like the example above. Why is this not
> possible?

It is possible.

> Cheers,
> 
> Manuel.

Thanks for the comments

Dave

[as noted before, I'm about to disappear on vacation for 10 days, so
replies to followups may be delayed]


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 22:10           ` Manuel López-Ibáñez
@ 2015-09-26  4:51             ` David Malcolm
  2015-09-26  6:18               ` Manuel López-Ibáñez
  0 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-09-26  4:51 UTC (permalink / raw)
  To: Manuel López-Ibáñez
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On Fri, 2015-09-25 at 23:13 +0200, Manuel López-Ibáñez wrote:
> +   If SHOW_CARET_P is true, then the range should be rendered with
> +   a caret at its starting location.  This
> +   is for use by the Fortran frontend, for implementing the
> +   "%C" and "%L" format codes.  */
> +
> +void
> +rich_location::set_range (unsigned int idx, source_range src_range,
> +              bool show_caret_p, bool overwrite_loc_p)
> 
> I do not understand when this show_caret_p is used by Fortran, given
> the diagnostic_show_locus code mentioned earlier.

The patch is something of a hybrid: on the one hand it's using the new
rich_location class for storing multiple locations for a diagnostic (and
this replaces the existing way we did this in struct text_info), but on
the other hand, for Fortran, it's using the old printing code.

rich_location::set_range exists to ensure that the %C and %L codes used
by Fortran (and "+" in the C family of FEs) can write back into the
rich_location instance, faithfully emulating the old code that wrote
back to
struct text_info's:
  location_t locations[MAX_LOCATIONS_PER_MESSAGE];

(that array is replaced in the patch by a rich_location *, pointing back
at the rich_location in the diagnostic_info).
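
A minimal sketch of that write-back, assuming heavily simplified stand-ins rather than the real classes: sketch_rich_location, sketch_text_info and sketch_decode_location_code are hypothetical names, and the real set_range takes more parameters than shown here.

```cpp
#include <cassert>
#include <vector>

typedef unsigned int sketch_location_t;   // stand-in for location_t

struct sketch_range
{
  sketch_location_t start;
  sketch_location_t finish;
  bool show_caret_p;
};

// Simplified stand-in for rich_location: a growable list of ranges.
class sketch_rich_location
{
public:
  explicit sketch_rich_location (sketch_location_t loc)
  {
    m_ranges.push_back ({loc, loc, true});
  }

  // Overwrite range IDX, or append when IDX is one past the end.
  void set_range (unsigned int idx, sketch_range r)
  {
    assert (idx <= m_ranges.size ());
    if (idx == m_ranges.size ())
      m_ranges.push_back (r);
    else
      m_ranges[idx] = r;
  }

  unsigned int num_ranges () const
  {
    return static_cast<unsigned int> (m_ranges.size ());
  }

  const sketch_range &get_range (unsigned int idx) const
  {
    return m_ranges[idx];
  }

private:
  std::vector<sketch_range> m_ranges;
};

// Stand-in for text_info: it holds a non-const pointer because
// formatting the message may modify the location.
struct sketch_text_info
{
  sketch_rich_location *m_richloc;
};

// Emulates what a "%C"/"%L"-style format code does while the message
// is being formatted: it writes a location back through the pointer.
void
sketch_decode_location_code (sketch_text_info *text, unsigned int idx,
                             sketch_location_t loc)
{
  text->m_richloc->set_range (idx, {loc, loc, true});
}
```

The point of the sketch is only the direction of the data flow: the decoder runs during formatting, yet it changes the ranges that the printer will later consume.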

> Related to this:
> 
> inline void set_location (unsigned int idx, location_t loc, bool caret_p)
> 
> is always called with the last parameter 'true' (boolean parameters
> are almost always bad API). Do you really need this parameter?

Ah, OK.  Maybe not there.

> +/* Overwrite the range within this text_info's rich_location.
> +   For use e.g. when implementing "+" in client format decoders.  */
> 
> If we got rid of '+' we would not need this extra work. Also '+'
> breaks #pragma diagnostics. Not the fault of your patch, but it just
> shows that technical debt keeps accumulating.
> https://gcc.gnu.org/wiki/Partial_Transitions

(nods)   That "+" thing was one of the surprises I ran into when working
on this, and is the reason that it isn't a:
  const rich_location *
but just a:
  rich_location *
given that formatting the diagnostic text can lead to the location being
modified.  I'm just emulating/supporting the existing behavior.

Dave



* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-26  4:51             ` David Malcolm
@ 2015-09-26  6:18               ` Manuel López-Ibáñez
  0 siblings, 0 replies; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-26  6:18 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On 25 September 2015 at 23:24, David Malcolm <dmalcolm@redhat.com> wrote:
> On Fri, 2015-09-25 at 23:13 +0200, Manuel López-Ibáñez wrote:
>> +   If SHOW_CARET_P is true, then the range should be rendered with
>> +   a caret at its starting location.  This
>> +   is for use by the Fortran frontend, for implementing the
>> +   "%C" and "%L" format codes.  */
>> +
>> +void
>> +rich_location::set_range (unsigned int idx, source_range src_range,
>> +              bool show_caret_p, bool overwrite_loc_p)
>>
>> I do not understand when this show_caret_p is used by Fortran, given
>> the diagnostic_show_locus code mentioned earlier.

[...]
> rich_location::set_range exists to ensure that the %C and %L codes used
> by Fortran (and "+" in the C family of FEs) can write back into the
> rich_location instance, faithfully emulating the old code that wrote
> back to
> struct text_info's:
>   location_t locations[MAX_LOCATIONS_PER_MESSAGE];

Why can't Fortran use text->set_location like the other FEs? This way
you would not need set_range at all. In fact, you do:

+    source_range range
+      = source_range::from_location (
+          linemap_position_for_loc_and_offset (line_table,
+                           loc->lb->location,
+                           offset));
+    text->set_range (loc_num, range, true);

But I guess this doesn't actually create a range like ^~~~, just a single ^.

The other issue that confuses me is that show_caret_p is always true
when reaching this function via the pretty-printer. Thus, show_caret_p
is also used by C/C++. In fact, I'm not sure when it can be false.

Cheers,

Manuel.


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 22:40           ` David Malcolm
@ 2015-09-26  6:41             ` Manuel López-Ibáñez
  0 siblings, 0 replies; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-09-26  6:41 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers, Mike Stump, Rainer Orth

On 25 September 2015 at 23:15, David Malcolm <dmalcolm@redhat.com> wrote:
> My recollection is that I saw that the Fortran frontend has logic for
> calling into diagnostic_print_caret_line, noticed that the fortran
> testsuite has dg- assertions about finding specific messages, and I got
> worried that they embed assumptions about how the old printer worked.
> Hence I wanted to avoid touching that for the first version, and so in
> this patch it's a hybrid of the old Fortran printing code with the new
> representation for multiple locations.

It is quite simple, once you understand the logic. Fortran has three
types of output:

(a) #     [name]:[locus]:
    #
    #        some code
    #              1
    #     Error: Some error at (1)

which can call the same function used by other FEs to print the caret
line (by "caret line" I mean the line that contains the caret
character/ranges, 1 in this case).

(b) #     [name]:[locus]:
    #
    #       some code and some more code
    #              1       2
    #     Error: Some error at (1) and (2)


which according to what you explained should also be possible by
calling diagnostic_show_locus with the appropriate location info and

(c) #     [name]:[locus]:
    #
    #       some code
    #              1
    #     [name]:[locus2]:
    #
    #       some other code
    #         2
    #     Error: Some error at (1) and (2)
    # or

which was implemented by calling diagnostic_show_locus with just the
location of 1, then calling diagnostic_print_caret_line with just the
expanded_location of 2. I could have just called diagnostic_show_locus
also to print 2 by overriding diagnostic->location[0] =
diagnostic->location[1] and caret_char[0] = caret_char[1], but that
seemed a bit hackish and more expensive (but perhaps less confusing?).

If you have a function that you can call with one or more
location_t/expanded_location  (or something that can be converted from
a location_t) and pass explicitly the caret_char, then you just need
to call that function with the right parameters to get the second part
of (c). Otherwise, you may simply temporarily do caret_char[0] =
caret_char[1], before calling the same function that prints the
caret-line for (a).
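
As a rough illustration of the three cases, here is a minimal sketch of a caret-line builder that takes the caret character explicitly. Everything in it is hypothetical (sketch_caret, sketch_caret_line, 1-based columns); it is not the interface of diagnostic_print_caret_line or of the new printer.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct sketch_caret
{
  int column;        // 1-based column within the source line
  char caret_char;   // e.g. '^', or '1' / '2' for Fortran-style output
};

// Build the caret line for SOURCE_LINE: spaces everywhere except at the
// requested columns, which receive their own caret character.
std::string
sketch_caret_line (const std::string &source_line,
                   const std::vector<sketch_caret> &carets)
{
  std::string out;
  for (const sketch_caret &c : carets)
    {
      assert (c.column >= 1
              && c.column <= static_cast<int> (source_line.size ()));
      if (static_cast<int> (out.size ()) < c.column)
        out.resize (c.column, ' ');
      out[c.column - 1] = c.caret_char;
    }
  return out;
}
```

Cases (a) and (b) are one call with one or two entries. Case (c) is two calls, one per source line, the second passing '2' as the caret character, which is the effect of temporarily doing caret_char[0] = caret_char[1].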

> Maybe that's a cop-out.  Would you prefer that the patch goes all the
> way, and that I attempt to eliminate all calls to
> diagnostic_print_caret_line from the Fortran FE, and eliminate the old
> implementation?  (either now, or as a followup patch?)  I may need
> assistance with that; I suspect that some of the dg- assertions in the
> Fortran test suite may need updating.

There is only one call! I just think this hack is really not necessary
(in fact, it seems more complicated than the alternatives outlined
above). And I'm afraid that once it goes in, it will stay there
forever. You are in a far better position than the Fortran devs to
understand how to call your new interfaces to get the output you
desire.

Cheers,

Manuel.


* Re: [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands
  2015-09-25 17:22   ` Jeff Law
@ 2015-09-27  1:29     ` Bernhard Reutner-Fischer
  0 siblings, 0 replies; 83+ messages in thread
From: Bernhard Reutner-Fischer @ 2015-09-27  1:29 UTC (permalink / raw)
  To: Jeff Law, David Malcolm, gcc-patches

On September 25, 2015 6:52:37 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 09/22/2015 03:26 PM, David Malcolm wrote:
>> This patch is essentially identical to v1 here:
>>    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00729.html
>> The only change is in the ChangeLog, moving the libgo.exp
>> ChangeLog entry into gcc/testsuite/ChangeLog, analogous to
>> where Ian put it when introducing the file in r167407.
>>
>> OK for trunk?
>>
>> Blurb from v1 follows:
>>
>> This patch adds an easy way to write tests for expected multiline
>> output.  For example we can test carets and underlines for
>> a particular diagnostic with:
>>
>> /* { dg-begin-multiline-output "" }
>>   typedef struct _GMutex GMutex;
>>                  ^~~~~~~
>>     { dg-end-multiline-output "" } */
>>
>> It is used extensively by the rest of the patch kit.
>>
>> multiline.exp is used by prune.exp; hence we need to load it before
>> prune.exp via *load_gcc_lib* for the testsuites of the various
>> non-"gcc" support libraries (e.g. boehm-gc).
>>
>> gcc/testsuite/ChangeLog:
>> 	* lib/multiline.exp: New file.
>> 	* lib/prune.exp: Load multiline.exp.
>> 	(prune_gcc_output): Call into multiline.exp to handle any
>> 	multiline output directives.
>> 	* lib/libgo.exp: Load multiline.exp before prune.exp, using
>> 	load_gcc_lib.
>>
>> boehm-gc/ChangeLog:
>> 	* testsuite/lib/boehm-gc.exp: Load multiline.exp before
>> 	prune.exp, using load_gcc_lib.
>>
>> libatomic/ChangeLog:
>> 	* testsuite/lib/libatomic.exp: Load multiline.exp before
>> 	prune.exp, using load_gcc_lib.
>>
>> libgomp/ChangeLog:
>> 	* testsuite/lib/libgomp.exp: Load multiline.exp before prune.exp,
>> 	using load_gcc_lib.
>>
>> libitm/ChangeLog:
>> 	* testsuite/lib/libitm.exp: Load multiline.exp before prune.exp,
>> 	using load_gcc_lib.
>>
>> libvtv/ChangeLog:
>> 	* testsuite/lib/libvtv.exp: Load multiline.exp before prune.exp,
>> 	using load_gcc_lib.
>This stalled due to the dejagnu version discussion, which itself has 
>stalled :(
>
>I think the only issue was the loading of prune.exp and until we've
>jumped to the latest dejagnu, using load_gcc_lib is the approved way to
>deal with that problem.
>
>Soooo.
>
>Approved.  Hopefully we'll be able to clean up the load_gcc_lib mess in
>the near future, but I don't see a good reason to continue to hold up
>this patch.

Indeed, didn't mean to stall that one.
It's just seeing folks scratching their head over solved non-problems is something I cannot easily overcome. Bad habit, I know. I fear I'll never learn it ;)

Cheers,
>
>jeff



* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
  2015-09-25 20:42       ` Manuel López-Ibáñez
@ 2015-09-27 14:19       ` Dodji Seketeli
  2015-10-12 15:45         ` [PATCH] v4 of diagnostic_show_locus and rich_location David Malcolm
  2015-12-29 20:55       ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) Mike Stump
  2 siblings, 1 reply; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-27 14:19 UTC (permalink / raw)
  To: David Malcolm
  Cc: gcc-patches, Jason Merrill, Tobias Burnus, Joseph S. Myers,
	Manuel López-Ibáñez

[Note to libcpp, C, and Fortran maintainers: we still need your input :-)]

Hello,

David Malcolm <dmalcolm@redhat.com> writes:

[...]

> Here's the revised comment I put in the attached patch:

[...]

> +   The class caches the lookup of the color codes for the above.
> +
> +   The class also has responsibility for tracking which of the above is
> +   active, filtering out unnecessary changes.  This allows layout::print_line
> +   to simply request a colorization code for *every* character it prints
> +   thorough this class, and have the filtering be done for it here.

You probably meant "*through* this class" ?

> */

> Hopefully that comment explains the possible states the colorizer can
> have.

Yes it does, great comment, thank you.


> FWIW I have a follow-up patch to add support for fix-it hints, so they
> might be another kind of colorization state.
> (see https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00732.html for the
> earlier version of said patch, in v1 of the kit).

Yeah, I'll comment on that one separately.

>> Also, I am thinking that there should maybe be a layout::state type,
>> which would have two notional properties (for now): range_index and
>> draw_caret_p. So that this function:
>> 
>> +bool
>> +layout::get_state_at_point (/* Inputs.  */
>> +			    int row, int column,
>> +			    int first_non_ws, int last_non_ws,
>> +			    /* Outputs.  */
>> +			    int *out_range_idx,
>> +			    bool *out_draw_caret_p)
>> 
>> Would take just one output parameter, e.g, a reference to
>> layout::state.
>
> Fixed, though I called it "struct point_state", given that it's coming
> from get_state_at_point.  I passed it by pointer, since AFAIK our coding
> standards don't yet approve of the use of references in the codebase
> (outside of places where we need them e.g. container classes).

Great.  Thanks.
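
For readers following along, a minimal sketch of the single-output-parameter shape, assuming a much simplified signature: the real layout::get_state_at_point also takes a row and whitespace bounds, and sketch_range and the column-only lookup are hypothetical.

```cpp
#include <cassert>

// Hypothetical mirror of the "struct point_state" idea: both outputs
// travel together in one struct.
struct sketch_point_state
{
  int range_idx;       // index of the range covering the point, or -1
  bool draw_caret_p;   // true if a caret should be drawn at the point
};

struct sketch_range
{
  int start_col;
  int finish_col;
  int caret_col;
  bool show_caret_p;
};

// Return true and fill *OUT if COLUMN falls within one of RANGES.
// OUT is passed by pointer rather than by reference.
bool
sketch_get_state_at_point (const sketch_range *ranges, int num_ranges,
                           int column, sketch_point_state *out)
{
  assert (out);
  for (int i = 0; i < num_ranges; i++)
    if (column >= ranges[i].start_col && column <= ranges[i].finish_col)
      {
        out->range_idx = i;
        out->draw_caret_p
          = ranges[i].show_caret_p && column == ranges[i].caret_col;
        return true;
      }
  out->range_idx = -1;
  out->draw_caret_p = false;
  return false;
}
```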

>
> I also added a unit test for a rich_location with two caret locations
> (mimicking one of the Fortran examples), to give us coverage for this
> case:
>
> +void test_multiple_carets (void)
> +{
> +#if 0
> +   x = x + y /* { dg-warning "8: test" } */
> +/* { dg-begin-multiline-output "" }
> +    x = x + y
> +        A   B
> +   { dg-end-multiline-output "" } */
> +#endif
> +}
>
> where the "A" and "B" as caret chars are coming from new code in the
> show_locus unittest plugin.

Yeah, saw that.  Excellent, thanks.

[...]

>> +  if (0)
>> +    show_ruler (context, line_width, m_x_offset);
>> 
>> This should probably be removed from the final code to be committed.
>
> FWIW, the ruler is very helpful to me when debugging the locus-printing
> (e.g. when adding fix-it-hints), and if we remove that if (0) call, we
> get:
>
> warning: ‘void show_ruler(diagnostic_context*, int, int)’ defined but
> not used [-Wunused-function]
>
> which will break bootstrap, so perhaps it instead should be an option?
> "-fdiagnostics-show-ruler" or somesuch?
>
> I don't know that it would be helpful to end-users though.
>
> I'd prefer to just keep it in the code with the
>   if (0)
> as-is, since it's useful "scaffolding" for hacking on the code.
>

OK, I understand; though, as Manuel noted elsewhere, you might rename
that function debug_show_ruler and declare it as:

    DEBUG_FUNCTION static void
    debug_show_ruler (diagnostic_context *context, int max_width, int x_offset)
    {
      /* ...  */
    }
to comply with what is generally done in the compiler.
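
A minimal sketch of the idea, under stated assumptions: SKETCH_DEBUG_FUNCTION is a hypothetical stand-in that expands to the "used" attribute on GNU-compatible compilers, which is enough to keep an otherwise uncalled static function from triggering -Wunused-function. The ruler body is illustrative only and returns a string instead of printing through the diagnostic_context.

```cpp
#include <string>

// Hypothetical stand-in for a "keep this debug helper" marker.
#if defined (__GNUC__)
# define SKETCH_DEBUG_FUNCTION __attribute__ ((__used__))
#else
# define SKETCH_DEBUG_FUNCTION
#endif

// Build a units ruler such as "1234567890123", MAX_WIDTH columns long,
// indented by X_OFFSET spaces.  Nothing needs to call it: it exists to
// be invoked by hand while debugging.
SKETCH_DEBUG_FUNCTION static std::string
sketch_debug_ruler (int max_width, int x_offset)
{
  std::string ruler (x_offset > 0 ? x_offset : 0, ' ');
  for (int column = 1; column <= max_width; column++)
    ruler += static_cast<char> ('0' + column % 10);
  return ruler;
}
```

With this shape the "if (0)" call site is no longer needed just to silence the warning.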

[...]

>> +/* Get the column beyond the rightmost one that could contain a caret or
>> +   range marker, given that we stop rendering at trailing whitespace.  */
>> +
>> +int
>> +layout::get_x_bound_for_row (int row, int caret_column,
>> +			     int last_non_ws)
>> 
>> Please describe what the parameters mean here, especially last_non_ws.
>> I had to read its code to know that last_non_ws was the *column* of
>> the last non white space character.
>
> I renamed it to "last_non_ws_column", and fleshed out the comment

OK.

[...]

>>  void
>>  diagnostic_show_locus (diagnostic_context * context,
>>  		       const diagnostic_info *diagnostic)
>> @@ -75,16 +710,25 @@ diagnostic_show_locus (diagnostic_context * context,
>>      return;
>> 
>> +      /* The GCC 5 routine. */
>> 
>> I'd say the GCC <= 5 routine ;-)
>
>> +  else
>> +    /* The GCC 6 routine.  */
>> 
>> And here, the GCC > 5 routine.
>
> Changed to "GCC < 6" and "GCC >= 6", on the pedantic grounds that e.g.
> 5.1 > 5

OK.

>
>> I would be surprised to see this patch in particular incur any
>> noticeable increase in time and space consumption, but, have you noticed
>> anything related to that during bootstrap?
>
> I hadn't noticed it, but I wasn't timing.  I'll have a look.

Ok, thanks.

> One possible nit here is that the patch expands locations when
> constructing rich_location instances, and it does that for warnings
> before the logic to ignore them.  So there may be some extra calls there
> that aren't present in trunk, for discarded warnings.  I don't expect
> that to affect the speed of the compiler though (I expect it to be lost
> in the noise).

Fair enough.

> OK for trunk if it passes bootstrap/regrtest?

The core diagnostics bits are IMHO in good shape.  I'd like to see the
discussion with Manuel be resolved before it goes in, though.  And we
still need the "Go" from the FE maintainers.

So I am continuing the discussion with Manuel below.

Manuel López-Ibáñez <lopezibanez@gmail.com> writes:

> On 25 September 2015 at 23:15, David Malcolm <dmalcolm@redhat.com> wrote:
>> My recollection is that I saw that the Fortran frontend has logic for
>> calling into diagnostic_print_caret_line, noticed that the fortran
>> testsuite has dg- assertions about finding specific messages, and I got
>> worried that they embed assumptions about how the old printer worked.
>> Hence I wanted to avoid touching that for the first version, and so in
>> this patch it's a hybrid of the old Fortran printing code with the new
>> representation for multiple locations.
>
> It is quite simple, once you understand the logic. Fortran has three
> types of output:
>
> (a) #     [name]:[locus]:
>     #
>     #        some code
>     #              1
>     #     Error: Some error at (1)
>
> which can call the same function used by other FEs to print the caret
> line (by "caret line" I mean the line that contains the caret
> character/ranges, 1 in this case).
>
> (b) #     [name]:[locus]:
>     #
>     #       some code and some more code
>     #              1       2
>     #     Error: Some error at (1) and (2)
>
>
> which according to what you explained should also be possible by
> calling diagnostic_show_locus with the appropriate location info and
>
> (c) #     [name]:[locus]:
>     #
>     #       some code
>     #              1
>     #     [name]:[locus2]:
>     #
>     #       some other code
>     #         2
>     #     Error: Some error at (1) and (2)
>     # or
>
> which was implemented by calling diagnostic_show_locus with just the
> location of 1, then calling diagnostic_print_caret_line with just the
> expanded_location of 2. I could have just called diagnostic_show_locus
> also to print 2 by overriding diagnostic->location[0] =
> diagnostic->location[1] and caret_char[0] = caret_char[1], but that
> seemed a bit hackish and more expensive (but perhaps less confusing?).
>
> If you have a function that you can call with one or more
> location_t/expanded_location  (or something that can be converted from
> a location_t) and pass explicitly the caret_char, then you just need
> to call that function with the right parameters to get the second part
> of (c). Otherwise, you may simply temporarily do caret_char[0] =
> caret_char[1], before calling the same function that prints the
> caret-line for (a).
>
>> Maybe that's a cop-out.  Would you prefer that the patch goes all the
>> way, and that I attempt to eliminate all calls to
>> diagnostic_print_caret_line from the Fortran FE, and eliminate the old
>> implementation?  (either now, or as a followup patch?)  I may need
>> assistance with that; I suspect that some of the dg- assertions in the
>> Fortran test suite may need updating.
>
> There is only one call! I just think this hack is really not necessary
> (in fact, it seems more complicated than the alternatives outlined
> above).

I agree that David's approach of adding an impedance adaptation layer in
gcc/diagnostic-show-locus.c to keep the Fortran FE unchanged *for now*
complicates diagnostic-show-locus.c.  But then the benefit is to
break-up the tasks at hand in smaller (less coupled) tasks, making the
whole process more manageable.

IOW, I think the Fortran FE can then be updated later in a follow-up
patch, and then the adaptation layer code can be removed from
diagnostic-show-locus.c *if* we all agree that the Fortran FE can be
adapted to use the foundation being put in place in this current patch.

And I think all the functionality the Fortran FE needs for that is
supported by this patch.  Please correct me if I am wrong.

> And I'm afraid that once it goes in, it will stay there forever.

:-)

Or maybe not :-)

It looks like we are in motion (with some good energy) to change these things
at the moment, thanks to David and you, amongst others.  So I would
agree to bet on a positive outcome of this, and to let the patch go in.
I guess if the Fortran FE doesn't get updated afterwards, I'll be the
one on the hook; I'd take the bet nevertheless.

> You are in a far better position than the Fortran devs to understand
> how to call your new interfaces to get the output you desire.

That is correct.  And hopefully, letting the current patch go in won't
prevent the further needed changes of the FEs to happen, including the
Fortran one.

[...]

Manuel López-Ibáñez <lopezibanez@gmail.com> writes:

> On 25 September 2015 at 23:24, David Malcolm <dmalcolm@redhat.com> wrote:
>> On Fri, 2015-09-25 at 23:13 +0200, Manuel López-Ibáñez wrote:
>>> +   If SHOW_CARET_P is true, then the range should be rendered with
>>> +   a caret at its starting location.  This
>>> +   is for use by the Fortran frontend, for implementing the
>>> +   "%C" and "%L" format codes.  */
>>> +
>>> +void
>>> +rich_location::set_range (unsigned int idx, source_range src_range,
>>> +              bool show_caret_p, bool overwrite_loc_p)
>>>
>>> I do not understand when this show_caret_p is used by Fortran, given
>>> the diagnostic_show_locus code mentioned earlier.
>
> [...]
>> rich_location::set_range exists to ensure that the %C and %L codes used
>> by Fortran (and "+" in the C family of FEs) can write back into the
>> rich_location instance, faithfully emulating the old code that wrote
>> back to
>> struct text_info's:
>>   location_t locations[MAX_LOCATIONS_PER_MESSAGE];
>
> Why can't Fortran use text->set_location like the other FEs? This way
> you would not need set_range at all.

Yes, it does look to me that this change:

    @@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
            /* If location[0] != UNKNOWN_LOCATION means that we already
               processed one of %C/%L.  */
            int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
    -	text->set_location (loc_num,
    -			    linemap_position_for_loc_and_offset (line_table,
    -								 loc->lb->location,
    -								 offset));
    +	source_range range
    +	  = source_range::from_location (
    +	      linemap_position_for_loc_and_offset (line_table,
    +						   loc->lb->location,
    +						   offset));
    +	text->set_range (loc_num, range, true);

is unnecessary because text_info::set_location() got updated as:

    @@ -40,21 +35,17 @@ struct text_info
       va_list *args_ptr;
       int err_no;  /* for %m */
       void **x_data;
    +  rich_location *m_richloc;

    -  inline void set_location (unsigned int index_of_location, location_t loc)
    +  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
       {
    -    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
    -    this->locations[index_of_location] = loc;
    +    source_range src_range;
    +    src_range.m_start = loc;
    +    src_range.m_finish = loc;
    +    set_range (idx, src_range, caret_p);
       }

So what is needed AIUI is to just add a 'caret_p' argument to the
initial call to:

        text->set_location (loc_num,
    			    linemap_position_for_loc_and_offset (line_table,
    								 loc->lb->location,
    								 offset));
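
A minimal sketch of why that suffices, assuming simplified stand-in types (sketch_text_info keeps its ranges inline here, whereas the patch forwards to the rich_location): set_location only builds a one-point range and hands it to set_range, so a caller that has a single location_t never needs to construct the range itself.

```cpp
#include <cassert>

typedef unsigned int sketch_location_t;   // stand-in for location_t

struct sketch_source_range
{
  sketch_location_t m_start;
  sketch_location_t m_finish;
};

const unsigned int SKETCH_MAX_RANGES = 3;  // arbitrary bound for the sketch

// Hypothetical, simplified stand-in for text_info.
struct sketch_text_info
{
  sketch_source_range ranges[SKETCH_MAX_RANGES];
  bool show_caret[SKETCH_MAX_RANGES];

  void set_range (unsigned int idx, sketch_source_range r, bool caret_p)
  {
    assert (idx < SKETCH_MAX_RANGES);
    ranges[idx] = r;
    show_caret[idx] = caret_p;
  }

  // Thin wrapper: a location is a range whose start and finish coincide.
  void set_location (unsigned int idx, sketch_location_t loc, bool caret_p)
  {
    sketch_source_range r = { loc, loc };
    set_range (idx, r, caret_p);
  }
};
```

In this sketch, text.set_location (loc_num, loc, true) and the longer source_range::from_location spelling leave the same state behind.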

> The other issue that confuses me is that show_caret_p is always true
> when reaching this function via the pretty-printer. Thus, show_caret_p
> is also used by C/C++. In fact, I'm not sure when it can be false.

I think the reason why show_caret_p is always true here is that the
patch updates the existing FE code so that it keeps emitting the *same*
diagnostics as before.  And today, FEs emit diagnostics with ranges
that always start with a caret.

But I think it's easy to imagine hypothetical (and reasonably probable)
examples where we'd have only one range with a caret, associated with other
ranges with no caret.  The ranges with no carets would thus require the
show_caret_p argument to be false.  In fact, patch 2/5 has one of
these examples in a comment in line-map.h:

    +   Example C
    +   *********
    +      a = (foo && bar)
    +           ~~~ ^~ ~~~
    +   This rich location has three ranges:
    +   - Range 0 has its caret and start location at the first "&" and
    +     end at the second "&.
    +   - Range 1 has its start and finish at the "f" and "o" of "foo";
    +     the caret is not flagged for display, but is perhaps at the "f"
    +     of "foo".
    +   - Similarly, range 2 has its start and finish at the "b" and "r" of
    +     "bar"; the caret is not flagged for display, but is perhaps at the
    +     "b" of "bar".

We'll then start seeing cases with show_caret_p being false when *new*
diagnostics are added to FEs to leverage this new feature.

So I don't think this "show_caret_p" argument's value is an issue, quite
the contrary; it's an interesting feature.
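
To illustrate with Example C, here is a minimal sketch of an annotation-line builder in which only ranges flagged with show_caret_p get a '^'. The types and the function are hypothetical, columns are 1-based, and the real printer in diagnostic-show-locus.c does considerably more (colorization, multiple lines, trailing whitespace).

```cpp
#include <cassert>
#include <string>
#include <vector>

struct sketch_range
{
  int start_col;      // 1-based, inclusive
  int finish_col;     // 1-based, inclusive
  int caret_col;      // only consulted when show_caret_p is true
  bool show_caret_p;
};

// Underline every range with '~'; overwrite the caret column with '^'
// only for ranges that ask for a caret.
std::string
sketch_annotation_line (const std::vector<sketch_range> &ranges)
{
  std::string out;
  for (const sketch_range &r : ranges)
    {
      assert (r.start_col >= 1 && r.start_col <= r.finish_col);
      if (static_cast<int> (out.size ()) < r.finish_col)
        out.resize (r.finish_col, ' ');
      for (int col = r.start_col; col <= r.finish_col; col++)
        out[col - 1] = '~';
      if (r.show_caret_p)
        {
          assert (r.caret_col >= r.start_col
                  && r.caret_col <= r.finish_col);
          out[r.caret_col - 1] = '^';
        }
    }
  return out;
}
```

For "a = (foo && bar)", range 0 covers columns 10-11 with its caret at column 10, while ranges 1 and 2 cover "foo" (6-8) and "bar" (13-15) with show_caret_p false, which gives the "~~~ ^~ ~~~" underline of the example.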

Cheers,

-- 
		Dodji


* Re: [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2)
  2015-09-22 21:10 ` [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
@ 2015-09-28  8:23   ` Dodji Seketeli
  0 siblings, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-09-28  8:23 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches, Jeff Law

David Malcolm <dmalcolm@redhat.com> a écrit:

> This patch adds a test plugin that recurses down an expression tree,
> printing diagnostics showing the ranges of each node in the tree.
>
> It corresponds to:
>   [PATCH 15/22] Add plugin to recursively dump the source-ranges in a tree
>     https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00741.html
> from v1 of the patch kit.
>
> Changes in v2:
>   * the output no longer contains the PARAM_DECL and INTEGER_CST
>     leaves since we no longer have range data for them; updated
>     the expected output accordingly.
>   * slightly updated to eliminate use of SOURCE_RANGE
>
> Updated screenshot:
>   https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-show-trees-1.html
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
> 	* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
> 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> 	diagnostic_plugin_show_trees.c and
> 	diagnostic-test-show-trees-1.c.

For what it's worth, this looks good to me.

Thanks!

-- 
		Dodji


* [PATCH] v4 of diagnostic_show_locus and rich_location
  2015-09-27 14:19       ` Dodji Seketeli
@ 2015-10-12 15:45         ` David Malcolm
  2015-10-12 16:37           ` Manuel López-Ibáñez
  0 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-12 15:45 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: gcc-patches, Jason Merrill, Tobias Burnus, Joseph S. Myers,
	Manuel López-Ibáñez

[-- Attachment #1: Type: text/plain, Size: 17635 bytes --]

On Sun, 2015-09-27 at 02:55 +0200, Dodji Seketeli wrote:
> [Note to libcpp, C, and Fortran maintainers: we still need your input :-)]

Updated version of patch attached (v4); a diff relative to v3 can be
seen at:
https://dmalcolm.fedorapeople.org/gcc/2015-10-12/0003-Eliminate-special-casing-for-Fortran.patch

v4 eliminates the lingering parts of the old implementation of
diagnostic_show_locus, porting the Fortran frontend to use the new
implementation.

In the process I discovered an issue with the Fortran frontend: some of
the caret locations appear to have an off-by-one error.
For example, in gcc/testsuite/gfortran.dg/associate_5.f03, the old
implementation would issue this diagnostic:

associate_5.f03:33:6:

       y = 5 ! { dg-error "variable definition context" }
      1
associate_5.f03:32:20:

     ASSOCIATE (y => x) ! { dg-error "variable definition context" }
                    2
Error: Associate-name ‘y’ can not appear in a variable definition
context (assignment) at (1) because its target at (2) can not, either

Note how the carets 1 and 2 appear one column before the "y" and the "x"
that they refer to.

This seems to be a pre-existing bug in the Fortran FE, which I've now
filed as PR fortran/67936.

On porting the Fortran FE to fully use the new implementation of
diagnostic_show_locus, I found that the "1" caret in the above
disappeared, because v3 of the layout printer suppressed carets and
underlines appearing within the leading whitespace before the text in
its line.  So I updated that to only suppress underlines in such a
location, and not carets, to ensure that we at least faithfully print
both carets, at the given (erroneous) locations.   I added test coverage
for this (test_caret_on_leading_whitespace).
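
The rule described above can be sketched in a few lines of standalone C++.
This is an illustration only, not the layout code from the patch:
render_annotation_line is a hypothetical helper, columns are assumed to be
0-based, and only spaces and tabs are treated as leading whitespace.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Build the annotation line for LINE, given one range [START, FINISH]
// and a caret at column CARET drawn as CARET_CHAR.
std::string
render_annotation_line (const std::string &line, int start, int finish,
                        int caret, char caret_char)
{
  const int len = (int) line.size ();
  assert (0 <= start && start <= finish && finish < len);
  assert (0 <= caret && caret < len);

  // Column of the first non-whitespace character, or LEN for a blank line.
  const std::string::size_type pos = line.find_first_not_of (" \t");
  const int first_non_ws = pos == std::string::npos ? len : (int) pos;

  std::string out;
  for (int col = 0; col <= std::max (finish, caret); col++)
    {
      if (col == caret)
        out += caret_char;   // carets are printed wherever they fall
      else if (col >= start && col <= finish && col >= first_non_ws)
        out += '~';          // underlines are only printed under the text
      else
        out += ' ';          // includes underlines in leading whitespace
    }
  // Drop any trailing spaces left by a suppressed underline.
  out.erase (out.find_last_not_of (' ') + 1);
  return out;
}
```

With the line "      y = 5" and a caret '1' one column before the "y"
(start, finish and caret all 5), the "1" is still printed even though it
falls within the leading whitespace.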

The existing Fortran testcases for diagnostics with multiple locations
don't seem to verify the -fdiagnostics-show-caret case; I visually
inspected the results, but perhaps we could add some automated test
coverage there using the dg-{begin|end}-multiline directives from
earlier in this kit (which is now in trunk).  I don't know if adding
such test coverage is necessary for acceptance of this patch though.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.  OK for
trunk?

Some other comments inline.

> Hello,
> 
> David Malcolm <dmalcolm@redhat.com> writes:
> 
> [...]
> 
> > Here's the revised comment I put in the attached patch:
> 
> [...]
> 
> > +   The class caches the lookup of the color codes for the above.
> > +
> > +   The class also has responsibility for tracking which of the above is
> > +   active, filtering out unnecessary changes.  This allows layout::print_line
> > +   to simply request a colorization code for *every* character it prints
> > +   thorough this class, and have the filtering be done for it here.
> 
> You probably meant "*through* this class" ?

Yes, thanks.  Fixed.

> > */
> 
> > Hopefully that comment explains the possible states the colorizer can
> > have.
> 
> Yes it does, great comment, thank you.
> 
> 
> > FWIW I have a follow-up patch to add support for fix-it hints, so they
> > might be another kind of colorization state.
> > (see https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00732.html for the
> > earlier version of said patch, in v1 of the kit).
> 
> > Yeah, I'll comment on that one separately.
> 
> >> Also, I am thinking that there should maybe be a layout::state type,
> >> which would have two notional properties (for now): range_index and
> >> draw_caret_p. So that this function:
> >> 
> >> +bool
> >> +layout::get_state_at_point (/* Inputs.  */
> >> +			    int row, int column,
> >> +			    int first_non_ws, int last_non_ws,
> >> +			    /* Outputs.  */
> >> +			    int *out_range_idx,
> >> +			    bool *out_draw_caret_p)
> >> 
> >> Would take just one output parameter, e.g, a reference to
> >> layout::state.
> >
> > Fixed, though I called it "struct point_state", given that it's coming
> > from get_state_at_point.  I passed it by pointer, since AFAIK our coding
> > standards don't yet approve of the use of references in the codebase
> > (outside of places where we need them e.g. container classes).
> 
> Great.  Thanks.
> 
> >
> > I also added a unit test for a rich_location with two caret locations
> > (mimicking one of the Fortran examples), to give us coverage for this
> > case:
> >
> > +void test_multiple_carets (void)
> > +{
> > +#if 0
> > +   x = x + y /* { dg-warning "8: test" } */
> > +/* { dg-begin-multiline-output "" }
> > +    x = x + y
> > +        A   B
> > +   { dg-end-multiline-output "" } */
> > +#endif
> > +}
> >
> > where the "A" and "B" as caret chars are coming from new code in the
> > show_locus unittest plugin.
> 
> Yeah, saw that.  Excellent, thanks.
> 
> [...]
> 
> >> +  if (0)
> >> +    show_ruler (context, line_width, m_x_offset);
> >> 
> >> This should probably be removed from the final code to be committed.
> >
> > FWIW, the ruler is very helpful to me when debugging the locus-printing
> > (e.g. when adding fix-it-hints), and if we remove that if (0) call, we
> > get:
> >
> > warning: ‘void show_ruler(diagnostic_context*, int, int)’ defined but
> > not used [-Wunused-function]
> >
> > which will break bootstrap, so perhaps it instead should be an option?
> > "-fdiagnostics-show-ruler" or somesuch?
> >
> > I don't know that it would be helpful to end-users though.
> >
> > I'd prefer to just keep it in the code with the
> >   if (0)
> > as-is, since it's useful "scaffolding" for hacking on the code.
> >
> 
> OK, I understand; though, as Manuel noted elsewhere, you might rename
> that function debug_show_ruler and declare it as:
> 
>     DEBUG_FUNCTION static void
>     debug_show_ruler (diagnostic_context *context, int max_width, int x_offset)
>     {
>       /* ...  */
>     }
> to comply with what is generally done in the compiler.

FWIW, I didn't do this.  The function is probably only meaningful to
call from where I was calling it, not necessarily from gdb.

> [...]
> 
> >> +/* Get the column beyond the rightmost one that could contain a caret or
> >> +   range marker, given that we stop rendering at trailing whitespace.  */
> >> +
> >> +int
> >> +layout::get_x_bound_for_row (int row, int caret_column,
> >> +			     int last_non_ws)
> >> 
> >> Please describe what the parameters mean here, especially last_non_ws.
> >> I had to read its code to know that last_non_ws was the *column* of
> >> the last non white space character.
> >
> > I renamed it to "last_non_ws_column", and fleshed out the comment
> 
> OK.
> 
> [...]
> 
> >>  void
> >>  diagnostic_show_locus (diagnostic_context * context,
> >>  		       const diagnostic_info *diagnostic)
> >> @@ -75,16 +710,25 @@ diagnostic_show_locus (diagnostic_context * context,
> >>      return;
> >> 
> >> +      /* The GCC 5 routine. */
> >> 
> >> I'd say the GCC <= 5 routine ;-)
> >
> >> +  else
> >> +    /* The GCC 6 routine.  */
> >> 
> >> And here, the GCC > 5 routine.
> >
> > Changed to "GCC < 6" and "GCC >= 6", on the pedantic grounds that e.g.
> > 5.1 > 5
> 
> OK.
> 
> >
> >> I would be surprised to see this patch in particular incur any
> >> noticeable increase in time and space consumption, but, have you noticed
> >> anything related to that during bootstrap?
> >
> > I hadn't noticed it, but I wasn't timing.  I'll have a look.
> 
> Ok, thanks.
> 
> > One possible nit here is that the patch expands locations when
> > constructing rich_location instances, and it does that for warnings
> > before the logic to ignore them.  So there may be some extra calls there
> > that aren't present in trunk, for discarded warnings.  I don't expect
> > that to affect the speed of the compiler though (I expect it to be lost
> > in the noise).
> 
> Fair enough.
> 
> > OK for trunk if it passes bootstrap/regrtest?
> 
> The core diagnostics bits are IMHO in good shape.  I'd like to see the
> discussion with Manuel be resolved before it goes in, though.  And we
> still need the "Go" from the FE maintainers.
> 
> So I am continuing the discussion with Manuel below.
> 
> Manuel López-Ibáñez <lopezibanez@gmail.com> writes:
> 
> > On 25 September 2015 at 23:15, David Malcolm <dmalcolm@redhat.com> wrote:
> >> My recollection is that I saw that the Fortran frontend has logic for
> >> calling into diagnostic_print_caret_line, noticed that the fortran
> >> testsuite has dg- assertions about finding specific messages, and I got
> >> worried that they embed assumptions about how the old printer worked.
> >> Hence I wanted to avoid touching that for the first version, and so in
> >> this patch it's a hybrid of the old Fortran printing code with the new
> >> representation for multiple locations.
> >
> > It is quite simple, once you understand the logic. Fortran has three
> > types of output:
> >
> > (a) #     [name]:[locus]:
> >     #
> >     #        some code
> >     #              1
> >     #     Error: Some error at (1)
> >
> > which can call the same function used by other FEs to print the caret
> > line (I call the caret line, the line that contains the caret
> > character/ranges, 1 in this case).
> >
> > (b) #     [name]:[locus]:
> >     #
> >     #       some code and some more code
> >     #              1       2
> >     #     Error: Some error at (1) and (2)
> >
> >
> > which according to what you explained should also be possible by
> > calling diagnostic_show_locus with the appropriate location info and
> >
> > (c) #     [name]:[locus]:
> >     #
> >     #       some code
> >     #              1
> >     #     [name]:[locus2]:
> >     #
> >     #       some other code
> >     #         2
> >     #     Error: Some error at (1) and (2)
> >     # or
> >
> > which was implemented by calling diagnostic_show_locus with just the
> > location of 1, then calling diagnostic_print_caret_line with just the
> > expanded_location of 2. I could have just called diagnostic_show_locus
> > also to print 2 by overriding diagnostic->location[0] =
> > diagnostic->location[1] and caret_char[0] = caret_char[1], but that
> > seemed a bit hackish and more expensive (but perhaps less confusing?).
> >
> > If you have a function that you can call with one or more
> > location_t/expanded_location  (or something that can be converted from
> > a location_t) and pass explicitly the caret_char, then you just need
> > to call that function with the right parameters to get the second part
> > of (c).

That's what I've ended up going with.


> > Otherwise, you may simply temporarily do caret_char[0] =
> > caret_char[1], before calling the same function that prints the
> > caret-line for (a).
> >
> >> Maybe that's a cop-out.  Would you prefer that the patch goes all the
> >> way, and that I attempt to eliminate all calls to
> >> diagnostic_print_caret_line from the Fortran FE, and eliminate the old
> >> implementation?  (either now, or as a followup patch?)  I may need
> >> assistance with that; I suspect that some of the dg- assertions in the
> >> Fortran test suite may need updating.
> >
> > There is only one call! I just think this hack is really not necessary
> > (in fact, it seems more complicated than the alternatives outlined
> > above).
> 
> I agree that David's approach of adding an impedance adaptation layer in
> gcc/diagnostic-show-locus.c to keep the Fortran FE unchanged *for now*
> complicates diagnostic-show-locus.c.  But then the benefit is to
> break-up the tasks at hand in smaller (less coupled) tasks, making the
> whole process more manageable.
> 
> IOW, I think the Fortran FE can then be updated later in a follow-up
> patch, and then the adaptation layer code can be removed from
> diagnostic-show-locus.c *if* we all agree that the Fortran FE can be
> adapted to use the foundation being put in place in this current patch.
> 
> And I think all the functionalities the Fortran FE needs for that
> matter, are supported by this patch.  Please correct me if I am wrong.
> 
> > And I'm afraid that once it goes in, it will stay there forever.
> 
> :-)
> 
> Or maybe not :-)
> 
> It looks like we are in motion (with some good energy) to change these things
> at the moment, thanks to David and you, amongst others.  So I would
> agree to bet on a positive outcome of this, and to let the patch go in.
> I guess if the Fortran FE doesn't get updated afterwards, I'll be the
> one on the hook; I'd take the bet nevertheless.
> 
> > You are in a far better position than the Fortran devs to understand
> > how to call your new interfaces to get the output you desire.
> 
> That is correct.  And hopefully, letting the current patch go in won't
> prevent the further needed changes of the FEs to happen, including the
> Fortran one.
> 
> [...]

v4 of the patch does the conversion of Fortran, and eliminates the
adaptation layer.  No partial transitions here!

As noted above, a diff relative to v3 can be seen at:
https://dmalcolm.fedorapeople.org/gcc/2015-10-12/0003-Eliminate-special-casing-for-Fortran.patch


Manu: I hope this addresses your concerns.

> Manuel López-Ibáñez <lopezibanez@gmail.com> writes:
> 
> > On 25 September 2015 at 23:24, David Malcolm <dmalcolm@redhat.com> wrote:
> >> On Fri, 2015-09-25 at 23:13 +0200, Manuel López-Ibáñez wrote:
> >>> +   If SHOW_CARET_P is true, then the range should be rendered with
> >>> +   a caret at its starting location.  This
> >>> +   is for use by the Fortran frontend, for implementing the
> >>> +   "%C" and "%L" format codes.  */
> >>> +
> >>> +void
> >>> +rich_location::set_range (unsigned int idx, source_range src_range,
> >>> +              bool show_caret_p, bool overwrite_loc_p)
> >>>
> >>> I do not understand when is this show_caret_p used by Fortran given
> >>> the diagnostic_show_locus code mentioned earlier.
> >
> > [...]
> >> rich_location::set_range exists to ensure that the %C and %L codes used
> >> by Fortran (and "+" in the C family of FEs) can write back into the
> >> rich_location instance, faithfully emulating the old code that wrote
> >> back to
> >> struct text_info's:
> >>   location_t locations[MAX_LOCATIONS_PER_MESSAGE];
> >
> > Why Fortran cannot use text->set_location like the other FEs? This way
> > you do not need set_range at all.
> 
> Yes, it does look to me that this change:
> 
>     @@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
>             /* If location[0] != UNKNOWN_LOCATION means that we already
>                processed one of %C/%L.  */
>             int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
>     -	text->set_location (loc_num,
>     -			    linemap_position_for_loc_and_offset (line_table,
>     -								 loc->lb->location,
>     -								 offset));
>     +	source_range range
>     +	  = source_range::from_location (
>     +	      linemap_position_for_loc_and_offset (line_table,
>     +						   loc->lb->location,
>     +						   offset));
>     +	text->set_range (loc_num, range, true);
> 
> is unnecessary because text_info::set_location() got updated as:
> 
>     @@ -40,21 +35,17 @@ struct text_info
>        va_list *args_ptr;
>        int err_no;  /* for %m */
>        void **x_data;
>     +  rich_location *m_richloc;
> 
>     -  inline void set_location (unsigned int index_of_location, location_t loc)
>     +  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
>        {
>     -    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
>     -    this->locations[index_of_location] = loc;
>     +    source_range src_range;
>     +    src_range.m_start = loc;
>     +    src_range.m_finish = loc;
>     +    set_range (idx, src_range, caret_p);
>        }
> 
> So what is needed AIUI is to just add a 'caret_p' argument to the
> initial call to:
> 
>         text->set_location (loc_num,
>     			    linemap_position_for_loc_and_offset (line_table,
>     								 loc->lb->location,
>     								 offset));
> 
> > The other issue that confuses me is that show_caret_p is always true
> > when reaching this function via the pretty-printer. Thus, show_caret_p
> > is also used by C/C++. In fact, I'm not sure when it can be false.
> 
> I think the reason why show_caret_p is always true here is that the
> patch updates the existing FE code so that it keeps emitting the *same*
> diagnostics as before.  And today, FEs emit diagnostics with ranges
> that always start with a caret.
> 
> But I think it's easy to imagine hypothetical (and reasonably probable)
> examples where we'd have only one range with a caret, associated with other
> ranges with no caret.  The ranges with no carets would thus require the
> show_caret_p argument to be false.  In fact, patch 2/5 has one of
> these examples in a comment in line-map.h:
> 
>     +   Example C
>     +   *********
>     +      a = (foo && bar)
>     +           ~~~ ^~ ~~~
>     +   This rich location has three ranges:
>     +   - Range 0 has its caret and start location at the first "&" and
>     +     end at the second "&.
>     +   - Range 1 has its start and finish at the "f" and "o" of "foo";
>     +     the caret is not flagged for display, but is perhaps at the "f"
>     +     of "foo".
>     +   - Similarly, range 2 has its start and finish at the "b" and "r" of
>     +     "bar"; the caret is not flagged for display, but is perhaps at the
>     +     "b" of "bar".
> 
> We'll then start seeing cases with show_caret_p being false when *new*
> diagnostics are added to FEs to leverage this new feature.
> 
> So I don't think this "show_caret_p" argument's value is an issue, quite
> the contrary; it's an interesting feature.
> 
> Cheers,

Thanks for the review and other comments.
Dave

[-- Attachment #2: 0001-Reimplement-diagnostic_show_locus-introducing-rich_l.patch --]
[-- Type: text/x-patch, Size: 114830 bytes --]

From 9ab85c1602bf91b5e24308f02796d963b3f32fc4 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Mon, 31 Aug 2015 21:32:20 -0400
Subject: [PATCH] Reimplement diagnostic_show_locus, introducing rich_location
 classes (v4)

gcc/ChangeLog:
	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
	and "range2".
	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
	* diagnostic-core.h (warning_at_rich_loc): New declaration.
	(error_at_rich_loc): New declaration.
	(permerror_at_rich_loc): New declaration.
	(inform_at_rich_loc): New declaration.
	* diagnostic-show-locus.c (adjust_line): Delete.
	(struct point_state): New struct.
	(class colorizer): New class.
	(class layout_point): New class.
	(class layout_range): New class.
	(class layout): New class.
	(colorizer::colorizer): New ctor.
	(colorizer::~colorizer): New dtor.
	(layout::layout): New ctor.
	(layout::print_line): New method.
	(layout::get_state_at_point): New method.
	(layout::get_x_bound_for_row): New method.
	(show_ruler): New function.
	(diagnostic_show_locus): Reimplement in terms of class layout.
	* diagnostic.c (diagnostic_initialize): Replace
	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
	(diagnostic_set_info_translated): Convert param from location_t
	to rich_location *.  Eliminate calls to set_location on the
	message in favor of storing the rich_location ptr there.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_build_prefix): Break out array into...
	(diagnostic_kind_color): New variable.
	(diagnostic_get_color_for_kind): New function.
	(diagnostic_report_diagnostic): Colorize the option_text
	using the color for the severity.
	(diagnostic_append_note): Update for change in signature of
	diagnostic_set_info.
	(diagnostic_append_note_at_rich_loc): New function.
	(emit_diagnostic): Update for change in signature of
	diagnostic_set_info.
	(inform): Likewise.
	(inform_at_rich_loc): New function.
	(inform_n): Update for change in signature of diagnostic_set_info.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_at_rich_loc): New function.
	(warning_n): Update for change in signature of diagnostic_set_info.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(permerror_at_rich_loc): New function.
	(error): Update for change in signature of diagnostic_set_info.
	(error_n): Likewise.
	(error_at): Likewise.
	(error_at_rich_loc): New function.
	(sorry): Update for change in signature of diagnostic_set_info.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(source_range::debug): New function.
	* diagnostic.h (struct diagnostic_info): Eliminate field
	"override_column".  Add field "richloc".
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_set_info_translated): Likewise.
	(diagnostic_append_note_at_rich_loc): New function.
	(diagnostic_num_locations): New function.
	(diagnostic_expand_location): Get the location from the
	rich_location.
	(diagnostic_print_caret_line): Delete.
	(diagnostic_get_color_for_kind): New declaration.
	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
	(error_cb): Update for change in signature of "error" callback.
	(fatal_at): Likewise.
	(warning_at): Likewise.
	* input.c (linemap_client_expand_location_to_spelling_point): New.
	* pretty-print.c (text_info::set_range): New method.
	(text_info::get_location): New method.
	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
	(struct text_info): Eliminate "locations" array in favor of
	"m_richloc", a rich_location *.
	(textinfo::set_location): Add a "caret_p" param, and reimplement
	in terms of a call to set_range.
	(textinfo::get_location): Eliminate inline implementation in favor of
	an out-of-line reimplementation.
	(textinfo::set_range): New method.
	* rtl-error.c (diagnostic_for_asm): Update for change in signature
	of diagnostic_set_info.
	* tree-diagnostic.c (default_tree_printer): Update for new
	"caret_p" param for textinfo::set_location.
	* tree-pretty-print.c (percent_K_format): Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter and
	the call to diagnostic_override_column.
	Update the "done_lexing" clause to set range 0
	on the rich_location, rather than overwriting a location_t.
	* c-common.h (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Update for change in signature
	of diagnostic_set_info.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.
	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
	for textinfo::set_location.

gcc/cp/ChangeLog:
	* error.c (cp_printer): Update for new "caret_p" param for
	textinfo::set_location.
	(pedwarn_cxx98): Update for change in signature of
	diagnostic_set_info.

gcc/fortran/ChangeLog:
	* cpp.c (cb_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.
	* error.c (gfc_warning): Update for change in signature of
	diagnostic_set_info.
	(gfc_format_decoder): Update handling of %C/%L for changes
	to struct text_info.
	(gfc_diagnostic_starter): Use richloc when determining whether to
	print one locus or two.  When handling a location that will
	involve a call to diagnostic_show_locus, only attempt to print the
	locus for the primary location, and don't call into
	diagnostic_print_caret_line.
	(gfc_warning_now_at): Update for change in signature of
	diagnostic_set_info.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
	* lib/gcc-dg.exp: Load multiline.exp.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Update for change in signature
	of "error" callback.
	(cpp_diagnostic_with_line): Likewise, calling override_column
	on the rich_location.
	* include/cpplib.h (struct cpp_callbacks): Within "error"
	callback, convert param from source_location to rich_location *,
	and drop column_override param.
	* include/line-map.h (struct source_range): New struct.
	(struct location_range): New struct.
	(class rich_location): New class.
	(linemap_client_expand_location_to_spelling_point): New declaration.
	* line-map.c (rich_location::rich_location): New ctors.
	(rich_location::lazily_expand_location): New method.
	(rich_location::override_column): New method.
	(rich_location::add_range): New methods.
	(rich_location::set_range): New method.
---
 gcc/c-family/c-common.c                            |  15 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/cp/error.c                                     |   5 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic-show-locus.c                        | 761 ++++++++++++++++++---
 gcc/diagnostic.c                                   | 196 +++++-
 gcc/diagnostic.h                                   |  46 +-
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                | 103 +--
 gcc/genmatch.c                                     |  27 +-
 gcc/input.c                                        |   7 +
 gcc/pretty-print.c                                 |  21 +
 gcc/pretty-print.h                                 |  25 +-
 gcc/rtl-error.c                                    |   3 +-
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 149 ++++
 .../plugin/diagnostic-test-show-locus-color.c      | 158 +++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 321 +++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   3 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-pretty-print.c                            |   2 +-
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   4 +-
 libcpp/include/line-map.h                          | 207 ++++++
 libcpp/line-map.c                                  | 130 ++++
 29 files changed, 1952 insertions(+), 288 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b64a44..4a5ccb7 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10477,15 +10477,14 @@ c_option_controlling_cpp_error (int reason)
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION unless this is after lexing and the compiler's location
-   should be used instead, with column number possibly overridden by
-   COLUMN_OVERRIDE if not zero; MSG is the translated message and AP
+   RICHLOC unless this is after lexing and the compiler's location
+   should be used instead; MSG is the translated message and AP
    the arguments.  Returns true if a diagnostic was emitted, false
    otherwise.  */
 
 bool
 c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	     location_t location, unsigned int column_override,
+	     rich_location *richloc,
 	     const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -10526,11 +10525,11 @@ c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   if (done_lexing)
-    location = input_location;
+    richloc->set_range (0,
+			source_range::from_location (input_location),
+			true, true);
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   diagnostic_override_option_index (&diagnostic,
                                     c_option_controlling_cpp_error (reason));
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index d5fb499..b0a7661 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -995,9 +995,9 @@ extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);
 extern void c_cpp_builtins_optimize_pragma (cpp_reader *, tree, tree);
-extern bool c_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+extern bool c_cpp_error (cpp_reader *, int, int, rich_location *,
 			 const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 extern int c_common_has_attribute (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index ce8406a..732080a 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5297,9 +5297,10 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
                        flag_isoc99 ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   report_diagnostic (&diagnostic);
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index e5fbf05..0f8b933 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,13 +42,14 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
      than -pedantic.  */
   if (warn_c99_c11_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc11)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc99_c11_compat;
@@ -60,7 +61,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C11, issue a pedwarn.  */
   else if (pedantic && !flag_isoc11)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       warned = report_diagnostic (&diagnostic);
     }
@@ -80,6 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
@@ -90,7 +92,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
         goto out;
       else if (opt_var > 0)
 	{
-	  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+	  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			       (pedantic && !flag_isoc99)
 			       ? DK_PEDWARN : DK_WARNING);
 	  diagnostic.option_index = opt;
@@ -102,7 +104,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
      specific than -pedantic.  */
   if (warn_c90_c99_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc99)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc90_c99_compat;
@@ -114,7 +116,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C99, issue a pedwarn.  */
   else if (pedantic && !flag_isoc99)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       report_diagnostic (&diagnostic);
     }
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 47fd7de..1e601f9 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -101,7 +101,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     {
       t = va_arg (*text->args_ptr, tree);
       if (set_locus)
-	text->set_location (0, DECL_SOURCE_LOCATION (t));
+	text->set_location (0, DECL_SOURCE_LOCATION (t), true);
     }
 
   switch (*spec)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 17870b5..2e2ff10 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3562,7 +3562,7 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 
   pp_string (pp, result);
   if (set_locus && t != NULL)
-    text->set_location (0, location_of (t));
+    text->set_location (0, location_of (t), true);
   return true;
 #undef next_tree
 #undef next_tcode
@@ -3676,9 +3676,10 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 		       (cxx_dialect == cxx98) ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index 3fe49b2..d848dfc 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -164,7 +164,8 @@ static struct color_cap color_dict[] =
   { "warning", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_MAGENTA),
 	       7, false },
   { "note", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_CYAN), 4, false },
-  { "caret", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "range1", SGR_SEQ (COLOR_FG_GREEN), 6, false },
+  { "range2", SGR_SEQ (COLOR_FG_BLUE), 6, false },
   { "locus", SGR_SEQ (COLOR_BOLD), 5, false },
   { "quote", SGR_SEQ (COLOR_BOLD), 5, false },
   { NULL, NULL, 0, false }
@@ -195,7 +196,7 @@ colorize_stop (bool show_color)
 }
 
 /* Parse GCC_COLORS.  The default would look like:
-   GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
+   GCC_COLORS='error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:quote=01'
    No character escaping is needed or supported.  */
 static bool
 parse_gcc_colors (void)
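For readers unfamiliar with the GCC_COLORS syntax shown in the comment above: capabilities are separated by ':', while ';' only separates SGR parameters inside a single value. The following is a minimal standalone sketch of that splitting rule. It is not GCC's parse_gcc_colors; split_color_caps and check_split_color_caps are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper (not part of GCC): split a GCC_COLORS-style string
// into name -> SGR-value pairs.  Entries are separated by ':'; a ';' is
// left untouched because it belongs to the SGR value itself.
std::map<std::string, std::string>
split_color_caps (const std::string &spec)
{
  std::map<std::string, std::string> caps;
  std::string::size_type pos = 0;
  while (pos <= spec.size ())
    {
      std::string::size_type end = spec.find (':', pos);
      if (end == std::string::npos)
        end = spec.size ();
      std::string entry = spec.substr (pos, end - pos);
      std::string::size_type eq = entry.find ('=');
      if (eq != std::string::npos)
        caps[entry.substr (0, eq)] = entry.substr (eq + 1);
      pos = end + 1;
    }
  return caps;
}

// Usage sketch: the two new capabilities parse as separate entries.
void
check_split_color_caps ()
{
  std::map<std::string, std::string> caps
    = split_color_caps ("error=01;31:range1=32:range2=34:locus=01");
  assert (caps["range1"] == "32");
  assert (caps["range2"] == "34");
  assert (caps["locus"] == "01");
}
```

With this rule, a string such as "range2=34;locus=01" would yield a single "range2" entry whose value is "34;locus=01", so the separator between capabilities matters.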
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 66d2e42..a8a7c37 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -63,18 +63,26 @@ extern bool warning_n (location_t, int, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
+extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+    ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void error_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void error_at_rich_loc (rich_location *, const char *, ...)
+  ATTRIBUTE_GCC_DIAG(2,3);
 extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
      ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the second parameter.  */
 extern bool pedwarn (location_t, int, const char *, ...)
      ATTRIBUTE_GCC_DIAG(3,4);
 extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern bool permerror_at_rich_loc (rich_location *, const char *,
+				   ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform_at_rich_loc (rich_location *, const char *,
+				...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 147a2b8..b6c9040 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,131 +36,694 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
-/* If LINE is longer than MAX_WIDTH, and COLUMN is not smaller than
-   MAX_WIDTH by some margin, then adjust the start of the line such
-   that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
-   margin is either CARET_LINE_MARGIN characters or the difference
-   between the column and the length of the line, whatever is smaller.
-   The length of LINE is given by LINE_WIDTH.  */
-static const char *
-adjust_line (const char *line, int line_width,
-	     int max_width, int *column_p)
-{
-  int right_margin = CARET_LINE_MARGIN;
-  int column = *column_p;
-
-  gcc_checking_assert (line_width >= column);
-  right_margin = MIN (line_width - column, right_margin);
-  right_margin = max_width - right_margin;
-  if (line_width >= max_width && column > right_margin)
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset);
+
+/* Classes for rendering source code and diagnostics, within an
+   anonymous namespace.
+   The work is done by "class layout", which embeds and uses
+   "class colorizer" and "class layout_range" to get things done.  */
+
+namespace {
+
+/* The state at a given point of the source code, assuming that we're
+   in a range: which range are we in, and whether we should draw a caret at
+   this point.  */
+
+struct point_state
+{
+  int range_idx;
+  bool draw_caret_p;
+};
+
+/* A class to inject colorization codes when printing the diagnostic locus.
+
+   It has one kind of colorization for each of:
+     - normal text
+     - range 0 (the "primary location")
+     - range 1
+     - range 2
+
+   The class caches the lookup of the color codes for the above.
+
+   The class also has responsibility for tracking which of the above is
+   active, filtering out unnecessary changes.  This allows layout::print_line
+   to simply request a colorization code for *every* character it prints
+   through this class, and have the filtering be done for it here.  */
+
+class colorizer
+{
+ public:
+  colorizer (diagnostic_context *context,
+	     const diagnostic_info *diagnostic);
+  ~colorizer ();
+
+  void set_range (int range_idx) { set_state (range_idx); }
+  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }
+
+ private:
+  void set_state (int state);
+  void begin_state (int state);
+  void finish_state (int state);
+
+ private:
+  static const int STATE_NORMAL_TEXT = -1;
+
+  diagnostic_context *m_context;
+  const diagnostic_info *m_diagnostic;
+  int m_current_state;
+  const char *m_caret_cs;
+  const char *m_caret_ce;
+  const char *m_range1_cs;
+  const char *m_range2_cs;
+  const char *m_range_ce;
+};
+
+/* A point within a layout_range; similar to an expanded_location,
+   but after filtering on file.  */
+
+class layout_point
+{
+ public:
+  layout_point (const expanded_location &exploc)
+  : m_line (exploc.line),
+    m_column (exploc.column) {}
+
+  int m_line;
+  int m_column;
+};
+
+/* A class for use by "class layout" below: a filtered location_range.  */
+
+class layout_range
+{
+ public:
+  layout_range (const location_range *loc_range);
+
+  bool contains_point (int row, int column) const;
+
+  layout_point m_start;
+  layout_point m_finish;
+  bool m_show_caret_p;
+  layout_point m_caret;
+};
+
+/* A class to control the overall layout when printing a diagnostic.
+
+   The layout is determined within the constructor.
+   It is then printed by repeatedly calling the "print_line" method.
+   Each such call can print two lines: one for the source line itself,
+   and potentially an "annotation" line, containing carets/underlines.
+
+   We assume we have disjoint ranges.  */
+
+class layout
+{
+ public:
+  layout (diagnostic_context *context,
+	  const diagnostic_info *diagnostic);
+
+  int get_first_line () const { return m_first_line; }
+  int get_last_line () const { return m_last_line; }
+
+  void print_line (int row);
+
+ private:
+  bool
+  get_state_at_point (/* Inputs.  */
+		      int row, int column,
+		      int first_non_ws, int last_non_ws,
+		      /* Outputs.  */
+		      point_state *out_state);
+
+  int
+  get_x_bound_for_row (int row, int caret_column,
+		       int last_non_ws);
+
+ private:
+  diagnostic_context *m_context;
+  pretty_printer *m_pp;
+  diagnostic_t m_diagnostic_kind;
+  expanded_location m_exploc;
+  colorizer m_colorizer;
+  auto_vec <layout_range> m_layout_ranges;
+  int m_first_line;
+  int m_last_line;
+  int m_x_offset;
+};
+
+/* Implementation of "class colorizer".  */
+
+/* The constructor for "colorizer".  Lookup and store color codes for the
+   different kinds of things we might need to print.  */
+
+colorizer::colorizer (diagnostic_context *context,
+		      const diagnostic_info *diagnostic) :
+  m_context (context),
+  m_diagnostic (diagnostic),
+  m_current_state (STATE_NORMAL_TEXT)
+{
+  m_caret_ce = colorize_stop (pp_show_color (context->printer));
+  m_range1_cs = colorize_start (pp_show_color (context->printer), "range1");
+  m_range2_cs = colorize_start (pp_show_color (context->printer), "range2");
+  m_range_ce = colorize_stop (pp_show_color (context->printer));
+}
+
+/* The destructor for "colorizer".  If colorization is on, print a code to
+   turn it off.  */
+
+colorizer::~colorizer ()
+{
+  finish_state (m_current_state);
+}
+
+/* Update state, printing color codes only when the state actually
+   changes.  */
+
+void
+colorizer::set_state (int new_state)
+{
+  if (m_current_state != new_state)
     {
-      line += column - right_margin;
-      *column_p = right_margin;
+      finish_state (m_current_state);
+      m_current_state = new_state;
+      begin_state (new_state);
     }
-  return line;
 }
 
-/* Print the physical source line corresponding to the location of
-   this diagnostic, and a caret indicating the precise column.  This
-   function only prints two caret characters if the two locations
-   given by DIAGNOSTIC are on the same line according to
-   diagnostic_same_line().  */
+/* Turn on any colorization for STATE.  */
+
 void
-diagnostic_show_locus (diagnostic_context * context,
-		       const diagnostic_info *diagnostic)
+colorizer::begin_state (int state)
 {
-  if (!context->show_caret
-      || diagnostic_location (diagnostic, 0) <= BUILTINS_LOCATION
-      || diagnostic_location (diagnostic, 0) == context->last_location)
-    return;
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
 
-  context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by diagnostic_print_caret_line.  */
+    case 0:
+      /* Make range 0 be the same color as the "kind" text
+	 (error vs warning vs note).  */
+      pp_string
+	(m_context->printer,
+	 colorize_start (pp_show_color (m_context->printer),
+			 diagnostic_get_color_for_kind (m_diagnostic->kind)));
+      break;
+
+    case 1:
+      pp_string (m_context->printer, m_range1_cs);
+      break;
+
+    case 2:
+      pp_string (m_context->printer, m_range2_cs);
+      break;
+
+    default:
+      /* We don't expect more than 3 ranges per diagnostic.  */
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Turn off any colorization for STATE.  */
+
+void
+colorizer::finish_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      pp_string (m_context->printer, m_caret_ce);
+      break;
+
+    default:
+      /* Within a range.  */
+      gcc_assert (state > 0);
+      pp_string (m_context->printer, m_range_ce);
+      break;
+    }
+}
+
+/* Implementation of class layout_range.  */
+
+/* The constructor for class layout_range.
+   Initialize various layout_point fields from expanded_location
+   equivalents; we've already filtered on file.  */
+
+layout_range::layout_range (const location_range *loc_range)
+: m_start (loc_range->m_start),
+  m_finish (loc_range->m_finish),
+  m_show_caret_p (loc_range->m_show_caret_p),
+  m_caret (loc_range->m_caret)
+{
+}
+
+/* Is (ROW, COLUMN) within the given range?
+   We've already filtered on the file.
+
+   Ranges are closed (both limits are within the range).
+
+   Example A: a single-line range:
+     start:  (col=22, line=2)
+     finish: (col=38, line=2)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbSwwwwwwwwwwwwwwwFaaaaaaaaaaa
+03|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+   Example B: a multiline range with
+     start:  (col=14, line=3)
+     finish: (col=08, line=5)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+03|bbbbbbbbbbbSwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+04|wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+05|wwwwwFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+06|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+--+-----------------------------------------------
+
+   Legend:
+   - 'b' indicates a point *before* the range
+   - 'S' indicates the start of the range
+   - 'w' indicates a point within the range
+   - 'F' indicates the finish of the range (which is
+	 within it).
+   - 'a' indicates a subsequent point *after* the range.  */
+
+bool
+layout_range::contains_point (int row, int column) const
+{
+  gcc_assert (m_start.m_line <= m_finish.m_line);
+  /* ...but the equivalent isn't true for the columns;
+     consider example B in the comment above.  */
+
+  if (row < m_start.m_line)
+    /* Points before the first line of the range are
+       outside it (corresponding to line 01 in example A
+       and lines 01 and 02 in example B above).  */
+    return false;
+
+  if (row == m_start.m_line)
+    /* On same line as start of range (corresponding
+       to line 02 in example A and line 03 in example B).  */
+    {
+      if (column < m_start.m_column)
+	/* Points on the starting line of the range, but
+	   before the column in which it begins.  */
+	return false;
+
+      if (row < m_finish.m_line)
+	/* This is a multiline range; the point
+	   is within it (corresponds to line 03 in example B
+	   from column 14 onwards) */
+	return true;
+      else
+	{
+	  /* This is a single-line range.  */
+	  gcc_assert (row == m_finish.m_line);
+	  return column <= m_finish.m_column;
+	}
+    }
+
+  /* The point is in a line beyond that containing the
+     start of the range: lines 03 onwards in example A,
+     and lines 04 onwards in example B.  */
+  gcc_assert (row > m_start.m_line);
+
+  if (row > m_finish.m_line)
+    /* The point is beyond the final line of the range
+       (lines 03 onwards in example A, and lines 06 onwards
+       in example B).  */
+    return false;
 
-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+  if (row < m_finish.m_line)
+    {
+      /* The point is in a line that's fully within a multiline
+	 range (e.g. line 04 in example B).  */
+      gcc_assert (m_start.m_line < m_finish.m_line);
+      return true;
+    }
+
+  gcc_assert (row == m_finish.m_line);
+
+  return column <= m_finish.m_column;
+}
+
+/* Given a source line LINE of length LINE_WIDTH, determine the width
+   without any trailing whitespace.  */
+
+static int
+get_line_width_without_trailing_whitespace (const char *line, int line_width)
+{
+  int result = line_width;
+  while (result > 0)
+    {
+      char ch = line[result - 1];
+      if (ch == ' ' || ch == '\t')
+	result--;
+      else
+	break;
+    }
+  gcc_assert (result >= 0);
+  gcc_assert (result <= line_width);
+  gcc_assert (result == 0
+	      || (line[result - 1] != ' '
+		  && line[result - 1] != '\t'));
+  return result;
+}
+
+/* Implementation of class layout.  */
+
+/* Constructor for class layout.
+
+   Filter the ranges from the rich_location to those that we can
+   sanely print, populating m_layout_ranges.
+   Determine the range of lines that we will print.
+   Determine m_x_offset, to ensure that the primary caret
+   will fit within the max_width provided by the diagnostic_context.  */
+
+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)
+: m_context (context),
+  m_pp (context->printer),
+  m_diagnostic_kind (diagnostic->kind),
+  m_exploc (diagnostic->richloc->lazily_expand_location ()),
+  m_colorizer (context, diagnostic),
+  m_layout_ranges (rich_location::MAX_RANGES),
+  m_first_line (m_exploc.line),
+  m_last_line  (m_exploc.line),
+  m_x_offset (0)
+{
+  rich_location *richloc = diagnostic->richloc;
+  for (unsigned int idx = 0; idx < richloc->get_num_locations (); idx++)
+    {
+      /* This diagnostic printer can only cope with "sufficiently sane" ranges.
+	 Ignore any ranges that are awkward to handle.  */
+      location_range *loc_range = richloc->get_range (idx);
+
+      /* If any part of the range isn't in the same file as the primary
+	 location of this diagnostic, ignore the range.  */
+      if (loc_range->m_start.file != m_exploc.file)
+	continue;
+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_caret.file != m_exploc.file)
+	  continue;
+
+      /* Passed all the tests; add the range to m_layout_ranges so that
+	 it will be printed.  */
+      layout_range ri (loc_range);
+      m_layout_ranges.safe_push (ri);
+
+      /* Update m_first_line/m_last_line if necessary.  */
+      if (loc_range->m_start.line < m_first_line)
+	m_first_line = loc_range->m_start.line;
+      if (loc_range->m_finish.line > m_last_line)
+	m_last_line = loc_range->m_finish.line;
+    }
+
+  /* Adjust m_x_offset.
+     Center the primary caret to fit in max_width; all columns
+     will be adjusted accordingly.  */
+  int max_width = m_context->caret_max_width;
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, m_exploc.line,
+					       &line_width);
+  if (line && m_exploc.column <= line_width)
+    {
+      int right_margin = CARET_LINE_MARGIN;
+      int column = m_exploc.column;
+      right_margin = MIN (line_width - column, right_margin);
+      right_margin = max_width - right_margin;
+      if (line_width >= max_width && column > right_margin)
+	m_x_offset = column - right_margin;
+      gcc_assert (m_x_offset >= 0);
+    }
 
-  diagnostic_print_caret_line (context, s0, s1,
-			       context->caret_chars[0],
-			       context->caret_chars[1]);
+  if (0)
+    show_ruler (context, line_width, m_x_offset);
 }
 
-/* Print (part) of the source line given by xloc1 with caret1 pointing
-   at the column.  If xloc2.column != 0 and it fits within the same
-   line as xloc1 according to diagnostic_same_line (), then caret2 is
-   printed at xloc2.colum.  Otherwise, the caller has to set up things
-   to print a second caret line for xloc2.  */
+/* Print text describing a line of source code.
+   This typically prints two lines:
+
+   (1) the source code itself, colorized at any ranges, and
+   (2) an annotation line containing any carets/underlines
+   describing the ranges.  */
+
 void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2)
-{
-  if (!diagnostic_same_line (context, xloc1, xloc2))
-    /* This will mean ignore xloc2.  */
-    xloc2.column = 0;
-  else if (xloc1.column == xloc2.column)
-    xloc2.column++;
-
-  int cmax = MAX (xloc1.column, xloc2.column);
+layout::print_line (int row)
+{
   int line_width;
-  const char *line = location_get_source_line (xloc1.file, xloc1.line,
+  const char *line = location_get_source_line (m_exploc.file, row,
 					       &line_width);
-  if (line == NULL || cmax > line_width)
+  if (!line)
     return;
 
-  /* Center the interesting part of the source line to fit in
-     max_width, and adjust all columns accordingly.  */
-  int max_width = context->caret_max_width;
-  int offset = (int) cmax;
-  line = adjust_line (line, line_width, max_width, &offset);
-  offset -= cmax;
-  cmax += offset;
-  xloc1.column += offset;
-  if (xloc2.column)
-    xloc2.column += offset;
-
-  /* Print the source line.  */
-  pp_newline (context->printer);
-  const char *saved_prefix = pp_get_prefix (context->printer);
-  pp_set_prefix (context->printer, NULL);
-  pp_space (context->printer);
-  while (max_width > 0 && line_width > 0)
+  line += m_x_offset;
+
+  m_colorizer.set_normal_text ();
+
+  /* Step 1: print the source code line.  */
+
+  /* We will stop printing at any trailing whitespace.  */
+  line_width
+    = get_line_width_without_trailing_whitespace (line,
+						  line_width);
+  pp_space (m_pp);
+  int first_non_ws = INT_MAX;
+  int last_non_ws = 0;
+  int column;
+  for (column = 1 + m_x_offset; column <= line_width; column++)
     {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       0, INT_MAX,
+				       &state);
+      if (in_range_p)
+	m_colorizer.set_range (state.range_idx);
+      else
+	m_colorizer.set_normal_text ();
       char c = *line == '\t' ? ' ' : *line;
       if (c == '\0')
 	c = ' ';
-      pp_character (context->printer, c);
-      max_width--;
-      line_width--;
+      if (c != ' ')
+	{
+	  last_non_ws = column;
+	  if (first_non_ws == INT_MAX)
+	    first_non_ws = column;
+	}
+      pp_character (m_pp, c);
       line++;
     }
-  pp_newline (context->printer);
+  pp_newline (m_pp);
+
+  /* Step 2: print a line consisting of the caret/underlines for the
+     given source line.  */
+  int x_bound = get_x_bound_for_row (row, m_exploc.column,
+				     last_non_ws);
 
-  /* Print the caret under the line.  */
-  const char *caret_cs, *caret_ce;
-  caret_cs = colorize_start (pp_show_color (context->printer), "caret");
-  caret_ce = colorize_stop (pp_show_color (context->printer));
-  int cmin = xloc2.column
-    ? MIN (xloc1.column, xloc2.column) : xloc1.column;
-  int caret_min = cmin == xloc1.column ? caret1 : caret2;
-  int caret_max = cmin == xloc1.column ? caret2 : caret1;
-
-  /* cmin is >= 1, but we indent with an extra space at the start like
-     we did above.  */
+  pp_space (m_pp);
+  for (int column = 1 + m_x_offset; column < x_bound; column++)
+    {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       first_non_ws, last_non_ws,
+				       &state);
+      if (in_range_p)
+	{
+	  /* Within a range.  Draw either the caret or an underline.  */
+	  m_colorizer.set_range (state.range_idx);
+	  if (state.draw_caret_p)
+	    /* Draw the caret.  */
+	    pp_character (m_pp, m_context->caret_chars[state.range_idx]);
+	  else
+	    pp_character (m_pp, '~');
+	}
+      else
+	{
+	  /* Not in a range.  */
+	  m_colorizer.set_normal_text ();
+	  pp_character (m_pp, ' ');
+	}
+    }
+  pp_newline (m_pp);
+}
+
+/* Return true if (ROW/COLUMN) is within a range of the layout.
+   If it returns true, OUT_STATE is written to, with the
+   range index, and whether we should draw the caret at
+   (ROW/COLUMN) (as opposed to an underline).  */
+
+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    point_state *out_state)
+{
+  layout_range *range;
   int i;
-  for (i = 0; i < cmin; i++)
-    pp_space (context->printer);
-  pp_printf (context->printer, "%s%c%s", caret_cs, caret_min, caret_ce);
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (0)
+	fprintf (stderr,
+		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
+		 range->m_start.m_line,
+		 range->m_start.m_column,
+		 range->m_finish.m_line,
+		 range->m_finish.m_column,
+		 row,
+		 column,
+		 range->contains_point (row, column) ? "true" : "false");
+
+      if (range->contains_point (row, column))
+	{
+	  out_state->range_idx = i;
 
-  if (xloc2.column)
+	  /* Are we at the range's caret?  Is it visible?  */
+	  out_state->draw_caret_p = false;
+	  if (row == range->m_caret.m_line
+	      && column == range->m_caret.m_column)
+	    out_state->draw_caret_p = range->m_show_caret_p;
+
+	  /* Within a multiline range, don't display any underline
+	     in any leading or trailing whitespace on a line.
+	     We do display carets, however.  */
+	  if (!out_state->draw_caret_p)
+	    if (column < first_non_ws || column > last_non_ws)
+	      return false;
+
+	  /* We are within a range.  */
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Helper function for use by layout::print_line when printing the
+   annotation line under the source line.
+   Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.
+   ROW is the source line within the given file.
+   CARET_COLUMN is the column of range 0's caret.
+   LAST_NON_WS_COLUMN is the last column containing a non-whitespace
+   character of source (as determined when printing the source line).  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws_column)
+{
+  int result = caret_column + 1;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
     {
-      for (i++; i < cmax; i++)
-	pp_space (context->printer);
-      pp_printf (context->printer, "%s%c%s", caret_cs, caret_max, caret_ce);
+      if (row >= range->m_start.m_line)
+	{
+	  if (range->m_finish.m_line == row)
+	    {
+	      /* On the final line within a range; ensure that
+		 we render up to the end of the range.  */
+	      if (result <= range->m_finish.m_column)
+		result = range->m_finish.m_column + 1;
+	    }
+	  else if (row < range->m_finish.m_line)
+	    {
+	      /* Within a multiline range; ensure that we render up to the
+		 last non-whitespace column.  */
+	      if (result <= last_non_ws_column)
+		result = last_non_ws_column + 1;
+	    }
+	}
     }
+
+  return result;
+}
+
+} /* End of anonymous namespace.  */
+
+/* For debugging layout issues in diagnostic_show_locus and friends,
+   render a ruler giving column numbers (after the 1-column indent).  */
+
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset)
+{
+  /* Hundreds.  */
+  if (max_width > 99)
+    {
+      pp_space (context->printer);
+      for (int column = 1 + x_offset; column < max_width; column++)
+	if (0 == column % 10)
+	  pp_character (context->printer, '0' + (column / 100) % 10);
+	else
+	  pp_space (context->printer);
+      pp_newline (context->printer);
+    }
+
+  /* Tens.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    if (0 == column % 10)
+      pp_character (context->printer, '0' + (column / 10) % 10);
+    else
+      pp_space (context->printer);
+  pp_newline (context->printer);
+
+  /* Units.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    pp_character (context->printer, '0' + (column % 10));
+  pp_newline (context->printer);
+}
+
+/* Print the physical source code corresponding to the location of
+   this diagnostic, with additional annotations.  */
+
+void
+diagnostic_show_locus (diagnostic_context * context,
+		       const diagnostic_info *diagnostic)
+{
+  if (!context->show_caret
+      || diagnostic_location (diagnostic, 0) <= BUILTINS_LOCATION
+      || diagnostic_location (diagnostic, 0) == context->last_location)
+    return;
+
+  context->last_location = diagnostic_location (diagnostic, 0);
+
+  pp_newline (context->printer);
+
+  const char *saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer, NULL);
+
+  {
+    layout layout (context, diagnostic);
+    int last_line = layout.get_last_line ();
+    for (int row = layout.get_first_line ();
+	 row <= last_line;
+	 row++)
+      layout.print_line (row);
+
+    /* The closing scope here leads to the dtor for layout and thus
+       colorizer being called here, which affects the precise
+       place where colorization is turned off in the unittest
+       for colorized output.  */
+  }
+
   pp_set_prefix (context->printer, saved_prefix);
-  pp_needs_newline (context->printer) = true;
 }
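The closed-range test in layout_range::contains_point above can be exercised outside GCC. The following is a minimal standalone sketch, assuming plain structs in place of layout_point and layout_range; the names point, closed_range and check_examples are hypothetical and not part of the patch. It condenses the case analysis into three early exits and checks it against examples A and B from the comment.

```cpp
#include <cassert>

// Hypothetical stand-ins for the patch's layout_point and layout_range.
struct point
{
  int line;
  int column;
};

struct closed_range
{
  point start;
  point finish;

  // Both START and FINISH are inside the range.  Columns are only
  // compared on the first and last lines, so FINISH may have a smaller
  // column than START in a multiline range.
  bool contains_point (int row, int column) const
  {
    if (row < start.line || row > finish.line)
      return false;
    if (row == start.line && column < start.column)
      return false;
    if (row == finish.line && column > finish.column)
      return false;
    return true;
  }
};

// Usage sketch, mirroring the diagrams in the comment.
void
check_examples ()
{
  // Example A: single-line range, (col=22, line=2) to (col=38, line=2).
  closed_range a = { { 2, 22 }, { 2, 38 } };
  assert (!a.contains_point (2, 21));
  assert (a.contains_point (2, 22));
  assert (a.contains_point (2, 38));
  assert (!a.contains_point (2, 39));

  // Example B: multiline range, (col=14, line=3) to (col=08, line=5).
  closed_range b = { { 3, 14 }, { 5, 8 } };
  assert (!b.contains_point (3, 13));
  assert (b.contains_point (3, 47));
  assert (b.contains_point (4, 3));
  assert (b.contains_point (5, 8));
  assert (!b.contains_point (5, 9));
}
```

The sketch should accept and reject the same points as the longer form in the patch, which spells out each case separately and asserts its invariants along the way.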
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 831859a..5fe6627 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -144,7 +144,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
     context->classify_diagnostic[i] = DK_UNSPECIFIED;
   context->show_caret = false;
   diagnostic_set_caret_max_width (context, pp_line_cutoff (context->printer));
-  for (i = 0; i < MAX_LOCATIONS_PER_MESSAGE; i++)
+  for (i = 0; i < rich_location::MAX_RANGES; i++)
     context->caret_chars[i] = '^';
   context->show_option_requested = false;
   context->abort_on_error = false;
@@ -234,16 +234,15 @@ diagnostic_finish (diagnostic_context *context)
    translated.  */
 void
 diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
-				va_list *args, location_t location,
+				va_list *args, rich_location *richloc,
 				diagnostic_t kind)
 {
+  gcc_assert (richloc);
   diagnostic->message.err_no = errno;
   diagnostic->message.args_ptr = args;
   diagnostic->message.format_spec = msg;
-  diagnostic->message.set_location (0, location);
-  for (int i = 1; i < MAX_LOCATIONS_PER_MESSAGE; i++)
-    diagnostic->message.set_location (i, UNKNOWN_LOCATION);
-  diagnostic->override_column = 0;
+  diagnostic->message.m_richloc = richloc;
+  diagnostic->richloc = richloc;
   diagnostic->kind = kind;
   diagnostic->option_index = 0;
 }
@@ -252,10 +251,27 @@ diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
    translated.  */
 void
 diagnostic_set_info (diagnostic_info *diagnostic, const char *gmsgid,
-		     va_list *args, location_t location,
+		     va_list *args, rich_location *richloc,
 		     diagnostic_t kind)
 {
-  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, location, kind);
+  gcc_assert (richloc);
+  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, richloc, kind);
+}
+
+static const char *const diagnostic_kind_color[] = {
+#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
+#include "diagnostic.def"
+#undef DEFINE_DIAGNOSTIC_KIND
+  NULL
+};
+
+/* Get a color name for diagnostics of type KIND.
+   The result could be NULL.  */
+
+const char *
+diagnostic_get_color_for_kind (diagnostic_t kind)
+{
+  return diagnostic_kind_color[kind];
 }
 
 /* Return a malloc'd string describing a location.  The caller is
@@ -270,12 +286,6 @@ diagnostic_build_prefix (diagnostic_context *context,
 #undef DEFINE_DIAGNOSTIC_KIND
     "must-not-happen"
   };
-  static const char *const diagnostic_kind_color[] = {
-#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
-#include "diagnostic.def"
-#undef DEFINE_DIAGNOSTIC_KIND
-    NULL
-  };
   gcc_assert (diagnostic->kind < DK_LAST_DIAGNOSTIC_KIND);
 
   const char *text = _(diagnostic_kind_text[diagnostic->kind]);
@@ -771,10 +781,14 @@ diagnostic_report_diagnostic (diagnostic_context *context,
 
       if (option_text)
 	{
+	  const char *cs
+	    = colorize_start (pp_show_color (context->printer),
+			      diagnostic_kind_color[diagnostic->kind]);
+	  const char *ce = colorize_stop (pp_show_color (context->printer));
 	  diagnostic->message.format_spec
 	    = ACONCAT ((diagnostic->message.format_spec,
 			" ", 
-			"[", option_text, "]",
+			"[", cs, option_text, ce, "]",
 			NULL));
 	  free (option_text);
 	}
@@ -854,9 +868,40 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  if (context->inhibit_notes_p)
+    {
+      va_end (ap);
+      return;
+    }
+  saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer,
+                 diagnostic_build_prefix (context, &diagnostic));
+  pp_newline (context->printer);
+  pp_format (context->printer, &diagnostic.message);
+  pp_output_formatted_text (context->printer);
+  pp_destroy_prefix (context->printer);
+  pp_set_prefix (context->printer, saved_prefix);
+  diagnostic_show_locus (context, &diagnostic);
+  va_end (ap);
+}
+
+/* Same as diagnostic_append_note, but at RICHLOC.  */
+
+void
+diagnostic_append_note_at_rich_loc (diagnostic_context *context,
+				    rich_location *richloc,
+				    const char * gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  const char *saved_prefix;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   if (context->inhibit_notes_p)
     {
       va_end (ap);
@@ -881,16 +926,17 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   permissive_error_kind (global_dc));
       diagnostic.option_index = permissive_error_option (global_dc);
     }
   else {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, kind);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, kind);
       if (kind == DK_WARNING || kind == DK_PEDWARN)
 	diagnostic.option_index = opt;
   }
@@ -907,9 +953,23 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as "inform", but at RICHLOC.  */
+void
+inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -922,11 +982,12 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_NOTE);
+                                  &ap, &richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -940,9 +1001,10 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
   diagnostic.option_index = opt;
 
   ret = report_diagnostic (&diagnostic);
@@ -960,9 +1022,27 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as warning_at, but using RICHLOC.  */
+
+bool
+warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -980,11 +1060,13 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_WARNING);
+                                  &ap, &richloc,
+                                  DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1010,9 +1092,10 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,  DK_PEDWARN);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1032,9 +1115,28 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
+                       permissive_error_kind (global_dc));
+  diagnostic.option_index = permissive_error_option (global_dc);
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "permerror", but at RICHLOC.  */
+
+bool
+permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc,
                        permissive_error_kind (global_dc));
   diagnostic.option_index = permissive_error_option (global_dc);
   ret = report_diagnostic (&diagnostic);
@@ -1049,9 +1151,10 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1064,11 +1167,12 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_ERROR);
+                                  &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1079,9 +1183,25 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as above, but use RICH_LOC.  */
+
+void
+error_at_rich_loc (rich_location *rich_loc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, rich_loc,
+		       DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1094,9 +1214,10 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_SORRY);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1117,9 +1238,10 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1135,9 +1257,10 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1152,9 +1275,10 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE_NOBT);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1218,3 +1342,11 @@ real_abort (void)
 {
   abort ();
 }
+
+void
+source_range::debug (const char *msg) const
+{
+  rich_location richloc (m_start);
+  richloc.add_range (m_start, m_finish);
+  inform_at_rich_loc (&richloc, "%s", msg);
+}
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 7fcb6a8..d4ebf86 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
    list in diagnostic.def.  */
 struct diagnostic_info
 {
-  /* Text to be formatted. It also contains the location(s) for this
-     diagnostic.  */
+  /* Text to be formatted.  */
   text_info message;
-  unsigned int override_column;
+
+  /* The location at which the diagnostic is to be reported.  */
+  rich_location *richloc;
+
   /* Auxiliary data for client.  */
   void *x_data;
   /* The kind of diagnostic it is about.  */
@@ -102,8 +104,8 @@ struct diagnostic_context
   /* Maximum width of the source line printed.  */
   int caret_max_width;
 
-  /* Characters used for caret diagnostics.  */
-  char caret_chars[MAX_LOCATIONS_PER_MESSAGE];
+  /* Characters used for caret diagnostics.  */
+  char caret_chars[rich_location::MAX_RANGES];
 
   /* True if we should print the command line option which controls
      each diagnostic, if known.  */
@@ -252,10 +254,6 @@ extern diagnostic_context *global_dc;
 
 #define report_diagnostic(D) diagnostic_report_diagnostic (global_dc, D)
 
-/* Override the column number to be used for reporting a
-   diagnostic.  */
-#define diagnostic_override_column(DI, COL) (DI)->override_column = (COL)
-
 /* Override the option index to be used for reporting a
    diagnostic.  */
 #define diagnostic_override_option_index(DI, OPTIDX) \
@@ -279,13 +277,17 @@ extern bool diagnostic_report_diagnostic (diagnostic_context *,
 					  diagnostic_info *);
 #ifdef ATTRIBUTE_GCC_DIAG
 extern void diagnostic_set_info (diagnostic_info *, const char *, va_list *,
-				 location_t, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
+				 rich_location *, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_set_info_translated (diagnostic_info *, const char *,
-					    va_list *, location_t,
+					    va_list *, rich_location *,
 					    diagnostic_t)
      ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_append_note (diagnostic_context *, location_t,
                                     const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern void diagnostic_append_note_at_rich_loc (diagnostic_context *,
+						rich_location *,
+						const char *, ...)
+  ATTRIBUTE_GCC_DIAG(3,4);
 #endif
 extern char *diagnostic_build_prefix (diagnostic_context *, const diagnostic_info *);
 void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
@@ -306,6 +308,14 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
   return diagnostic->message.get_location (which);
 }
 
+/* Return the number of locations to be printed in DIAGNOSTIC.  */
+
+static inline unsigned int
+diagnostic_num_locations (const diagnostic_info * diagnostic)
+{
+  return diagnostic->message.m_richloc->get_num_locations ();
+}
+
 /* Expand the location of this diagnostic. Use this function for
    consistency.  Parameter WHICH specifies which location. By default,
    expand the first one.  */
@@ -313,12 +323,7 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
 static inline expanded_location
 diagnostic_expand_location (const diagnostic_info * diagnostic, int which = 0)
 {
-  expanded_location s
-    = expand_location_to_spelling_point (diagnostic_location (diagnostic,
-							      which));
-  if (which == 0 && diagnostic->override_column)
-    s.column = diagnostic->override_column;
-  return s;
+  return diagnostic->richloc->get_range (which)->m_caret;
 }
 
 /* This is somehow the right-side margin of a caret line, that is, we
@@ -338,11 +343,8 @@ diagnostic_same_line (const diagnostic_context *context,
     && context->caret_max_width - CARET_LINE_MARGIN > abs (s1.column - s2.column);
 }
 
-void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2);
+extern const char *
+diagnostic_get_color_for_kind (diagnostic_t kind);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index daffc20..92dc584 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -149,9 +149,9 @@ static void cb_include (cpp_reader *, source_location, const unsigned char *,
 static void cb_ident (cpp_reader *, source_location, const cpp_string *);
 static void cb_used_define (cpp_reader *, source_location, cpp_hashnode *);
 static void cb_used_undef (cpp_reader *, source_location, cpp_hashnode *);
-static bool cb_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+static bool cb_cpp_error (cpp_reader *, int, int, rich_location *,
 			  const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 void pp_dir_change (cpp_reader *, const char *);
 
 static int dump_macro (cpp_reader *, cpp_hashnode *, void *);
@@ -1026,13 +1026,12 @@ cb_used_define (cpp_reader *pfile, source_location line ATTRIBUTE_UNUSED,
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION, with column number possibly overridden by COLUMN_OVERRIDE
-   if not zero; MSG is the translated message and AP the arguments.
+   RICHLOC; MSG is the translated message and AP the arguments.
    Returns true if a diagnostic was emitted, false otherwise.  */
 
 static bool
 cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	      location_t location, unsigned int column_override,
+	      rich_location *richloc,
 	      const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -1067,9 +1066,7 @@ cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   if (reason == CPP_W_WARNING_DIRECTIVE)
     diagnostic_override_option_index (&diagnostic, OPT_Wcpp);
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 3825751..4b3d31c 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,6 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -787,7 +788,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
       --werrorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   bool ret = report_diagnostic (&diagnostic);
@@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
 	/* If location[0] != UNKNOWN_LOCATION means that we already
 	   processed one of %C/%L.  */
 	int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
-	text->set_location (loc_num,
-			    linemap_position_for_loc_and_offset (line_table,
-								 loc->lb->location,
-								 offset));
+	source_range range
+	  = source_range::from_location (
+	      linemap_position_for_loc_and_offset (line_table,
+						   loc->lb->location,
+						   offset));
+	text->set_range (loc_num, range, true);
 	pp_string (pp, result[loc_num]);
 	return true;
       }
@@ -1024,48 +1027,21 @@ gfc_diagnostic_build_locus_prefix (diagnostic_context *context,
 }
 
 /* This function prints the locus (file:line:column), the diagnostic kind
-   (Error, Warning) and (optionally) the caret line (a source line
-   with '1' and/or '2' below it).
+   (Error, Warning) and (optionally) the relevant lines of code with
+   annotation lines with '1' and/or '2' below them.
 
-   With -fdiagnostic-show-caret (the default) and for valid locations,
-   it prints for one location:
+   With -fdiagnostics-show-caret (the default) it prints:
 
-       [locus]:
+       [locus of primary range]:
        
           some code
                  1
        Error: Some error at (1)
         
-   for two locations that fit in the same locus line:
+  With -fno-diagnostics-show-caret or if the primary range is not
+  valid, it prints:
 
-       [locus]:
-       
-         some code and some more code
-                1       2
-       Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [locus]:
-       
-         some code
-                1
-       [locus2]:
-       
-         some other code
-           2
-       Error: Some error at (1) and (2)
-       
-  With -fno-diagnostic-show-caret or if one of the locations is not
-  valid, it prints for one location (or for two locations that fit in
-  the same locus line):
-
-       [locus]: Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [name]:[locus]: Error: (1)
-       [name]:[locus2]: Error: Some error at (1) and (2)
+       [locus of primary range]: Error: Some error at (1) and (2)
 */
 static void 
 gfc_diagnostic_starter (diagnostic_context *context,
@@ -1075,7 +1051,7 @@ gfc_diagnostic_starter (diagnostic_context *context,
 
   expanded_location s1 = diagnostic_expand_location (diagnostic);
   expanded_location s2;
-  bool one_locus = diagnostic_location (diagnostic, 1) == UNKNOWN_LOCATION;
+  bool one_locus = diagnostic->richloc->get_num_locations () < 2;
   bool same_locus = false;
 
   if (!one_locus) 
@@ -1125,35 +1101,6 @@ gfc_diagnostic_starter (diagnostic_context *context,
       /* If the caret line was shown, the prefix does not contain the
 	 locus.  */
       pp_set_prefix (context->printer, kind_prefix);
-
-      if (one_locus || same_locus)
-	  return;
-
-      locus_prefix = gfc_diagnostic_build_locus_prefix (context, s2);
-      if (diagnostic_location (diagnostic, 1) <= BUILTINS_LOCATION)
-	{
-	  /* No caret line for the second location. Override the previous
-	     prefix with [locus2]:[prefix].  */
-	  pp_set_prefix (context->printer,
-			 concat (locus_prefix, " ", kind_prefix, NULL));
-	  free (kind_prefix);
-	  free (locus_prefix);
-	}
-      else
-	{
-	  /* We print the caret for the second location.  */
-	  pp_verbatim (context->printer, locus_prefix);
-	  free (locus_prefix);
-	  /* Fortran uses an empty line between locus and caret line.  */
-	  pp_newline (context->printer);
-	  s1.column = 0; /* Print only a caret line for s2.  */
-	  diagnostic_print_caret_line (context, s2, s1,
-				       context->caret_chars[1], '\0');
-	  pp_newline (context->printer);
-	  /* If the caret line was shown, the prefix does not contain the
-	     locus.  */
-	  pp_set_prefix (context->printer, kind_prefix);
-	}
     }
 }
 
@@ -1173,10 +1120,11 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (loc);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, loc, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (argp);
@@ -1190,10 +1138,11 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
@@ -1209,11 +1158,12 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (argp);
 }
@@ -1226,9 +1176,10 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1291,6 +1242,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
+  rich_location richloc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1306,7 +1258,7 @@ gfc_error (const char *gmsgid, va_list ap)
       --errorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
 
   if (buffered_p)
@@ -1336,9 +1288,10 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 102a635..6bfde06 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -53,14 +53,23 @@ unsigned verbose;
 
 static struct line_maps *line_table;
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  const struct line_map_ordinary *map;
+  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
+  return linemap_expand_location (line_table, map, loc);
+}
+
 static bool
 #if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
+__attribute__((format (printf, 5, 0)))
 #endif
-error_cb (cpp_reader *, int errtype, int, source_location location,
-	  unsigned int, const char *msg, va_list *ap)
+error_cb (cpp_reader *, int errtype, int, rich_location *richloc,
+	  const char *msg, va_list *ap)
 {
   const line_map_ordinary *map;
+  source_location location = richloc->get_loc ();
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
   fprintf (stderr, "%s:%d:%d %s: ", loc.file, loc.line, loc.column,
@@ -102,9 +111,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -114,9 +124,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -126,9 +137,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -138,9 +150,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
diff --git a/gcc/input.c b/gcc/input.c
index e7302a4..bdba20f 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
   return expand_location_1 (loc, /*expansion_point_p=*/false);
 }
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_to_spelling_point (loc);
+}
+
+
 /* If LOCATION is in a system header and if it is a virtual location for
    a token coming from the expansion of a macro, unwind it to the
    location of the expansion point of the macro.  Otherwise, just return
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 5889015..aee4172 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -31,6 +31,27 @@ along with GCC; see the file COPYING3.  If not see
 #include <iconv.h>
 #endif
 
+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */
+
+void
+text_info::set_range (unsigned int idx, source_range range, bool caret_p)
+{
+  gcc_checking_assert (m_richloc);
+  m_richloc->set_range (idx, range, caret_p, true);
+}
+
+location_t
+text_info::get_location (unsigned int index_of_location) const
+{
+  gcc_checking_assert (m_richloc);
+
+  if (index_of_location == 0)
+    return m_richloc->get_loc ();
+  else
+    return UNKNOWN_LOCATION;
+}
+
 // Default construct an output buffer.
 
 output_buffer::output_buffer ()
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 2654b0f..cdee253 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -27,11 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Maximum number of format string arguments.  */
 #define PP_NL_ARGMAX   30
 
-/* Maximum number of locations associated to each message.  If
-   location 'i' is UNKNOWN_LOCATION, then location 'i+1' is not
-   valid.  */
-#define MAX_LOCATIONS_PER_MESSAGE 2
-
 /* The type of a text to be formatted according a format specification
    along with a list of things.  */
 struct text_info
@@ -40,21 +35,17 @@ struct text_info
   va_list *args_ptr;
   int err_no;  /* for %m */
   void **x_data;
+  rich_location *m_richloc;
 
-  inline void set_location (unsigned int index_of_location, location_t loc)
+  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
   {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    this->locations[index_of_location] = loc;
+    source_range src_range;
+    src_range.m_start = loc;
+    src_range.m_finish = loc;
+    set_range (idx, src_range, caret_p);
   }
-
-  inline location_t get_location (unsigned int index_of_location) const
-  {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    return this->locations[index_of_location];
-  }
-
-private:
-  location_t locations[MAX_LOCATIONS_PER_MESSAGE];
+  void set_range (unsigned int idx, source_range range, bool caret_p);
+  location_t get_location (unsigned int index_of_location) const;
 };
 
 /* How often diagnostics are prefixed by their locations:
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 8b9b391..d28be1d 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,9 +69,10 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
+  rich_location richloc (location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
-		       location_for_asm (insn), kind);
+		       &richloc, kind);
   report_diagnostic (&diagnostic);
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
new file mode 100644
index 0000000..a4b16da
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
@@ -0,0 +1,149 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = myvar.x;
+           ~~~~~^~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = first_function () + second_function ();
+       ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function ()
+        ~~~~~~~~~~~~~~~~~
+        + second_function ());
+        ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             consectetur, adipiscing, elit,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             sed, eiusmod, tempor,
+                                             ~~~~~~~~~~~~~~~~~~~~~
+                                             incididunt, ut, labore, et,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             dolore, magna, aliqua)
+                                             ~~~~~~~~~~~~~~~~~~~~~~
+        + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 amet, consectetur,
+                                                 ~~~~~~~~~~~~~~~~~~
+                                                 adipiscing, elit, sed,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~
+                                                 eiusmod, tempor, incididunt,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 ut, labore, et, dolore,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~
+                                                 magna, aliqua));
+                                                 ~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = 98.6f;
+             ^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = foo * bar;
+             ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = foo * bar;
+                                                        ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = x + y
+        A   B
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y => x)
+                    2
+       y = 5
+      1
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
new file mode 100644
index 0000000..47639b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
@@ -0,0 +1,158 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret -fplugin-arg-diagnostic_plugin_test_show_locus-color" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = ^[[32m^[[Kmyvar^[[m^[[K^[[01;35m^[[K.^[[m^[[K^[[34m^[[Kx^[[m^[[K;
+           ^[[32m^[[K~~~~~^[[m^[[K^[[01;35m^[[K^^[[m^[[K^[[34m^[[K~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = ^[[32m^[[Kfirst_function ()^[[m^[[K ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K;
+       ^[[32m^[[K~~~~~~~~~~~~~~~~~^[[m^[[K ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function ()
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K);
+        ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            consectetur, adipiscing, elit,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            sed, eiusmod, tempor,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            incididunt, ut, labore, et,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            dolore, magna, aliqua)
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+ ^[[m^[[K       ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                amet, consectetur,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                adipiscing, elit, sed,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                eiusmod, tempor, incididunt,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                ut, labore, et, dolore,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                magna, aliqua)^[[m^[[K);
+                                                 ^[[34m^[[K~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[K98.6f^[[m^[[K;
+             ^[[01;35m^[[K^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+             ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+                                                        ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = ^[[01;35m^[[Kx^[[m^[[K + ^[[32m^[[Ky^[[m^[[K
+        ^[[01;35m^[[KA^[[m^[[K   ^[[32m^[[KB
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y =>^[[32m^[[K ^[[m^[[Kx)
+                    ^[[32m^[[K2
+^[[m^[[K      ^[[01;35m^[[K ^[[m^[[Ky = 5
+      ^[[01;35m^[[K1
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
new file mode 100644
index 0000000..e49cf46
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -0,0 +1,321 @@
+/* { dg-options "-O" } */
+
+/* This plugin exercises the diagnostics-printing code.
+
+   The goal is to unit-test the range-printing code without needing any
+   correct range data within the compiler's IR.  We can't use any real
+   diagnostics for this, so we have to fake it, hence this plugin.
+
+   There are two test files used with this code:
+
+     diagnostic-test-show-locus-bw.c
+     ..........................-color.c
+
+   to exercise uncolored vs colored output by supplying plugin arguments
+   to hack in the desired behavior:
+
+     -fplugin-arg-diagnostic_plugin_test_show_locus-color
+
+   The test files contain functions, but the body of each
+   function is disabled using the preprocessor.  The plugin detects
+   the functions by name, and injects diagnostics within them, using
+   hard-coded locations relative to the top of each function.
+
+   The plugin uses a function "get_loc" below to map from line/column
+   numbers to source_location, and this relies on input_location being in
+   the same ordinary line_map as the locations in question.  The plugin
+   runs after parsing, so input_location will be at the end of the file.
+
+   This need for all of the test code to be in a single ordinary line map
+   means that each test file needs to have a very long line near the top
+   (potentially to cover the extra byte-count of colorized data),
+   to ensure that further very long lines don't start a new linemap.
+   This also means that we can't use macros in the test files.  */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+const pass_data pass_data_test_show_locus =
+{
+  GIMPLE_PASS, /* type */
+  "test_show_locus", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_test_show_locus : public gimple_opt_pass
+{
+public:
+  pass_test_show_locus(gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_test_show_locus, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_test_show_locus
+
+/* Given LINE_NUM and COL_NUM, generate a source_location in the
+   current file, relative to input_location.  This relies on the
+   location being expressible in the same ordinary line_map as
+   input_location (which is typically at the end of the source file
+   when this is called).  Hence the test files we compile with this
+   plugin must have an initial very long line (to avoid long lines
+   starting a new line map), and must not use macros.
+
+   COL_NUM uses the Emacs convention of 0-based column numbers.  */
+
+static source_location
+get_loc (unsigned int line_num, unsigned int col_num)
+{
+  /* Use input_location to get the relevant line_map */
+  const struct line_map_ordinary *line_map
+    = (const line_map_ordinary *)(linemap_lookup (line_table,
+						  input_location));
+
+  /* Convert from 0-based column numbers to 1-based column numbers.  */
+  source_location loc
+    = linemap_position_for_line_and_column (line_map,
+					    line_num, col_num + 1);
+
+  return loc;
+}
+
+/* Was "color" passed in as a plugin argument?  */
+static bool force_show_locus_color = false;
+
+/* We want to verify the colorized output of diagnostic_show_locus,
+   but turning on colorization for everything confuses "dg-warning" etc.
+   Hence we special-case it within this plugin by using this modified
+   version of default_diagnostic_finalizer, which, if "color" is
+   passed in as a plugin argument turns on colorization, but just
+   for diagnostic_show_locus.  */
+
+static void
+custom_diagnostic_finalizer (diagnostic_context *context,
+			     diagnostic_info *diagnostic)
+{
+  bool old_show_color = pp_show_color (context->printer);
+  if (force_show_locus_color)
+    pp_show_color (context->printer) = true;
+  diagnostic_show_locus (context, diagnostic);
+  pp_show_color (context->printer) = old_show_color;
+
+  pp_destroy_prefix (context->printer);
+  pp_newline_and_flush (context->printer);
+}
+
+/* Exercise the diagnostic machinery to emit various warnings,
+   for use by diagnostic-test-show-locus-*.c.
+
+   We inject each warning relative to the start of a function,
+   which avoids lots of hardcoded absolute locations.  */
+
+static void
+test_show_locus (function *fun)
+{
+  tree fndecl = fun->decl;
+  tree identifier = DECL_NAME (fndecl);
+  const char *fnname = IDENTIFIER_POINTER (identifier);
+  location_t fnstart = fun->function_start_locus;
+  int fnstart_line = LOCATION_LINE (fnstart);
+
+  diagnostic_finalizer (global_dc) = custom_diagnostic_finalizer;
+
+  /* Hardcode the "terminal width", to verify the behavior of
+     very wide lines.  */
+  global_dc->caret_max_width = 70;
+
+  if (0 == strcmp (fnname, "test_simple"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 15));
+      richloc.add_range (get_loc (line, 10), get_loc (line, 14));
+      richloc.add_range (get_loc (line, 16), get_loc (line, 16));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_simple_2"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 24));
+      richloc.add_range (get_loc (line, 6),
+			 get_loc (line, 22));
+      richloc.add_range (get_loc (line, 26),
+			 get_loc (line, 43));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_multiline"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 1, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line, 23));
+      richloc.add_range (get_loc (line + 1, 9),
+			 get_loc (line + 1, 26));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_many_lines"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 5, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line + 4, 65));
+      richloc.add_range (get_loc (line + 5, 9),
+			 get_loc (line + 10, 61));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a rich_location constructed directly from a
+     source_range where the range is larger than one character.  */
+  if (0 == strcmp (fnname, "test_richloc_from_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 16);
+      rich_location richloc (src_range);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a single-range location where the range starts
+     before the caret.  */
+  if (0 == strcmp (fnname, "test_caret_within_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 16);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 20);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a very wide line, where the information of interest
+     is beyond the width of the terminal (hardcoded above).  */
+  if (0 == strcmp (fnname, "test_very_wide_line"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 94);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 90);
+      src_range.m_finish = get_loc (line, 98);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of multiple carets.  */
+  if (0 == strcmp (fnname, "test_multiple_carets"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret_a = get_loc (line, 7);
+      location_t caret_b = get_loc (line, 11);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = 'A';
+      global_dc->caret_chars[1] = 'B';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+
+  /* Example of two carets where both carets appear to have an off-by-one
+     error, each appearing one column early.
+     Seen with gfortran.dg/associate_5.f03.
+     In an earlier version of the printer, the printing of caret 0 aka
+     "1" was suppressed due to it appearing within the leading whitespace
+     before the text in its line.  Ensure that we at least faithfully
+     print both carets, at the given (erroneous) locations.  */
+  if (0 == strcmp (fnname, "test_caret_on_leading_whitespace"))
+    {
+      const int line = fnstart_line + 3;
+      location_t caret_a = get_loc (line, 5);
+      location_t caret_b = get_loc (line - 1, 19);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = '1';
+      global_dc->caret_chars[1] = '2';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+}
+
+unsigned int
+pass_test_show_locus::execute (function *fun)
+{
+  test_show_locus (fun);
+  return 0;
+}
+
+static gimple_opt_pass *
+make_pass_test_show_locus (gcc::context *ctxt)
+{
+  return new pass_test_show_locus (ctxt);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  for (int i = 0; i < argc; i++)
+    {
+      if (0 == strcmp (argv[i].key, "color"))
+	force_show_locus_color = true;
+    }
+
+  pass_info.pass = make_pass_test_show_locus (g);
+  pass_info.reference_pass_name = "ssa";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+		     &pass_info);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 39fab6e..941bccc 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -63,6 +63,9 @@ set plugin_test_list [list \
     { start_unit_plugin.c start_unit-test-1.c } \
     { finish_unit_plugin.c finish_unit-test-1.c } \
     { wide-int_plugin.c wide-int-test-1.c } \
+    { diagnostic_plugin_test_show_locus.c \
+	  diagnostic-test-show-locus-bw.c \
+	  diagnostic-test-show-locus-color.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7c1ab85..8cc1d87 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -29,6 +29,7 @@ load_lib libgloss.exp
 load_lib target-libpath.exp
 load_lib torture-options.exp
 load_lib fortran-modules.exp
+load_lib multiline.exp
 
 # We set LC_ALL and LANG to C so that we get the same error messages as expected.
 setenv LC_ALL C
diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 135f142..02009d8 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -289,7 +289,7 @@ default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     }
 
   if (set_locus)
-    text->set_location (0, DECL_SOURCE_LOCATION (t));
+    text->set_location (0, DECL_SOURCE_LOCATION (t), true);
 
   if (DECL_P (t))
     {
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index ce3f6a8..29bc48a 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3592,7 +3592,7 @@ void
 percent_K_format (text_info *text)
 {
   tree t = va_arg (*text->args_ptr, tree), block;
-  text->set_location (0, EXPR_LOCATION (t));
+  text->set_location (0, EXPR_LOCATION (t), true);
   gcc_assert (pp_ti_abstract_origin (text) != NULL);
   block = TREE_BLOCK (t);
   *pp_ti_abstract_origin (text) = NULL;
diff --git a/libcpp/errors.c b/libcpp/errors.c
index a33196e..c351c11 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,8 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, 0, _(msgid), ap);
+  rich_location richloc (src_loc);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
@@ -139,7 +140,9 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, column, _(msgid), ap);
+  rich_location richloc (src_loc);
+  richloc.override_column (column);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5eaea6b..a2bdfa0 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -573,9 +573,9 @@ struct cpp_callbacks
 
   /* Called to emit a diagnostic.  This callback receives the
      translated message.  */
-  bool (*error) (cpp_reader *, int, int, source_location, unsigned int,
+  bool (*error) (cpp_reader *, int, int, rich_location *,
 		 const char *, va_list *)
-       ATTRIBUTE_FPTR_PRINTF(6,0);
+       ATTRIBUTE_FPTR_PRINTF(5,0);
 
   /* Callbacks for when a macro is expanded, or tested (whether
      defined or not at the time) in #ifdef, #ifndef or "defined".  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index bc747c1..bd73780 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -118,6 +118,35 @@ typedef unsigned int linenum_type;
   libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+
+  /* Make a source_range from a source_location.  */
+  static source_range from_location (source_location loc)
+  {
+    source_range result;
+    result.m_start = loc;
+    result.m_finish = loc;
+    return result;
+  }
+};
+
 /* Memory allocation function typedef.  Works like xrealloc.  */
 typedef void *(*line_map_realloc) (void *, size_t);
 
@@ -1015,6 +1044,175 @@ typedef struct
   bool sysp;
 } expanded_location;
 
+/* Both GCC and Emacs number source *lines* starting at 1, but
+   they have differing conventions for *columns*.
+
+   GCC uses a 1-based convention for source columns,
+   whereas Emacs's M-x column-number-mode uses a 0-based convention.
+
+   For example, an error in the initial, left-hand
+   column of source line 3 is reported by GCC as:
+
+      some-file.c:3:1: error: ...etc...
+
+   On navigating to the location of that error in Emacs
+   (e.g. via "next-error"),
+   the locus is reported in the Mode Line
+   (assuming M-x column-number-mode) as:
+
+     some-file.c   10%   (3, 0)
+
+   i.e. "3:1:" in GCC corresponds to "(3, 0)" in Emacs.  */
+
+/* Ranges are closed:
+   m_start is the first location within the range, and
+   m_finish is the last location within the range.  */
+struct location_range
+{
+  expanded_location m_start;
+  expanded_location m_finish;
+
+  /* Should a caret be drawn for this range?  Typically this is
+     true for the 0th range, and false for subsequent ranges,
+     but the Fortran frontend overrides this for rendering things like:
+
+       x = x + y
+           1   2
+       Error: Shapes for operands at (1) and (2) are not conformable
+
+     where "1" and "2" are notionally carets.  */
+  bool m_show_caret_p;
+  expanded_location m_caret;
+};
+
+/* A "rich" source code location, for use when printing diagnostics.
+   A rich_location has one or more ranges, each optionally with
+   a caret.   Typically the zeroth range has a caret; other ranges
+   sometimes have carets.
+
+   The "primary" location of a rich_location is the caret of range 0,
+   used for determining the line/column when printing diagnostic
+   text, such as:
+
+      some-file.c:3:1: error: ...etc...
+
+   Additional ranges may be added to help the user identify other
+   pertinent clauses in a diagnostic.
+
+   rich_location instances are intended to be allocated on the stack
+   when generating diagnostics, and to be short-lived.
+
+   Examples of rich locations
+   --------------------------
+
+   Example A
+   *********
+      int i = "foo";
+              ^
+   This "rich" location is simply a single range (range 0), with
+   caret = start = finish at the given point.
+
+   Example B
+   *********
+      a = (foo && bar)
+          ~~~~~^~~~~~~
+   This rich location has a single range (range 0), with the caret
+   at the first "&", and the start/finish at the parentheses.
+   Compare with example C below.
+
+   Example C
+   *********
+      a = (foo && bar)
+           ~~~ ^~ ~~~
+   This rich location has three ranges:
+   - Range 0 has its caret and start location at the first "&" and
+     end at the second "&".
+   - Range 1 has its start and finish at the "f" and "o" of "foo";
+     the caret is not flagged for display, but is perhaps at the "f"
+     of "foo".
+   - Similarly, range 2 has its start and finish at the "b" and "r" of
+     "bar"; the caret is not flagged for display, but is perhaps at the
+     "b" of "bar".
+   Compare with example B above.
+
+   Example D (Fortran frontend)
+   ****************************
+       x = x + y
+           1   2
+   This rich location has range 0 at "1", and range 1 at "2".
+   Both are flagged for caret display.  Both ranges have start/finish
+   equal to their caret point.  The frontend overrides the diagnostic
+   context's default caret character for these ranges.
+
+   Example E
+   *********
+      printf ("arg0: %i  arg1: %s arg2: %i",
+                               ^~
+              100, 101, 102);
+                   ~~~
+   This rich location has two ranges:
+   - range 0 is at the "%s" with start = caret = "%" and finish at
+     the "s".
+   - range 1 has start/finish covering the "101" and is not flagged for
+     caret printing; it is perhaps at the start of "101".  */
+
+class rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  rich_location (source_location loc);
+
+  /* Constructing from a source_range.  */
+  rich_location (source_range src_range);
+
+  /* Accessors.  */
+  source_location get_loc () const { return m_loc; }
+
+  source_location *get_loc_addr () { return &m_loc; }
+
+  void
+  add_range (source_location start, source_location finish,
+	     bool show_caret_p = false);
+
+  void
+  add_range (source_range src_range,
+	     bool show_caret_p = false);
+
+  void
+  add_range (location_range *src_range);
+
+  void
+  set_range (unsigned int idx, source_range src_range,
+	     bool show_caret_p, bool overwrite_loc_p);
+
+  unsigned int get_num_locations () const { return m_num_ranges; }
+
+  location_range *get_range (unsigned int idx)
+  {
+    linemap_assert (idx < m_num_ranges);
+    return &m_ranges[idx];
+  }
+
+  expanded_location lazily_expand_location ();
+
+  void
+  override_column (int column);
+
+public:
+  static const int MAX_RANGES = 3;
+
+protected:
+  source_location m_loc;
+
+  unsigned int m_num_ranges;
+  location_range m_ranges[MAX_RANGES];
+
+  bool m_have_expanded_location;
+  expanded_location m_expanded_location;
+};
+
 /* This is enum is used by the function linemap_resolve_location
    below.  The meaning of the values is explained in the comment of
    that function.  */
@@ -1158,4 +1356,13 @@ void linemap_dump (FILE *, struct line_maps *, unsigned, bool);
    specifies how many macro maps to dump.  */
 void line_table_dump (FILE *, struct line_maps *, unsigned int, unsigned int);
 
+/* The rich_location class requires a way to expand source_location instances.
+   We would directly use expand_location_to_spelling_point, which is
+   implemented in gcc/input.c, but we also need to use it for rich_location
+   within genmatch.c.
+   Hence we require client code of libcpp to implement the following
+   symbol.  */
+extern expanded_location
+linemap_client_expand_location_to_spelling_point (source_location );
+
 #endif /* !LIBCPP_LINE_MAP_H  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3d82e9b..a6fa782 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1752,3 +1752,133 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
       fprintf (stream, "\n");
     }
 }
+
+/* class rich_location.  */
+
+/* Construct a rich_location with location LOC as its initial range.  */
+
+rich_location::rich_location (source_location loc) :
+  m_loc (loc),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (loc, loc, true);
+  m_ranges[0].m_caret = lazily_expand_location ();
+}
+
+/* Construct a rich_location with source_range SRC_RANGE as its
+   initial range.  */
+
+rich_location::rich_location (source_range src_range)
+: m_loc (src_range.m_start),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (src_range, true);
+}
+
+/* Get an expanded_location for this rich_location's primary
+   location.  */
+
+expanded_location
+rich_location::lazily_expand_location ()
+{
+  if (!m_have_expanded_location)
+    {
+      m_expanded_location
+	= linemap_client_expand_location_to_spelling_point (m_loc);
+      m_have_expanded_location = true;
+    }
+
+  return m_expanded_location;
+}
+
+/* Set the column of the primary location.  */
+
+void
+rich_location::override_column (int column)
+{
+  lazily_expand_location ();
+  m_expanded_location.column = column;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_location start, source_location finish,
+			  bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  location_range *range = &m_ranges[m_num_ranges++];
+  range->m_start = linemap_client_expand_location_to_spelling_point (start);
+  range->m_finish = linemap_client_expand_location_to_spelling_point (finish);
+  range->m_caret = range->m_start;
+  range->m_show_caret_p = show_caret_p;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_range src_range, bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  add_range (src_range.m_start, src_range.m_finish, show_caret_p);
+}
+
+void
+rich_location::add_range (location_range *src_range)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  m_ranges[m_num_ranges++] = *src_range;
+}
+
+/* Add or overwrite the range given by IDX.  It must either
+   overwrite an existing range, or add one *exactly* on the end of
+   the array.
+
+   This is primarily for use by gcc when implementing diagnostic
+   format decoders e.g. the "+" in the C/C++ frontends, for handling
+   format codes like "%q+D" (which writes the source location of a
+   tree back into range 0 of the rich_location).
+
+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+			  bool show_caret_p, bool overwrite_loc_p)
+{
+  linemap_assert (idx < MAX_RANGES);
+
+  /* We can either overwrite an existing range, or add one exactly
+     on the end of the array.  */
+  linemap_assert (idx <= m_num_ranges);
+
+  location_range *locrange = &m_ranges[idx];
+  locrange->m_start
+    = linemap_client_expand_location_to_spelling_point (src_range.m_start);
+  locrange->m_finish
+    = linemap_client_expand_location_to_spelling_point (src_range.m_finish);
+
+  locrange->m_show_caret_p = show_caret_p;
+  if (overwrite_loc_p)
+    locrange->m_caret = locrange->m_start;
+
+  /* Are we adding a range onto the end?  */
+  if (idx == m_num_ranges)
+    m_num_ranges = idx + 1;
+
+  if (idx == 0 && overwrite_loc_p)
+    {
+      m_loc = src_range.m_start;
+      /* Mark any cached value here as dirty.  */
+      m_have_expanded_location = false;
+    }
+}
-- 
1.8.5.3
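
For readers following the thread, here is a minimal standalone sketch of the range bookkeeping that add_range and set_range perform in the patch above. It is not the libcpp code: toy_rich_location, toy_expand and make_example_e are hypothetical names, and the "line * 100 + column" location encoding is an assumption made only so the example compiles without libcpp. The expanded-location cache is omitted.

```cpp
#include <cassert>

// Simplified, hypothetical stand-ins for the libcpp types.
typedef unsigned int source_location;
struct expanded_location { int line; int column; };
struct source_range { source_location m_start; source_location m_finish; };

// Hypothetical replacement for
// linemap_client_expand_location_to_spelling_point: this toy version
// pretends a location is encoded as line * 100 + column.
static expanded_location
toy_expand (source_location loc)
{
  expanded_location result = { (int) (loc / 100), (int) (loc % 100) };
  return result;
}

struct location_range
{
  expanded_location m_start;
  expanded_location m_finish;
  expanded_location m_caret;
  bool m_show_caret_p;
};

class toy_rich_location
{
 public:
  static const unsigned int MAX_RANGES = 3;

  /* Range 0 starts out as a single point with a caret.  */
  explicit toy_rich_location (source_location loc)
  : m_loc (loc), m_num_ranges (0), m_ranges ()
  {
    add_range (loc, loc, true);
  }

  /* Append a range; its caret defaults to its start.  */
  void add_range (source_location start, source_location finish,
                  bool show_caret_p)
  {
    assert (m_num_ranges < MAX_RANGES);
    location_range *range = &m_ranges[m_num_ranges++];
    range->m_start = toy_expand (start);
    range->m_finish = toy_expand (finish);
    range->m_caret = range->m_start;
    range->m_show_caret_p = show_caret_p;
  }

  /* Overwrite range IDX, or append exactly one past the end.  */
  void set_range (unsigned int idx, source_range src_range,
                  bool show_caret_p, bool overwrite_loc_p)
  {
    assert (idx < MAX_RANGES);
    assert (idx <= m_num_ranges);
    location_range *range = &m_ranges[idx];
    range->m_start = toy_expand (src_range.m_start);
    range->m_finish = toy_expand (src_range.m_finish);
    range->m_show_caret_p = show_caret_p;
    if (overwrite_loc_p)
      range->m_caret = range->m_start;
    if (idx == m_num_ranges)
      m_num_ranges = idx + 1;
    if (idx == 0 && overwrite_loc_p)
      m_loc = src_range.m_start;   /* The primary location follows range 0.  */
  }

  source_location get_loc () const { return m_loc; }
  unsigned int get_num_locations () const { return m_num_ranges; }
  const location_range &get_range (unsigned int idx) const
  {
    assert (idx < m_num_ranges);
    return m_ranges[idx];
  }

 private:
  source_location m_loc;
  unsigned int m_num_ranges;
  location_range m_ranges[MAX_RANGES];
};

/* Roughly the shape of "Example E": range 0 with a caret on one line,
   range 1 without a caret on the next.  The numbers are arbitrary.  */
toy_rich_location
make_example_e ()
{
  toy_rich_location richloc (126);
  source_range fmt = { 126, 127 };
  richloc.set_range (0, fmt, true, true);
  richloc.add_range (220, 222, false);
  return richloc;
}
```

Calling make_example_e () from a small main () yields a location with two ranges whose primary location is still 126. As in the patch, asking for a fourth range would trip the MAX_RANGES assertion.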



* Re: [PATCH] v4 of diagnostic_show_locus and rich_location
  2015-10-12 15:45         ` [PATCH] v4 of diagnostic_show_locus and rich_location David Malcolm
@ 2015-10-12 16:37           ` Manuel López-Ibáñez
  2015-10-13 18:09             ` David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Manuel López-Ibáñez @ 2015-10-12 16:37 UTC (permalink / raw)
  To: David Malcolm
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers

On 12 October 2015 at 16:44, David Malcolm <dmalcolm@redhat.com> wrote:
> v4 of the patch does the conversion of Fortran, and eliminates the
> adaptation layer.  No partial transitions here!
>
> Manu: I hope this addresses your concerns.

Yes, it looks great. I don't understand how this

-   and for two locations that do not fit in the same locus line:
-
-       [name]:[locus]: Error: (1)
-       [name]:[locus2]: Error: Some error at (1) and (2)
+       [locus of primary range]: Error: Some error at (1) and (2)


passes the Fortran regression testsuite since the testcases normally
try to match the two loci separately, but I guess you figured out a
way to make it work and I must admit I did not have the time to read
the patch in deep detail. But it is a bit strange that you also
deleted this part:

-   With -fdiagnostic-show-caret (the default) and for valid locations,
-   it prints for one location:
+   With -fdiagnostic-show-caret (the default) it prints:

-       [locus]:
+       [locus of primary range]:

           some code
                  1
        Error: Some error at (1)

-   for two locations that fit in the same locus line:
+  With -fno-diagnostic-show-caret or if the primary range is not
+  valid, it prints:

-       [locus]:
-
-         some code and some more code
-                1       2
-       Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [locus]:
-
-         some code
-                1
-       [locus2]:
-
-         some other code
-           2
-       Error: Some error at (1) and (2)
-

which should work the same before and after your patch. Independently
of whether the actual logic moved into some new mechanism in the new
rich locations world, this seems like useful info to keep in
fortran/error.c.

Cheers,

Manuel.


* Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-09-24  9:03       ` Richard Biener
  2015-09-25 16:50         ` Jeff Law
@ 2015-10-13 15:33         ` David Malcolm
  2015-10-14  9:00           ` Richard Biener
  1 sibling, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-13 15:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, GCC Patches

On Thu, 2015-09-24 at 10:15 +0200, Richard Biener wrote:
> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
> > On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
> >> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
> >> > Hi,
> >> >
> >> > On Tue, 22 Sep 2015, David Malcolm wrote:
> >> >
> >> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> >> >> table ever get smaller, or does it only ever get inserted into?
> >> >
> >> > It only ever grows.
> >> >
> >> >> An idea I had is that we could stash short ranges directly into the 32
> >> >> bits of location_t, by offsetting the per-column-bits somewhat.
> >> >
> >> > It's certainly worth an experiment: let's say you restrict yourself to
> >> > tokens less than 8 characters, you need an additional 3 bits (using one
> >> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> >> > numbers (for the normal 8 bit columns), which might be enough for most
> >> > single-file compilations.  For LTO compilation this often won't be enough.
> >> >
> >> >> My plan is to investigate the impact these patches have on the time and
> >> >> memory consumption of the compiler,
> >> >
> >> > When you do so, make sure you're also measuring an LTO compilation with
> >> > debug info of something big (firefox).  I know that we already had issues
> >> > with the size of the linemap data in the past for these cases (probably
> >> > when we added columns).
> >>
> >> The issue we have with LTO is that the linemap gets populated in quite
> >> random order and thus we repeatedly switch files (we've mitigated this
> >> somewhat for GCC 5).  We also considered dropping column info
> >> (and would drop range info) as diagnostics are from optimizers only
> >> with LTO and we keep locations merely for debug info.
> >
> > Thanks.  Presumably the mitigation you're referring to is the
> > lto_location_cache class in lto-streamer-in.c?
> >
> > Am I right in thinking that, right now, the LTO code doesn't support
> > ad-hoc locations? (presumably the block pointers only need to exist
> > during optimization, which happens after the serialization)
> 
> LTO code does support ad-hoc locations but they are "restored" only
> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
> 
> > The obvious simplification would be, as you suggest, to not bother
> > storing range information with LTO, falling back to just the existing
> > representation.  Then there's no need to extend LTO to serialize ad-hoc
> > data; simply store the underlying locus into the bit stream.  I think
> > that this happens already: lto-streamer-out.c calls expand_location and
> > stores the result, so presumably any ad-hoc location_t values made by
> > the v2 patches would have dropped their range data there when I ran the
> > test suite.
> 
> Yep.  We only preserve BLOCKs, so if you don't add extra code to
> preserve ranges they'll be "dropped".
> 
> > If it's acceptable to not bother with ranges for LTO, one way to do the
> > "stashing short ranges into the location_t" idea might be for the
> > bits-per-range of location_t values to be a property of the line_table
> > (or possibly the line map), set up when the struct line_maps is created.
> > For non-LTO it could be some tuned value (maybe from a param?); for LTO
> > it could be zero, so that we have as many bits as before for line/column
> > data.
> 
> That could be a possibility (likewise for column info?)
> 
> Richard.
> 
> > Hope this sounds sane
> > Dave
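
The "stash short ranges into the location_t" idea quoted above can be sketched as follows. This is only an illustration of the arithmetic Michael describes (3 range bits with zero as the escape value, 8 column bits, 20 line bits). The field order, the names pack_location and unpack_location, and the assumption that the top bit of the 32 is kept free for other uses are mine, not libcpp's actual encoding.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical layout, low bits first: [range:3][column:8][line:20],
   with the top bit of the 32-bit value left clear.  */
const unsigned RANGE_BITS = 3;
const unsigned COLUMN_BITS = 8;
const unsigned LINE_BITS = 20;

struct unpacked_location
{
  uint32_t line;
  uint32_t column;
  uint32_t token_len;   /* 0 means "length not stored inline".  */
};

/* Pack LINE/COLUMN and a token length into *OUT.  Returns false when
   line or column do not fit, in which case a caller would need some
   other representation (e.g. the ad-hoc table).  Token lengths of 8 or
   more are stored as the escape value 0.  */
bool
pack_location (uint32_t line, uint32_t column, uint32_t token_len,
               uint32_t *out)
{
  if (line >= (1u << LINE_BITS) || column >= (1u << COLUMN_BITS))
    return false;
  uint32_t range_field = token_len < (1u << RANGE_BITS) ? token_len : 0;
  *out = (line << (COLUMN_BITS + RANGE_BITS))
         | (column << RANGE_BITS)
         | range_field;
  assert ((*out >> 31) == 0);   /* 20 + 8 + 3 = 31 bits used.  */
  return true;
}

unpacked_location
unpack_location (uint32_t loc)
{
  unpacked_location result;
  result.token_len = loc & ((1u << RANGE_BITS) - 1);
  result.column = (loc >> RANGE_BITS) & ((1u << COLUMN_BITS) - 1);
  result.line = loc >> (COLUMN_BITS + RANGE_BITS);
  return result;
}
```

With this layout a 3-character token at line 42, column 10 packs to (42 << 11) | (10 << 3) | 3, while a 12-character token at the same place packs with a range field of 0 and would need its range stored elsewhere. The 20-bit line limit (about one million lines) is where the concern about LTO comes from.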

I did some crude benchmarking of the patchkit, using these scripts:
  https://github.com/davidmalcolm/gcc-benchmarking
(specifically, bb0222b455df8cefb53bfc1246eb0a8038256f30),
using the "big-code.c" and "kdecore.cc" files Michael posted as:
  https://gcc.gnu.org/ml/gcc-patches/2013-09/msg00062.html
and "influence.i", a preprocessed version of SPEC2006's 445.gobmk
engine/influence.c (as an example of a moderate-sized pure C source
file).

This doesn't yet cover very large autogenerated C files, and the .cc
file is only being measured to see the effect on the ad-hoc table (and
tokenization).

"control" was r227977.
"experiment" was the same revision with the v2 patchkit applied.

Recall that this patchkit captures ranges for tokens as an extra field
within tokens within libcpp and the C FE, and adds ranges to the ad-hoc
location lookaside, storing them for all tree nodes within the C FE that
have a location_t, and passing them around within c_expr for all C
expressions (including those that don't have a location_t).

Both control and experiment were built with
  --enable-checking=release \
  --disable-bootstrap \
  --disable-multilib \
  --enable-languages=c,ada,c++,fortran,go,java,lto,objc,obj-c++

The script measures:

(a) wallclock time for "xgcc -S" so it's measuring the driver, parsing,
optimization, etc., rather than attempting to directly measure parsing.
This is without -ftime-report, since Mikhail indicated it's sufficiently
expensive to skew timings in this post:
  https://gcc.gnu.org/ml/gcc/2015-07/msg00165.html

(b) memory usage: by performing a separate build with -ftime-report,
extracting the "TOTAL" ggc value (actually 3 builds, but it's the same
each time).

Is this a fair way to measure things?  It could be argued that by
measuring totals I'm hiding the extra parsing cost in the overall cost.

Full logs can be seen at:
  https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2.txt
(v2 of the patchkit)

I also investigated a version of the patchkit with the token tracking
rewritten to build ad-hoc ranges for *every token*, without attempting
any kind of optimization (e.g. for short ranges).
A log of this can be seen at:
https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2-plus-adhoc-ranges-for-tokens.txt
(v2 of the patchkit, with token tracking rewritten to build ad-hoc
ranges for *every token*).
The nice thing about this approach is that lots of token-related
diagnostics gain underlining of the relevant token "for free" simply
from the location_t, without having to individually patch them.  Without
any optimization, the memory consumed by this approach is clearly
larger.

A summary comparing the two logs:

Minimal wallclock time (s) over 10 iterations
                          Control -> v2                                 Control -> v2+adhocloc+at+every+token
kdecore.cc -g -O0          10.306548 -> 10.268712: 1.00x faster          10.247160 -> 10.444528: 1.02x slower
kdecore.cc -g -O1          27.026285 -> 27.220654: 1.01x slower          27.280681 -> 27.622676: 1.01x slower
kdecore.cc -g -O2          43.791668 -> 44.020270: 1.01x slower          43.904934 -> 44.248477: 1.01x slower
kdecore.cc -g -O3          47.471836 -> 47.651101: 1.00x slower          47.645985 -> 48.005495: 1.01x slower
kdecore.cc -g -Os          31.678652 -> 31.802829: 1.00x slower          31.741484 -> 32.033478: 1.01x slower
   empty.c -g -O0            0.012662 -> 0.011932: 1.06x faster            0.012888 -> 0.013143: 1.02x slower
   empty.c -g -O1            0.012685 -> 0.012558: 1.01x faster            0.013164 -> 0.012790: 1.03x faster
   empty.c -g -O2            0.012694 -> 0.012846: 1.01x slower            0.012912 -> 0.013175: 1.02x slower
   empty.c -g -O3            0.012654 -> 0.012699: 1.00x slower            0.012596 -> 0.012792: 1.02x slower
   empty.c -g -Os            0.013057 -> 0.012766: 1.02x faster            0.012691 -> 0.012885: 1.02x slower
big-code.c -g -O0            3.292680 -> 3.325748: 1.01x slower            3.292948 -> 3.303049: 1.00x slower
big-code.c -g -O1          15.701810 -> 15.765014: 1.00x slower          15.714116 -> 15.759254: 1.00x slower
big-code.c -g -O2          22.575615 -> 22.620187: 1.00x slower          22.567406 -> 22.605435: 1.00x slower
big-code.c -g -O3          52.423586 -> 52.590075: 1.00x slower          52.421460 -> 52.703835: 1.01x slower
big-code.c -g -Os          21.153980 -> 21.253598: 1.00x slower          21.146266 -> 21.260138: 1.01x slower
influence.i -g -O0            0.148229 -> 0.149518: 1.01x slower            0.148672 -> 0.156262: 1.05x slower
influence.i -g -O1            0.387397 -> 0.389930: 1.01x slower            0.387734 -> 0.396655: 1.02x slower
influence.i -g -O2            0.587514 -> 0.589604: 1.00x slower            0.588064 -> 0.596510: 1.01x slower
influence.i -g -O3            1.273561 -> 1.280514: 1.01x slower            1.274599 -> 1.287596: 1.01x slower
influence.i -g -Os            0.526045 -> 0.527579: 1.00x slower            0.526827 -> 0.535635: 1.02x slower


Maximal ggc memory (kb)
                     Control -> v2                                 Control -> v2+adhocloc+at+every+token
kdecore.cc -g -O0      650337.000 -> 654435.000: 1.0063x larger      650337.000 -> 711775.000: 1.0945x larger
kdecore.cc -g -O1      931966.000 -> 940144.000: 1.0088x larger      931951.000 -> 989384.000: 1.0616x larger
kdecore.cc -g -O2    1125325.000 -> 1133514.000: 1.0073x larger    1125318.000 -> 1182384.000: 1.0507x larger
kdecore.cc -g -O3    1221408.000 -> 1229596.000: 1.0067x larger    1221410.000 -> 1278658.000: 1.0469x larger
kdecore.cc -g -Os      867140.000 -> 871235.000: 1.0047x larger      867141.000 -> 928700.000: 1.0710x larger
   empty.c -g -O0          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
   empty.c -g -O1          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
   empty.c -g -O2          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
   empty.c -g -O3          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
   empty.c -g -Os          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
big-code.c -g -O0      166584.000 -> 172731.000: 1.0369x larger      166584.000 -> 172726.000: 1.0369x larger
big-code.c -g -O1      279793.000 -> 285940.000: 1.0220x larger      279793.000 -> 285935.000: 1.0220x larger
big-code.c -g -O2      400058.000 -> 406194.000: 1.0153x larger      400058.000 -> 406189.000: 1.0153x larger
big-code.c -g -O3      903648.000 -> 909750.000: 1.0068x larger      903906.000 -> 910001.000: 1.0067x larger
big-code.c -g -Os      357060.000 -> 363010.000: 1.0167x larger      357060.000 -> 363005.000: 1.0166x larger
influence.i -g -O0          9273.000 -> 9719.000: 1.0481x larger         9273.000 -> 13303.000: 1.4346x larger
influence.i -g -O1        12968.000 -> 13414.000: 1.0344x larger        12968.000 -> 16998.000: 1.3108x larger
influence.i -g -O2        16386.000 -> 16768.000: 1.0233x larger        16386.000 -> 20352.000: 1.2420x larger
influence.i -g -O3        35508.000 -> 35763.000: 1.0072x larger        35508.000 -> 39346.000: 1.1081x larger
influence.i -g -Os        14287.000 -> 14669.000: 1.0267x larger        14287.000 -> 18253.000: 1.2776x larger
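
For anyone checking the ratio columns by hand, this sketch shows how the figures in the two tables relate to the raw numbers. The function names are hypothetical and this is not code from the benchmarking scripts; it assumes the time ratio is "slower value / faster value" to two decimals and the memory ratio is "after / before" to four.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

/* Describe a wallclock change the way the timing table does.  */
std::string
describe_time_change (double before, double after)
{
  assert (before > 0 && after > 0);
  char buf[64];
  if (after < before)
    std::snprintf (buf, sizeof buf, "%.2fx faster", before / after);
  else
    std::snprintf (buf, sizeof buf, "%.2fx slower", after / before);
  return buf;
}

/* Describe a ggc memory change the way the memory table does.  */
std::string
describe_memory_change (double before_kb, double after_kb)
{
  assert (before_kb > 0 && after_kb > 0);
  char buf[64];
  std::snprintf (buf, sizeof buf, "%.4fx larger", after_kb / before_kb);
  return buf;
}
```

For example, the empty.c -O0 timing 0.012662 -> 0.011932 comes out as "1.06x faster", and the influence.i -O0 memory figure 9273 -> 13303 as "1.4346x larger".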

Thoughts?
Dave



* Re: [PATCH] v4 of diagnostic_show_locus and rich_location
  2015-10-12 16:37           ` Manuel López-Ibáñez
@ 2015-10-13 18:09             ` David Malcolm
  0 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-13 18:09 UTC (permalink / raw)
  To: Manuel López-Ibáñez
  Cc: Dodji Seketeli, Gcc Patch List, Jason Merrill, Tobias Burnus,
	Joseph S. Myers

On Mon, 2015-10-12 at 17:36 +0100, Manuel López-Ibáñez wrote:
> On 12 October 2015 at 16:44, David Malcolm <dmalcolm@redhat.com> wrote:
> > v4 of the patch does the conversion of Fortran, and eliminates the
> > adaptation layer.  No partial transitions here!
> >
> > Manu: I hope this addresses your concerns.
> 
> Yes, it looks great. I don't understand how this
> 
> -   and for two locations that do not fit in the same locus line:
> -
> -       [name]:[locus]: Error: (1)
> -       [name]:[locus2]: Error: Some error at (1) and (2)
> +       [locus of primary range]: Error: Some error at (1) and (2)
> 
> 
> passes the Fortran regression testsuite since the testcases normally
> try to match the two loci separately, but I guess you figured out a
> way to make it work and I must admit I did not have the time to read
> the patch in deep detail.

The way it works is that the patch kit emulates the behavior of the old
printer for the -fno-diagnostics-show-caret case.

Consider this two-locus error, where the loci are on different lines.
With the patch it prints:

associate_5.f03:33:6:

     ASSOCIATE (y => x) ! { dg-error "variable definition context" }
                    2
       y = 5 ! { dg-error "variable definition context" }
      1

Error: Associate-name ‘y’ can not appear in a variable definition context (assignment) at (1) because its target at (2) can not, either

...using the new implementation of diagnostic-show-locus.


With -fno-diagnostics-show-caret, it prints:

associate_5.f03:33:6: Error: (1)
associate_5.f03:32:20: Error: Associate-name ‘y’ can not appear in a variable definition context (assignment) at (1) because its target at (2) can not, either

where the latter is the same behavior as before the patch.

The testsuite passes since it's faithfully emulating the old
-fno-diagnostics-show-caret behavior.
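
As a rough illustration of the two-line shape shown above for the no-caret case, here is a sketch that only does the string formatting. It is not the code in fortran/error.c: the locus struct and both function names are hypothetical, and it is inferred purely from the example output.

```cpp
#include <sstream>
#include <string>

/* Hypothetical minimal locus: file name plus 1-based line and column.  */
struct locus
{
  std::string file;
  int line;
  int column;
};

static std::string
format_locus (const locus &loc)
{
  std::ostringstream os;
  os << loc.file << ":" << loc.line << ":" << loc.column;
  return os.str ();
}

/* Emit the first locus with just "(1)", then the second locus with
   the full message, mirroring the example output above.  */
std::string
format_two_loci_no_caret (const locus &loc1, const locus &loc2,
                          const std::string &message)
{
  return format_locus (loc1) + ": Error: (1)\n"
         + format_locus (loc2) + ": Error: " + message + "\n";
}
```

Feeding it the two loci from associate_5.f03 (33:6 and 32:20) gives the same pair of prefixes as the output above, which is what the existing dg-error directives match line by line.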

> But it is a bit strange that you also
> deleted this part:
> 
> -   With -fdiagnostic-show-caret (the default) and for valid locations,
> -   it prints for one location:
> +   With -fdiagnostic-show-caret (the default) it prints:
> 
> -       [locus]:
> +       [locus of primary range]:
> 
>            some code
>                   1
>         Error: Some error at (1)
> 
> -   for two locations that fit in the same locus line:
> +  With -fno-diagnostic-show-caret or if the primary range is not
> +  valid, it prints:
> 
> -       [locus]:
> -
> -         some code and some more code
> -                1       2
> -       Error: Some error at (1) and (2)
> -
> -   and for two locations that do not fit in the same locus line:
> -
> -       [locus]:
> -
> -         some code
> -                1
> -       [locus2]:
> -
> -         some other code
> -           2
> -       Error: Some error at (1) and (2)
> -
> 
> which should work the same before and after your patch. 

But this isn't what the new printer prints for the
-fdiagnostics-show-caret case.  It doesn't print multiple "[locusN]:"
lines; these are only printed for the -fno-diagnostics-show-caret case.

> Independently
> of whether the actual logic moved into some new mechanism in the new
> rich locations world, this seems like useful info to keep in
> fortran/error.c.

Perhaps it's easiest to approach this from the POV of what the comment
*should* say.  For reference, the comment for gfc_diagnostic_starter
reads like this after the patch:

/* This function prints the locus (file:line:column), the diagnostic kind
   (Error, Warning) and (optionally) the relevant lines of code with
   annotation lines with '1' and/or '2' below them.

   With -fdiagnostic-show-caret (the default) it prints:

       [locus of primary range]:
       
          some code
                 1
       Error: Some error at (1)
        
  With -fno-diagnostic-show-caret or if the primary range is not
  valid, it prints:

       [locus of primary range]: Error: Some error at (1) and (2)
*/

Does this look OK?

Thanks
Dave


* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-13 15:33         ` Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)) David Malcolm
@ 2015-10-14  9:00           ` Richard Biener
  2015-10-14 12:49             ` Michael Matz
                               ` (2 more replies)
  0 siblings, 3 replies; 83+ messages in thread
From: Richard Biener @ 2015-10-14  9:00 UTC (permalink / raw)
  To: David Malcolm; +Cc: Michael Matz, GCC Patches

On Tue, Oct 13, 2015 at 5:32 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Thu, 2015-09-24 at 10:15 +0200, Richard Biener wrote:
>> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
>> > On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
>> >> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
>> >> > Hi,
>> >> >
>> >> > On Tue, 22 Sep 2015, David Malcolm wrote:
>> >> >
>> >> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
>> >> >> table ever get smaller, or does it only ever get inserted into?
>> >> >
>> >> > It only ever grows.
>> >> >
>> >> >> An idea I had is that we could stash short ranges directly into the 32
>> >> >> bits of location_t, by offsetting the per-column-bits somewhat.
>> >> >
>> >> > It's certainly worth an experiment: let's say you restrict yourself to
>> >> > tokens less than 8 characters, you need an additional 3 bits (using one
>> >> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
>> >> > numbers (for the normal 8 bit columns), which might be enough for most
>> >> > single-file compilations.  For LTO compilation this often won't be enough.
>> >> >
>> >> >> My plan is to investigate the impact these patches have on the time and
>> >> >> memory consumption of the compiler,
>> >> >
>> >> > When you do so, make sure you're also measuring an LTO compilation with
>> >> > debug info of something big (firefox).  I know that we already had issues
>> >> > with the size of the linemap data in the past for these cases (probably
>> >> > when we added columns).
>> >>
>> >> The issue we have with LTO is that the linemap gets populated in quite
>> >> random order and thus we repeatedly switch files (we've mitigated this
>> >> somewhat for GCC 5).  We also considered dropping column info
>> >> (and would drop range info) as diagnostics are from optimizers only
>> >> with LTO and we keep locations merely for debug info.
>> >
>> > Thanks.  Presumably the mitigation you're referring to is the
>> > lto_location_cache class in lto-streamer-in.c?
>> >
>> > Am I right in thinking that, right now, the LTO code doesn't support
>> > ad-hoc locations? (presumably the block pointers only need to exist
>> > during optimization, which happens after the serialization)
>>
>> LTO code does support ad-hoc locations but they are "restored" only
>> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
>>
>> > The obvious simplification would be, as you suggest, to not bother
>> > storing range information with LTO, falling back to just the existing
>> > representation.  Then there's no need to extend LTO to serialize ad-hoc
>> > data; simply store the underlying locus into the bit stream.  I think
>> > that this happens already: lto-streamer-out.c calls expand_location and
>> > stores the result, so presumably any ad-hoc location_t values made by
>> > the v2 patches would have dropped their range data there when I ran the
>> > test suite.
>>
>> Yep.  We only preserve BLOCKs, so if you don't add extra code to
>> preserve ranges they'll be "dropped".
>>
>> > If it's acceptable to not bother with ranges for LTO, one way to do the
>> > "stashing short ranges into the location_t" idea might be for the
>> > bits-per-range of location_t values to be a property of the line_table
>> > (or possibly the line map), set up when the struct line_maps is created.
>> > For non-LTO it could be some tuned value (maybe from a param?); for LTO
>> > it could be zero, so that we have as many bits as before for line/column
>> > data.
>>
>> That could be a possibility (likewise for column info?)
>>
>> Richard.
>>
>> > Hope this sounds sane
>> > Dave
>
> I did some crude benchmarking of the patchkit, using these scripts:
>   https://github.com/davidmalcolm/gcc-benchmarking
> (specifically, bb0222b455df8cefb53bfc1246eb0a8038256f30),
> using the "big-code.c" and "kdecore.cc" files Michael posted as:
>   https://gcc.gnu.org/ml/gcc-patches/2013-09/msg00062.html
> and "influence.i", a preprocessed version of SPEC2006's 445.gobmk
> engine/influence.c (as an example of a moderate-sized pure C source
> file).
>
> This doesn't yet cover very large autogenerated C files, and the .cc
> file is only being measured to see the effect on the ad-hoc table (and
> tokenization).
>
> "control" was r227977.
> "experiment" was the same revision with the v2 patchkit applied.
>
> Recall that this patchkit captures ranges for tokens as an extra field
> within tokens within libcpp and the C FE, and adds ranges to the ad-hoc
> location lookaside, storing them for all tree nodes within the C FE that
> have a location_t, and passing them around within c_expr for all C
> expressions (including those that don't have a location_t).
>
> Both control and experiment were built with
>   --enable-checking=release \
>   --disable-bootstrap \
>   --disable-multilib \
>   --enable-languages=c,ada,c++,fortran,go,java,lto,objc,obj-c++
>
> The script measures:
>
> (a) wallclock time for "xgcc -S" so it's measuring the driver, parsing,
> optimization, etc., rather than attempting to directly measure parsing.
> This is without -ftime-report, since Mikhail indicated it's sufficiently
> expensive to skew timings in this post:
>   https://gcc.gnu.org/ml/gcc/2015-07/msg00165.html
>
> (b) memory usage: by performing a separate build with -ftime-report,
> extracting the "TOTAL" ggc value (actually 3 builds, but it's the same
> each time).
>
> Is this a fair way to measure things?  It could be argued that by
> measuring totals I'm hiding the extra parsing cost in the overall cost.

Overall cost is what matters.   Time to build the libstdc++ PCHs
would be interesting as well ;)  (and their size)

One could have argued you should have used -fsyntax-only.

> Full logs can be seen at:
>   https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2.txt
> (v2 of the patchkit)
>
> I also investigated a version of the patchkit with the token tracking
> rewritten to build ad-hoc ranges for *every token*, without attempting
> any kind of optimization (e.g. for short ranges).
> A log of this can be seen at:
> https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2-plus-adhoc-ranges-for-tokens.txt
> (v2 of the patchkit, with token tracking rewritten to build ad-hoc
> ranges for *every token*).
> The nice thing about this approach is that lots of token-related
> diagnostics gain underlining of the relevant token "for free" simply
> from the location_t, without having to individually patch them.  Without
> any optimization, the memory consumed by this approach is clearly
> larger.
>
> A summary comparing the two logs:
>
> Minimal wallclock time (s) over 10 iterations
>                           Control -> v2                                 Control -> v2+adhocloc+at+every+token
> kdecore.cc -g -O0          10.306548 -> 10.268712: 1.00x faster          10.247160 -> 10.444528: 1.02x slower
> kdecore.cc -g -O1          27.026285 -> 27.220654: 1.01x slower          27.280681 -> 27.622676: 1.01x slower
> kdecore.cc -g -O2          43.791668 -> 44.020270: 1.01x slower          43.904934 -> 44.248477: 1.01x slower
> kdecore.cc -g -O3          47.471836 -> 47.651101: 1.00x slower          47.645985 -> 48.005495: 1.01x slower
> kdecore.cc -g -Os          31.678652 -> 31.802829: 1.00x slower          31.741484 -> 32.033478: 1.01x slower
>    empty.c -g -O0            0.012662 -> 0.011932: 1.06x faster            0.012888 -> 0.013143: 1.02x slower
>    empty.c -g -O1            0.012685 -> 0.012558: 1.01x faster            0.013164 -> 0.012790: 1.03x faster
>    empty.c -g -O2            0.012694 -> 0.012846: 1.01x slower            0.012912 -> 0.013175: 1.02x slower
>    empty.c -g -O3            0.012654 -> 0.012699: 1.00x slower            0.012596 -> 0.012792: 1.02x slower
>    empty.c -g -Os            0.013057 -> 0.012766: 1.02x faster            0.012691 -> 0.012885: 1.02x slower
> big-code.c -g -O0            3.292680 -> 3.325748: 1.01x slower            3.292948 -> 3.303049: 1.00x slower
> big-code.c -g -O1          15.701810 -> 15.765014: 1.00x slower          15.714116 -> 15.759254: 1.00x slower
> big-code.c -g -O2          22.575615 -> 22.620187: 1.00x slower          22.567406 -> 22.605435: 1.00x slower
> big-code.c -g -O3          52.423586 -> 52.590075: 1.00x slower          52.421460 -> 52.703835: 1.01x slower
> big-code.c -g -Os          21.153980 -> 21.253598: 1.00x slower          21.146266 -> 21.260138: 1.01x slower
> influence.i -g -O0            0.148229 -> 0.149518: 1.01x slower            0.148672 -> 0.156262: 1.05x slower
> influence.i -g -O1            0.387397 -> 0.389930: 1.01x slower            0.387734 -> 0.396655: 1.02x slower
> influence.i -g -O2            0.587514 -> 0.589604: 1.00x slower            0.588064 -> 0.596510: 1.01x slower
> influence.i -g -O3            1.273561 -> 1.280514: 1.01x slower            1.274599 -> 1.287596: 1.01x slower
> influence.i -g -Os            0.526045 -> 0.527579: 1.00x slower            0.526827 -> 0.535635: 1.02x slower
>
>
> Maximal ggc memory (kb)
>                      Control -> v2                                 Control -> v2+adhocloc+at+every+token
> kdecore.cc -g -O0      650337.000 -> 654435.000: 1.0063x larger      650337.000 -> 711775.000: 1.0945x larger
> kdecore.cc -g -O1      931966.000 -> 940144.000: 1.0088x larger      931951.000 -> 989384.000: 1.0616x larger
> kdecore.cc -g -O2    1125325.000 -> 1133514.000: 1.0073x larger    1125318.000 -> 1182384.000: 1.0507x larger
> kdecore.cc -g -O3    1221408.000 -> 1229596.000: 1.0067x larger    1221410.000 -> 1278658.000: 1.0469x larger
> kdecore.cc -g -Os      867140.000 -> 871235.000: 1.0047x larger      867141.000 -> 928700.000: 1.0710x larger
>    empty.c -g -O0          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
>    empty.c -g -O1          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
>    empty.c -g -O2          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
>    empty.c -g -O3          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
>    empty.c -g -Os          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> big-code.c -g -O0      166584.000 -> 172731.000: 1.0369x larger      166584.000 -> 172726.000: 1.0369x larger
> big-code.c -g -O1      279793.000 -> 285940.000: 1.0220x larger      279793.000 -> 285935.000: 1.0220x larger
> big-code.c -g -O2      400058.000 -> 406194.000: 1.0153x larger      400058.000 -> 406189.000: 1.0153x larger
> big-code.c -g -O3      903648.000 -> 909750.000: 1.0068x larger      903906.000 -> 910001.000: 1.0067x larger
> big-code.c -g -Os      357060.000 -> 363010.000: 1.0167x larger      357060.000 -> 363005.000: 1.0166x larger
> influence.i -g -O0          9273.000 -> 9719.000: 1.0481x larger         9273.000 -> 13303.000: 1.4346x larger
> influence.i -g -O1        12968.000 -> 13414.000: 1.0344x larger        12968.000 -> 16998.000: 1.3108x larger
> influence.i -g -O2        16386.000 -> 16768.000: 1.0233x larger        16386.000 -> 20352.000: 1.2420x larger
> influence.i -g -O3        35508.000 -> 35763.000: 1.0072x larger        35508.000 -> 39346.000: 1.1081x larger
> influence.i -g -Os        14287.000 -> 14669.000: 1.0267x larger        14287.000 -> 18253.000: 1.2776x larger
>
> Thoughts?

The compile-time and memory-usage impact for the adhocloc at every token
patchkit is quite big.  Remember that gaining 1% in compile-time is hard
and 20-40% memory increase for influence.i looks too much.

I also wonder why you see differences in memory usage change for
different -O levels.  I think we should have a pretty "static" line table
after parsing?  Thus rather than percentages I'd like to see absolute
changes (which I'd expect to be the same for all -O levels).

Richard.

> Dave
>
>


* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-14  9:00           ` Richard Biener
@ 2015-10-14 12:49             ` Michael Matz
  2015-10-16 15:57             ` David Malcolm
  2015-11-13 16:02             ` David Malcolm
  2 siblings, 0 replies; 83+ messages in thread
From: Michael Matz @ 2015-10-14 12:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: David Malcolm, GCC Patches

Hi,

On Wed, 14 Oct 2015, Richard Biener wrote:

> The compile-time and memory-usage impact for the adhocloc at every token 
> patchkit is quite big.  Remember that gaining 1% in compile-time is hard 
> and 20-40% memory increase for influence.i looks too much.

Yes.  OTOH the compile time and memory use for the v2 patchkit itself look 
reasonable.

> I also wonder why you see differences in memory usage change for
> different -O levels.  I think we should
> have a pretty "static" line table after parsing?  Thus rather than
> percentages I'd like to see absolute changes

He gave the absolute numbers, so you can calculate this yourself :)
empty.c 3KB, big-code.c 6MB, influence.i 400KB, kdecore.cc 4MB and 8MB (v2 
patchkit).
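
For reference, the subtraction can be sketched as below.  The struct and
function names are made up for illustration; the numbers are copied from
the "Maximal ggc memory (kb)" table (control vs. v2).

```c
#include <stdio.h>

/* One row of the ggc table: control vs. v2 patchkit, in kB.
   Hypothetical helper type, not part of GCC.  */
struct ggc_sample {
  const char *name;
  long control_kb;
  long v2_kb;
};

static const struct ggc_sample samples[] = {
  { "empty.c -g -O0",     1189,   1192 },
  { "big-code.c -g -O0",  166584, 172731 },
  { "influence.i -g -O0", 9273,   9719 },
  { "kdecore.cc -g -O0",  650337, 654435 },
  { "kdecore.cc -g -O1",  931966, 940144 },
};

/* Absolute growth in kB for one row.  */
long ggc_delta_kb (const struct ggc_sample *s)
{
  return s->v2_kb - s->control_kb;
}

/* Print the absolute growth for every row.  */
void print_ggc_deltas (void)
{
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    printf ("%-20s %+ld kB\n", samples[i].name, ggc_delta_kb (&samples[i]));
}
```

The two kdecore.cc rows are what give the 4MB and 8MB figures.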

> (which I'd expect to be the same for all -O levels).

This strangely is not the case for influence.i and kdecore.cc.


Ciao,
Michael.


* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-14  9:00           ` Richard Biener
  2015-10-14 12:49             ` Michael Matz
@ 2015-10-16 15:57             ` David Malcolm
  2015-10-19 14:59               ` Michael Matz
  2015-11-13 16:02             ` David Malcolm
  2 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-16 15:57 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, GCC Patches

On Wed, 2015-10-14 at 11:00 +0200, Richard Biener wrote:
> On Tue, Oct 13, 2015 at 5:32 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> > On Thu, 2015-09-24 at 10:15 +0200, Richard Biener wrote:
> >> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
> >> > On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
> >> >> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > On Tue, 22 Sep 2015, David Malcolm wrote:
> >> >> >
> >> >> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> >> >> >> table ever get smaller, or does it only ever get inserted into?
> >> >> >
> >> >> > It only ever grows.
> >> >> >
> >> >> >> An idea I had is that we could stash short ranges directly into the 32
> >> >> >> bits of location_t, by offsetting the per-column-bits somewhat.
> >> >> >
> >> >> > It's certainly worth an experiment: let's say you restrict yourself to
> >> >> > tokens less than 8 characters, you need an additional 3 bits (using one
> >> >> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> >> >> > numbers (for the normal 8 bit columns), which might be enough for most
> >> >> > single-file compilations.  For LTO compilation this often won't be enough.
> >> >> >
> >> >> >> My plan is to investigate the impact these patches have on the time and
> >> >> >> memory consumption of the compiler,
> >> >> >
> >> >> > When you do so, make sure you're also measuring an LTO compilation with
> >> >> > debug info of something big (firefox).  I know that we already had issues
> >> >> > with the size of the linemap data in the past for these cases (probably
> >> >> > when we added columns).
> >> >>
> >> >> The issue we have with LTO is that the linemap gets populated in quite
> >> >> random order and thus we repeatedly switch files (we've mitigated this
> >> >> somewhat for GCC 5).  We also considered dropping column info
> >> >> (and would drop range info) as diagnostics are from optimizers only
> >> >> with LTO and we keep locations merely for debug info.
> >> >
> >> > Thanks.  Presumably the mitigation you're referring to is the
> >> > lto_location_cache class in lto-streamer-in.c?
> >> >
> >> > Am I right in thinking that, right now, the LTO code doesn't support
> >> > ad-hoc locations? (presumably the block pointers only need to exist
> >> > during optimization, which happens after the serialization)
> >>
> >> LTO code does support ad-hoc locations but they are "restored" only
> >> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
> >>
> >> > The obvious simplification would be, as you suggest, to not bother
> >> > storing range information with LTO, falling back to just the existing
> >> > representation.  Then there's no need to extend LTO to serialize ad-hoc
> >> > data; simply store the underlying locus into the bit stream.  I think
> >> > that this happens already: lto-streamer-out.c calls expand_location and
> >> > stores the result, so presumably any ad-hoc location_t values made by
> >> > the v2 patches would have dropped their range data there when I ran the
> >> > test suite.
> >>
> >> Yep.  We only preserve BLOCKs, so if you don't add extra code to
> >> preserve ranges they'll be "dropped".
> >>
> >> > If it's acceptable to not bother with ranges for LTO, one way to do the
> >> > "stashing short ranges into the location_t" idea might be for the
> >> > bits-per-range of location_t values to be a property of the line_table
> >> > (or possibly the line map), set up when the struct line_maps is created.
> >> > For non-LTO it could be some tuned value (maybe from a param?); for LTO
> >> > it could be zero, so that we have as many bits as before for line/column
> >> > data.
> >>
> >> That could be a possibility (likewise for column info?)
> >>
> >> Richard.
> >>
> >> > Hope this sounds sane
> >> > Dave
> >
> > I did some crude benchmarking of the patchkit, using these scripts:
> >   https://github.com/davidmalcolm/gcc-benchmarking
> > (specifically, bb0222b455df8cefb53bfc1246eb0a8038256f30),
> > using the "big-code.c" and "kdecore.cc" files Michael posted as:
> >   https://gcc.gnu.org/ml/gcc-patches/2013-09/msg00062.html
> > and "influence.i", a preprocessed version of SPEC2006's 445.gobmk
> > engine/influence.c (as an example of a moderate-sized pure C source
> > file).
> >
> > This doesn't yet cover very large autogenerated C files, and the .cc
> > file is only being measured to see the effect on the ad-hoc table (and
> > tokenization).
> >
> > "control" was r227977.
> > "experiment" was the same revision with the v2 patchkit applied.
> >
> > Recall that this patchkit captures ranges for tokens as an extra field
> > within tokens within libcpp and the C FE, and adds ranges to the ad-hoc
> > location lookaside, storing them for all tree nodes within the C FE that
> > have a location_t, and passing them around within c_expr for all C
> > expressions (including those that don't have a location_t).
> >
> > Both control and experiment were built with
> >   --enable-checking=release \
> >   --disable-bootstrap \
> >   --disable-multilib \
> >   --enable-languages=c,ada,c++,fortran,go,java,lto,objc,obj-c++
> >
> > The script measures:
> >
> > (a) wallclock time for "xgcc -S" so it's measuring the driver, parsing,
> > optimization, etc, rather than attempting to directly measure parsing.
> > This is without -ftime-report, since Mikhail indicated it's sufficiently
> > expensive to skew timings in this post:
> >   https://gcc.gnu.org/ml/gcc/2015-07/msg00165.html
> >
> > (b) memory usage: by performing a separate build with -ftime-report,
> > extracting the "TOTAL" ggc value (actually 3 builds, but it's the same
> > each time).
> >
> > Is this a fair way to measure things?  It could be argued that by
> > measuring totals I'm hiding the extra parsing cost in the overall cost.
> 
> Overall cost is what matters.   Time to build the libstdc++ PCHs
> would be interesting as well ;)  (and their size)
> 
> One could have argued you should have used -fsyntax-only.
> 
> > Full logs can be seen at:
> >   https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2.txt
> > (v2 of the patchkit)
> >
> > I also investigated a version of the patchkit with the token tracking
> > rewritten to build ad-hoc ranges for *every token*, without attempting
> > any kind of optimization (e.g. for short ranges).
> > A log of this can be seen at:
> > https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2-plus-adhoc-ranges-for-tokens.txt
> > (v2 of the patchkit, with token tracking rewritten to build ad-hoc
> > ranges for *every token*).
> > The nice thing about this approach is that lots of token-related
> > diagnostics gain underlining of the relevant token "for free" simply
> > from the location_t, without having to individually patch them.  Without
> > any optimization, the memory consumed by this approach is clearly
> > larger.
> >
> > A summary comparing the two logs:
> >
> > Minimal wallclock time (s) over 10 iterations
> >                           Control -> v2                                 Control -> v2+adhocloc+at+every+token
> > kdecore.cc -g -O0          10.306548 -> 10.268712: 1.00x faster          10.247160 -> 10.444528: 1.02x slower
> > kdecore.cc -g -O1          27.026285 -> 27.220654: 1.01x slower          27.280681 -> 27.622676: 1.01x slower
> > kdecore.cc -g -O2          43.791668 -> 44.020270: 1.01x slower          43.904934 -> 44.248477: 1.01x slower
> > kdecore.cc -g -O3          47.471836 -> 47.651101: 1.00x slower          47.645985 -> 48.005495: 1.01x slower
> > kdecore.cc -g -Os          31.678652 -> 31.802829: 1.00x slower          31.741484 -> 32.033478: 1.01x slower
> >    empty.c -g -O0            0.012662 -> 0.011932: 1.06x faster            0.012888 -> 0.013143: 1.02x slower
> >    empty.c -g -O1            0.012685 -> 0.012558: 1.01x faster            0.013164 -> 0.012790: 1.03x faster
> >    empty.c -g -O2            0.012694 -> 0.012846: 1.01x slower            0.012912 -> 0.013175: 1.02x slower
> >    empty.c -g -O3            0.012654 -> 0.012699: 1.00x slower            0.012596 -> 0.012792: 1.02x slower
> >    empty.c -g -Os            0.013057 -> 0.012766: 1.02x faster            0.012691 -> 0.012885: 1.02x slower
> > big-code.c -g -O0            3.292680 -> 3.325748: 1.01x slower            3.292948 -> 3.303049: 1.00x slower
> > big-code.c -g -O1          15.701810 -> 15.765014: 1.00x slower          15.714116 -> 15.759254: 1.00x slower
> > big-code.c -g -O2          22.575615 -> 22.620187: 1.00x slower          22.567406 -> 22.605435: 1.00x slower
> > big-code.c -g -O3          52.423586 -> 52.590075: 1.00x slower          52.421460 -> 52.703835: 1.01x slower
> > big-code.c -g -Os          21.153980 -> 21.253598: 1.00x slower          21.146266 -> 21.260138: 1.01x slower
> > influence.i -g -O0            0.148229 -> 0.149518: 1.01x slower            0.148672 -> 0.156262: 1.05x slower
> > influence.i -g -O1            0.387397 -> 0.389930: 1.01x slower            0.387734 -> 0.396655: 1.02x slower
> > influence.i -g -O2            0.587514 -> 0.589604: 1.00x slower            0.588064 -> 0.596510: 1.01x slower
> > influence.i -g -O3            1.273561 -> 1.280514: 1.01x slower            1.274599 -> 1.287596: 1.01x slower
> > influence.i -g -Os            0.526045 -> 0.527579: 1.00x slower            0.526827 -> 0.535635: 1.02x slower
> >
> >
> > Maximal ggc memory (kb)
> >                      Control -> v2                                 Control -> v2+adhocloc+at+every+token
> > kdecore.cc -g -O0      650337.000 -> 654435.000: 1.0063x larger      650337.000 -> 711775.000: 1.0945x larger
> > kdecore.cc -g -O1      931966.000 -> 940144.000: 1.0088x larger      931951.000 -> 989384.000: 1.0616x larger
> > kdecore.cc -g -O2    1125325.000 -> 1133514.000: 1.0073x larger    1125318.000 -> 1182384.000: 1.0507x larger
> > kdecore.cc -g -O3    1221408.000 -> 1229596.000: 1.0067x larger    1221410.000 -> 1278658.000: 1.0469x larger
> > kdecore.cc -g -Os      867140.000 -> 871235.000: 1.0047x larger      867141.000 -> 928700.000: 1.0710x larger
> >    empty.c -g -O0          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O1          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O2          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O3          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -Os          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> > big-code.c -g -O0      166584.000 -> 172731.000: 1.0369x larger      166584.000 -> 172726.000: 1.0369x larger
> > big-code.c -g -O1      279793.000 -> 285940.000: 1.0220x larger      279793.000 -> 285935.000: 1.0220x larger
> > big-code.c -g -O2      400058.000 -> 406194.000: 1.0153x larger      400058.000 -> 406189.000: 1.0153x larger
> > big-code.c -g -O3      903648.000 -> 909750.000: 1.0068x larger      903906.000 -> 910001.000: 1.0067x larger
> > big-code.c -g -Os      357060.000 -> 363010.000: 1.0167x larger      357060.000 -> 363005.000: 1.0166x larger
> > influence.i -g -O0          9273.000 -> 9719.000: 1.0481x larger         9273.000 -> 13303.000: 1.4346x larger
> > influence.i -g -O1        12968.000 -> 13414.000: 1.0344x larger        12968.000 -> 16998.000: 1.3108x larger
> > influence.i -g -O2        16386.000 -> 16768.000: 1.0233x larger        16386.000 -> 20352.000: 1.2420x larger
> > influence.i -g -O3        35508.000 -> 35763.000: 1.0072x larger        35508.000 -> 39346.000: 1.1081x larger
> > influence.i -g -Os        14287.000 -> 14669.000: 1.0267x larger        14287.000 -> 18253.000: 1.2776x larger
> >
> > Thoughts?
> 
> The compile-time and memory-usage impact for the adhocloc at every
> token patchkit is quite big.  Remember
> that gaining 1% in compile-time is hard and 20-40% memory increase for
> influence.i looks too much.
> 
> I also wonder why you see differences in memory usage change for
> different -O levels.  I think we should
> have a pretty "static" line table after parsing?  Thus rather than
> percentages I'd like to see absolute changes
> (which I'd expect to be the same for all -O levels).
> Richard.
> 
> > Dave

I have a mostly-working implementation of the range-packing idea, where
short ranges that start at the caret location and finish within
(1<<RANGE_BITS) of the start are stored in low bits directly within the
32-bit source_location, the hope being to ameliorate the cost of packing
every token's range into the location_t.
     https://dmalcolm.fedorapeople.org/gcc/2015-10-15/rich-locations/

(patch 0014 implements the range compression idea).
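
To give a feel for the idea, here is a minimal sketch of the packing.  It
is not the code from patch 0014: toy_loc_t, TOY_RANGE_BITS and the
function names are assumptions made up for this illustration, and the
sketch assumes the start location has its low bits free.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_loc_t;   /* stand-in for the 32-bit source_location */

#define TOY_RANGE_BITS 8u     /* illustrative value only */
#define TOY_RANGE_MASK ((1u << TOY_RANGE_BITS) - 1u)

/* Try to pack a range that starts at START and finishes FINISH_OFFSET
   later into a single 32-bit value.  START must have its low
   TOY_RANGE_BITS clear.  Returns false when the range is too long (or
   START is not aligned); a real implementation would then fall back to
   the ad-hoc lookaside table.  */
bool toy_pack_short_range (toy_loc_t start, uint32_t finish_offset,
                           toy_loc_t *out)
{
  if ((start & TOY_RANGE_MASK) != 0 || finish_offset > TOY_RANGE_MASK)
    return false;
  *out = start | finish_offset;
  return true;
}

/* Recover the start location from a packed value.  */
toy_loc_t toy_packed_start (toy_loc_t packed)
{
  return packed & ~TOY_RANGE_MASK;
}

/* Recover the distance from start to finish from a packed value.  */
uint32_t toy_packed_finish_offset (toy_loc_t packed)
{
  return packed & TOY_RANGE_MASK;
}
```

The point is that a short token costs no extra memory at all: its range
travels inside the location value, and only the long ranges need an
entry in the ad-hoc table.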

It doesn't bootstrap yet (am working on that), but runs well enough to
pass a lot of tests, and can be benchmarked.

The following shows the results posted above, plus the new v2+ packed
ranges as the final column, in a slightly different format that I hope
is easier to read.

Minimal wallclock time (s) over 10 iterations
  (each experiment's % change after "v2" is relative to 10 iterations of
   control interleaved with that experiment)
                      Control  v2                 v2+every+token     v2+packed+ranges
------------------  ---------  -----------------  -----------------  ------------------
kdecore.cc -g -O0   10.3065    10.268712 (-0.4%)  10.444528 (+1.9%)  10.398405 (+1.3%)
kdecore.cc -g -O1   27.0263    27.220654 (+0.7%)  27.622676 (+1.3%)  27.548755 (+1.9%)
kdecore.cc -g -O2   43.7917    44.02027 (+0.5%)   44.248477 (+0.8%)  43.843728 (-0.6%)
kdecore.cc -g -O3   47.4718    47.651101 (+0.4%)  48.005495 (+0.8%)  47.57282 (+0.1%)
kdecore.cc -g -Os   31.6787    31.802829 (+0.4%)  32.033478 (+0.9%)  31.699171 (-0.2%)
empty.c -g -O0       0.012662  0.011932 (-5.8%)   0.013143 (+2.0%)   0.013208 (+1.5%)
empty.c -g -O1       0.012685  0.012558 (-1.0%)   0.01279 (-2.8%)    0.012424 (+0.4%)
empty.c -g -O2       0.012694  0.012846 (+1.2%)   0.013175 (+2.0%)   0.013176 (+1.6%)
empty.c -g -O3       0.012654  0.012699 (+0.4%)   0.012792 (+1.6%)   0.013198 (+0.5%)
empty.c -g -Os       0.013057  0.012766 (-2.2%)   0.012885 (+1.5%)   0.01298 (-1.6%)
big-code.c -g -O0    3.29268   3.325748 (+1.0%)   3.303049 (+0.3%)   3.350572 (+1.7%)
big-code.c -g -O1   15.7018    15.765014 (+0.4%)  15.759254 (+0.3%)  15.45175 (-2.0%)
big-code.c -g -O2   22.5756    22.620187 (+0.2%)  22.605435 (+0.2%)  22.343609 (-1.2%)
big-code.c -g -O3   52.4236    52.590075 (+0.3%)  52.703835 (+0.5%)  51.9239 (-1.0%)
big-code.c -g -Os   21.154     21.253598 (+0.5%)  21.260138 (+0.5%)  20.907407 (-1.3%)
influence.i -g -O0   0.148229  0.149518 (+0.9%)   0.156262 (+5.1%)   0.150652 (+1.0%)
influence.i -g -O1   0.387397  0.38993 (+0.7%)    0.396655 (+2.3%)   0.3916 (+0.8%)
influence.i -g -O2   0.587514  0.589604 (+0.4%)   0.59651 (+1.4%)    0.585223 (-0.6%)
influence.i -g -O3   1.27356   1.280514 (+0.5%)   1.287596 (+1.0%)   1.273018 (-0.1%)
influence.i -g -Os   0.526045  0.527579 (+0.3%)   0.535635 (+1.7%)   0.528192 (+0.2%)


Maximal ggc memory (kb)
                      Control  v2               v2+every+token    v2+packed+ranges
------------------  ---------  ---------------  ----------------  ------------------
kdecore.cc -g -O0      650337  654435 (+0.6%)   711775 (+9.4%)    659372 (+1.4%)
kdecore.cc -g -O1      931966  940144 (+0.9%)   989384 (+6.2%)    954779 (+2.4%)
kdecore.cc -g -O2     1125325  1133514 (+0.7%)  1182384 (+5.1%)   1136459 (+1.0%)
kdecore.cc -g -O3     1221408  1229596 (+0.7%)  1278658 (+4.7%)   1232688 (+0.9%)
kdecore.cc -g -Os      867140  871235 (+0.5%)   928700 (+7.1%)    884304 (+2.0%)
empty.c -g -O0           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)
empty.c -g -O1           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)
empty.c -g -O2           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)
empty.c -g -O3           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)
empty.c -g -Os           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)
big-code.c -g -O0      166584  172731 (+3.7%)   172726 (+3.7%)    176062 (+5.7%)
big-code.c -g -O1      279793  285940 (+2.2%)   285935 (+2.2%)    281538 (+0.6%)
big-code.c -g -O2      400058  406194 (+1.5%)   406189 (+1.5%)    395632 (-1.1%)
big-code.c -g -O3      903648  909750 (+0.7%)   910001 (+0.7%)    902477 (-0.1%)
big-code.c -g -Os      357060  363010 (+1.7%)   363005 (+1.7%)    353338 (-1.0%)
influence.i -g -O0       9273  9719 (+4.8%)     13303 (+43.5%)    9758 (+5.2%)
influence.i -g -O1      12968  13414 (+3.4%)    16998 (+31.1%)    13562 (+4.6%)
influence.i -g -O2      16386  16768 (+2.3%)    20352 (+24.2%)    16737 (+2.1%)
influence.i -g -O3      35508  35763 (+0.7%)    39346 (+10.8%)    35783 (+0.8%)
influence.i -g -Os      14287  14669 (+2.7%)    18253 (+27.8%)    14769 (+3.4%)


This fixes much of the bloat seen for influence.i when sending ranges
through for every token.

This was with 8 bits allocated for packed ranges (which is probably
excessive, but it makes debugging easier).

Interestingly, the memory usage sometimes goes down relative to control
at higher optimization levels.  I'm not sure why yet.

Dave


* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-16 15:57             ` David Malcolm
@ 2015-10-19 14:59               ` Michael Matz
  2015-10-22 15:05                 ` David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Michael Matz @ 2015-10-19 14:59 UTC (permalink / raw)
  To: David Malcolm; +Cc: Richard Biener, GCC Patches

Hi,

On Fri, 16 Oct 2015, David Malcolm wrote:

> This fixes much of the bloat seen for influence.i when sending ranges 
> through for every token.

Yeah, I think that's on the right track.

> This was with 8 bits allocated for packed ranges (which is probably 
> excessive, but it makes debugging easier).

Probably in the end it should be done similar to how column bits are dealt 
with, start with a reasonably low number (5 bits?) and increase if 
necessary and budget allows (budget being column+range < N bits && range < 
8 bits, or so; so that range can't consume all of the column bits).
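
Roughly like the following sketch.  All names are hypothetical, and N is
deliberately left open above, so the 20 used here is only a placeholder
to make the example compile.

```c
#include <stdbool.h>

#define TOY_BUDGET_BITS        20u  /* "N"; placeholder, not a proposal */
#define TOY_RANGE_BITS_LIMIT    8u  /* range must stay below this */
#define TOY_INITIAL_RANGE_BITS  5u  /* the low starting point */

/* The budget check: column+range < N bits && range < 8 bits.  */
bool toy_within_budget (unsigned column_bits, unsigned range_bits)
{
  return column_bits + range_bits < TOY_BUDGET_BITS
         && range_bits < TOY_RANGE_BITS_LIMIT;
}

/* Start at the initial width and widen only while the budget allows.
   Returns the smallest width that can hold MAX_OFFSET, or 0 when no
   width fits, meaning "don't pack ranges for this map".  */
unsigned toy_choose_range_bits (unsigned column_bits, unsigned max_offset)
{
  for (unsigned bits = TOY_INITIAL_RANGE_BITS;
       toy_within_budget (column_bits, bits); bits++)
    if (max_offset < (1u << bits))
      return bits;
  return 0;
}
```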


Ciao,
Michael.


* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-19 14:59               ` Michael Matz
@ 2015-10-22 15:05                 ` David Malcolm
  0 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-22 15:05 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Biener, GCC Patches

On Mon, 2015-10-19 at 16:51 +0200, Michael Matz wrote:
> Hi,
> 
> On Fri, 16 Oct 2015, David Malcolm wrote:
> 
> > This fixes much of the bloat seen for influence.i when sending ranges 
> > through for every token.
> 
> Yeah, I think that's on the right track.

Thanks.


> > This was with 8 bits allocated for packed ranges (which is probably 
> > excessive, but it makes debugging easier).
> 
> Probably in the end it should be done similar to how column bits are dealt 
> with, start with a reasonably low number (5 bits?) and increase if 
> necessary and budget allows (budget being column+range < N bits && range < 
> 8 bits, or so; so that range can't consume all of the column bits).

In my latest version, the range_bits is indeed a field of the ordinary
map, defaulting to 5 bits.  Right now it's either 5 or 0 bits for an
ordinary map's range: 5 as the default, with 0 for the case where the
map's column_bits value is also 0, which happens for large values of
location_t (above LINE_MAP_MAX_LOCATION_WITH_COLS), or for maps where
max_column_hint was very high.  I put it into the ordinary_map (rather
than the line_table) to handle those column_bits == 0 cases.  I'm not
sure I want to add any logic beyond that, for fear of over-complicating
things.
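
In sketch form the rule is as follows.  This is a simplified stand-in
rather than the real line-map.h structures; the toy_* names are made up
for illustration.

```c
#include <stdbool.h>

#define TOY_DEFAULT_RANGE_BITS 5u

/* Hypothetical, cut-down stand-in for an ordinary map.  */
struct toy_ordinary_map {
  unsigned column_bits;
  unsigned range_bits;
};

/* Maps that cannot track columns get no packed ranges either;
   everything else gets the default width.  */
void toy_map_init (struct toy_ordinary_map *map, unsigned column_bits)
{
  map->column_bits = column_bits;
  map->range_bits = column_bits == 0 ? 0 : TOY_DEFAULT_RANGE_BITS;
}

/* Can a range finishing FINISH_OFFSET after its start be packed
   directly into a location that belongs to MAP?  */
bool toy_map_can_pack (const struct toy_ordinary_map *map,
                       unsigned finish_offset)
{
  return map->range_bits != 0
         && finish_offset < (1u << map->range_bits);
}
```

Keeping the value per map means the column_bits == 0 maps simply opt out,
without affecting the maps that precede them.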

I have a messy patch kit with this idea, which bootstraps and passes
regression testing on x86_64; I'm going to tidy it up and post it for
review [the messy version is backed up here:
https://dmalcolm.fedorapeople.org/gcc/2015-10-22/rich-locations/
It's on top of r228618.] 

Here are some benchmark numbers (the final column is for the patch kit
above; control is r228618).

Minimal wallclock time (s) over 10 iterations
  (each experiment's % change is relative to 10 iterations of control
interleaved with that experiment)
                      Control  v2                 v2+every+token     v2+packed+ranges    v2+packed+ranges+20151021
------------------  ---------  -----------------  -----------------  ------------------  ---------------------------
kdecore.cc -g -O0   10.3065    10.268712 (-0.4%)  10.444528 (+1.9%)  10.398405 (+1.3%)   10.382135 (+1.3%)
kdecore.cc -g -O1   27.0263    27.220654 (+0.7%)  27.622676 (+1.3%)  27.548755 (+1.9%)   27.725537 (+2.5%)
kdecore.cc -g -O2   43.7917    44.02027 (+0.5%)   44.248477 (+0.8%)  43.843728 (-0.6%)   44.034842 (-0.6%)
kdecore.cc -g -O3   47.4718    47.651101 (+0.4%)  48.005495 (+0.8%)  47.57282 (+0.1%)    48.149045 (+1.2%)
kdecore.cc -g -Os   31.6787    31.802829 (+0.4%)  32.033478 (+0.9%)  31.699171 (-0.2%)   31.802537 (+0.2%)
empty.c -g -O0       0.012662  0.011932 (-5.8%)   0.013143 (+2.0%)   0.013208 (+1.5%)    0.011506 (+2.0%)
empty.c -g -O1       0.012685  0.012558 (-1.0%)   0.01279 (-2.8%)    0.012424 (+0.4%)    0.011212 (-1.2%)
empty.c -g -O2       0.012694  0.012846 (+1.2%)   0.013175 (+2.0%)   0.013176 (+1.6%)    0.011495 (-0.0%)
empty.c -g -O3       0.012654  0.012699 (+0.4%)   0.012792 (+1.6%)   0.013198 (+0.5%)    0.010793 (+1.3%)
empty.c -g -Os       0.013057  0.012766 (-2.2%)   0.012885 (+1.5%)   0.01298 (-1.6%)     0.011169 (-0.2%)
big-code.c -g -O0    3.29268   3.325748 (+1.0%)   3.303049 (+0.3%)   3.350572 (+1.7%)    3.352896 (+1.7%)
big-code.c -g -O1   15.7018    15.765014 (+0.4%)  15.759254 (+0.3%)  15.45175 (-2.0%)    15.454777 (-1.9%)
big-code.c -g -O2   22.5756    22.620187 (+0.2%)  22.605435 (+0.2%)  22.343609 (-1.2%)   22.2913 (-1.4%)
big-code.c -g -O3   52.4236    52.590075 (+0.3%)  52.703835 (+0.5%)  51.9239 (-1.0%)     51.86898 (-1.1%)
big-code.c -g -Os   21.154     21.253598 (+0.5%)  21.260138 (+0.5%)  20.907407 (-1.3%)   20.870625 (-1.3%)
influence.i -g -O0   0.148229  0.149518 (+0.9%)   0.156262 (+5.1%)   0.150652 (+1.0%)    0.147663 (+1.4%)
influence.i -g -O1   0.387397  0.38993 (+0.7%)    0.396655 (+2.3%)   0.3916 (+0.8%)      0.388918 (+1.0%)
influence.i -g -O2   0.587514  0.589604 (+0.4%)   0.59651 (+1.4%)    0.585223 (-0.6%)    0.583341 (-0.4%)
influence.i -g -O3   1.27356   1.280514 (+0.5%)   1.287596 (+1.0%)   1.273018 (-0.1%)    1.2723 (+0.0%)
influence.i -g -Os   0.526045  0.527579 (+0.3%)   0.535635 (+1.7%)   0.528192 (+0.2%)    0.525308 (+0.3%)


Maximal ggc memory (kb)
                      Control  v2               v2+every+token    v2+packed+ranges    v2+packed+ranges+20151021
------------------  ---------  ---------------  ----------------  ------------------  ---------------------------
kdecore.cc -g -O0      650337  654435 (+0.6%)   711775 (+9.4%)    659372 (+1.4%)      657465 (+1.1%)
kdecore.cc -g -O1      931966  940144 (+0.9%)   989384 (+6.2%)    954779 (+2.4%)      952734 (+2.2%)
kdecore.cc -g -O2     1125325  1133514 (+0.7%)  1182384 (+5.1%)   1136459 (+1.0%)     1134412 (+0.8%)
kdecore.cc -g -O3     1221408  1229596 (+0.7%)  1278658 (+4.7%)   1232688 (+0.9%)     1230634 (+0.8%)
kdecore.cc -g -Os      867140  871235 (+0.5%)   928700 (+7.1%)    884304 (+2.0%)      874049 (+0.8%)
empty.c -g -O0           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)        1189 (+0.0%)
empty.c -g -O1           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)        1189 (+0.0%)
empty.c -g -O2           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)        1189 (+0.0%)
empty.c -g -O3           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)        1189 (+0.0%)
empty.c -g -Os           1189  1192 (+0.3%)     1193 (+0.3%)      1189 (+0.0%)        1189 (+0.0%)
big-code.c -g -O0      166584  172731 (+3.7%)   172726 (+3.7%)    176062 (+5.7%)      176062 (+5.7%)
big-code.c -g -O1      279793  285940 (+2.2%)   285935 (+2.2%)    281538 (+0.6%)      281538 (+0.6%)
big-code.c -g -O2      400058  406194 (+1.5%)   406189 (+1.5%)    395632 (-1.1%)      395632 (-1.1%)
big-code.c -g -O3      903648  909750 (+0.7%)   910001 (+0.7%)    902477 (-0.1%)      902477 (-0.1%)
big-code.c -g -Os      357060  363010 (+1.7%)   363005 (+1.7%)    353338 (-1.0%)      353338 (-1.0%)
influence.i -g -O0       9273  9719 (+4.8%)     13303 (+43.5%)    9758 (+5.2%)        9758 (+5.2%)
influence.i -g -O1      12968  13414 (+3.4%)    16998 (+31.1%)    13562 (+4.6%)       13562 (+4.6%)
influence.i -g -O2      16386  16768 (+2.3%)    20352 (+24.2%)    16737 (+2.1%)       16737 (+2.1%)
influence.i -g -O3      35508  35763 (+0.7%)    39346 (+10.8%)    35783 (+0.8%)       35783 (+0.8%)
influence.i -g -Os      14287  14669 (+2.7%)    18253 (+27.8%)    14769 (+3.4%)       14769 (+3.4%)


[I'm looking to see if any further saving can be made for influence.i; I
was expecting the gcc consumption to be less than that for "v2"]
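
As a reading aid for the tables above: the parenthesized figures in the
memory table appear to be percentage changes relative to the "Control"
column.  A minimal sketch of that arithmetic follows; the helper name
percent_change is hypothetical and not part of the patch kit or the
benchmarking scripts.

```cpp
#include <cassert>

/* Hypothetical helper: percentage change of VALUE relative to CONTROL.
   Assumes the table's percentages are computed against "Control".  */
inline double
percent_change (double control, double value)
{
  assert (control != 0.0);
  return 100.0 * (value - control) / control;
}
```

For the "kdecore.cc -g -O0" row, percent_change (650337, 654435) is
roughly 0.63, which matches the "+0.6%" shown for v2.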

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 01/10] Improvements to description of source_location in line-map.h
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
  2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
  2015-10-23 20:24   ` [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite David Malcolm
@ 2015-10-23 20:24   ` David Malcolm
  2015-10-23 21:02     ` Jeff Law
  2015-10-23 20:25   ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) David Malcolm
                     ` (7 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

libcpp/ChangeLog:
	* include/line-map.h (source_location): In the table in the
	descriptive comment, show UNKNOWN_LOCATION, BUILTINS_LOCATION,
	LINE_MAP_MAX_LOCATION_WITH_COLS, LINE_MAP_MAX_SOURCE_LOCATION.
	Add notes about ad-hoc values.
---
 libcpp/include/line-map.h | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)
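
For readers of the table in the comment patched below, here is a
minimal sketch (not part of the patch) of how a location value could be
tested against the boundaries that the table documents.  The constant
values are copied from the table; the sketch_* names and the use of
plain unsigned int are assumptions made for illustration, not libcpp
API.

```cpp
#include <cassert>

/* Assumption: a location is a 32-bit unsigned value, as the table's
   0x00000000..0xffffffff range suggests.  */
typedef unsigned int sketch_location;

/* Boundary values copied from the table in line-map.h.  */
const sketch_location SKETCH_RESERVED_LOCATION_COUNT = 0x00000002;
const sketch_location SKETCH_MAX_SOURCE_LOCATION = 0x7fffffff;

/* UNKNOWN_LOCATION (0) and BUILTINS_LOCATION (1) lie below the first
   location that is handed out.  */
inline bool
sketch_is_reserved (sketch_location loc)
{
  return loc < SKETCH_RESERVED_LOCATION_COUNT;
}

/* Ad-hoc values start at 0x80000000, i.e. they have the top bit set.  */
inline bool
sketch_is_adhoc (sketch_location loc)
{
  return loc > SKETCH_MAX_SOURCE_LOCATION;
}

/* The lower 31 bits of an ad-hoc value index the ad-hoc data table;
   MAX_SOURCE_LOCATION doubles as the mask.  */
inline unsigned int
sketch_adhoc_index (sketch_location loc)
{
  assert (sketch_is_adhoc (loc));
  return loc & SKETCH_MAX_SOURCE_LOCATION;
}
```

Under these assumptions, sketch_adhoc_index (0x80000005) yields 5.
Telling ordinary locations from macro locations is deliberately left
out, since that boundary depends on the state of the line table rather
than on a fixed constant.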

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index bc747c1..30bad87 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -59,8 +59,10 @@ typedef unsigned int linenum_type;
 
   Actual     | Value                         | Meaning
   -----------+-------------------------------+-------------------------------
-  0x00000000 |                               | Reserved for use by libcpp
-  0x00000001 | RESERVED_LOCATION_COUNT - 1   | Reserved for use by libcpp
+  0x00000000 | UNKNOWN_LOCATION (gcc/input.h)| Unknown/invalid location.
+  -----------+-------------------------------+-------------------------------
+  0x00000001 | BUILTINS_LOCATION             | The location for declarations
+             |   (gcc/input.h)               | in "<built-in>"
   -----------+-------------------------------+-------------------------------
   0x00000002 | RESERVED_LOCATION_COUNT       | The first location to be
              | (also                         | handed out, and the
@@ -94,6 +96,16 @@ typedef unsigned int linenum_type;
              |
              |                    (unallocated integers)
              |
+  0x60000000 | LINE_MAP_MAX_LOCATION_WITH_COLS
+             |   Beyond this point, ordinary linemaps have 0 bits per column:
+             |   each increment of the value corresponds to a new source line.
+             |
+  0x70000000 | LINE_MAP_MAX_SOURCE_LOCATION
+             |   Beyond this point, we give up on ordinary maps; attempts to
+             |   create locations in them lead to UNKNOWN_LOCATION (0).
+             |
+             |                    (unallocated integers)
+             |
              |                   Macro maps grow this way
              |                   ^^^^^^^^^^^^^^^^^^^^^^^^
              |                               |
@@ -107,10 +119,11 @@ typedef unsigned int linenum_type;
              | macromap[1]->start_location   | Start of macro map 1
   -----------+-------------------------------+-------------------------------
              | macromap[0]->start_location   | Start of macro map 0
-  0x7fffffff | MAX_SOURCE_LOCATION           |
+  0x7fffffff | MAX_SOURCE_LOCATION           | Also used as a mask for
+             |                               | accessing the ad-hoc data table
   -----------+-------------------------------+-------------------------------
-  0x80000000 | Start of ad-hoc values        |
-  ...        |                               |
+  0x80000000 | Start of ad-hoc values; the lower 31 bits are used as an index
+  ...        | into the line_table->location_adhoc_data_map.data array.
   0xffffffff | UINT_MAX                      |
   -----------+-------------------------------+-------------------------------
 
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 06/10] Track expression ranges in C frontend
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
@ 2015-10-23 20:24   ` David Malcolm
  2015-10-30  8:01     ` Jeff Law
  2015-10-23 20:24   ` [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite David Malcolm
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

As in the previous version of this patch
 "Implement tree expression tracking in C FE (v2)"
the patch now captures ranges for all C expressions during parsing within
a new field of c_expr, and for all tree nodes with a location_t, it stores
them in ad-hoc locations for later use.

Hence compound expressions get ranges; see:
  https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html

and for this example:

  int test (int foo)
  {
    return foo * 100;
           ^^^   ^^^
  }

we have access to the ranges of "foo" and "100" during C parsing via
the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
INTEGER_CST (the former's location is at the top of the
function, and the latter has no location).
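
The idea of carrying ranges alongside the parsed value can be sketched
roughly as follows.  This is a simplified stand-in, not the code in the
patch: expr_sketch, make_binary and make_prefix_unary are hypothetical
names, and locations are modelled as plain column numbers rather than
real location_t values.

```cpp
#include <cassert>

/* Stand-ins for GCC's types; here a location is just a column number.  */
typedef unsigned int location_t;

struct source_range
{
  location_t m_start;
  location_t m_finish;
};

/* Hypothetical, cut-down analogue of c_expr: only the range is kept.  */
struct expr_sketch
{
  source_range src_range;
  location_t get_start () const { return src_range.m_start; }
  location_t get_finish () const { return src_range.m_finish; }
};

/* A binary expression runs from the start of its LHS to the finish
   of its RHS.  */
inline expr_sketch
make_binary (const expr_sketch &lhs, const expr_sketch &rhs)
{
  assert (lhs.get_start () <= rhs.get_finish ());
  expr_sketch result;
  result.src_range.m_start = lhs.get_start ();
  result.src_range.m_finish = rhs.get_finish ();
  return result;
}

/* A prefix unary expression runs from the operator token at OP_LOC to
   the finish of its operand.  */
inline expr_sketch
make_prefix_unary (location_t op_loc, const expr_sketch &arg)
{
  expr_sketch result;
  result.src_range.m_start = op_loc;
  result.src_range.m_finish = arg.get_finish ();
  return result;
}
```

Taking the line "  return foo * 100;" with "foo" at columns 10-12 and
"100" at columns 16-18, make_binary gives the product the range 10-18,
even though neither operand's tree node carries a usable location.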

gcc/ChangeLog:
	* Makefile.in (OBJS): Add gcc-rich-location.o.
	* gcc-rich-location.c: New file.
	* gcc-rich-location.h: New file.
	* print-tree.c (print_node): Print any source range information.
	* tree.c (set_source_range): New functions.
	* tree.h (CAN_HAVE_RANGE_P): New.
	(EXPR_LOCATION_RANGE): New.
	(EXPR_HAS_RANGE): New.
	(get_expr_source_range): New inline function.
	(DECL_LOCATION_RANGE): New.
	(set_source_range): New decls.
	(get_decl_source_range): New inline function.

gcc/c-family/ChangeLog:
	* c-common.c (c_fully_fold_internal): Capture existing source_range,
	and store it on the result.

gcc/c/ChangeLog:
	* c-parser.c (set_c_expr_source_range): New functions.
	(c_token::get_range): New method.
	(c_token::get_finish): New method.
	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
	based on the range from the start of the LHS to the end of the
	RHS.
	(c_parser_conditional_expression): Likewise, based on the range
	from the start of the cond.value to the end of exp2.value.
	(c_parser_binary_expression): Call set_c_expr_source_range on
	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
	(c_parser_cast_expression): Call set_c_expr_source_range on ret
	based on the cast_loc through to the end of the expr.
	(c_parser_unary_expression): Likewise, based on the
	op_loc through to the end of op.
	(c_parser_sizeof_expression): Likewise, based on the start of the
	sizeof token through to either the closing paren or the end of
	expr.
	(c_parser_postfix_expression): Likewise, using the token range,
	or from the open paren through to the close paren for
	parenthesized expressions.
	(c_parser_postfix_expression_after_primary): Likewise, for
	various kinds of expression.
	* c-tree.h (struct c_expr): Add field "src_range".
	(c_expr::get_start): New method.
	(c_expr::get_finish): New method.
	(set_c_expr_source_range): New decls.
	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
	on ret for prefix unary ops.
	(parser_build_binary_op): Likewise, running from the start of
	arg1.value through to the end of arg2.value.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
	New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic_plugin_test_tree_expression_range.c and
	diagnostic-test-expressions-1.c.
---
 gcc/Makefile.in                                    |   1 +
 gcc/c-family/c-common.c                            |  10 +-
 gcc/c/c-parser.c                                   |  98 ++++-
 gcc/c/c-tree.h                                     |  19 +
 gcc/c/c-typeck.c                                   |  10 +
 gcc/gcc-rich-location.c                            |  86 +++++
 gcc/gcc-rich-location.h                            |  47 +++
 gcc/print-tree.c                                   |  21 +
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 422 +++++++++++++++++++++
 .../diagnostic_plugin_test_tree_expression_range.c | 152 ++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   2 +
 gcc/tree.c                                         |  22 ++
 gcc/tree.h                                         |  31 ++
 13 files changed, 917 insertions(+), 4 deletions(-)
 create mode 100644 gcc/gcc-rich-location.c
 create mode 100644 gcc/gcc-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 009c745..8cd446d 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1255,6 +1255,7 @@ OBJS = \
 	fold-const.o \
 	function.o \
 	fwprop.o \
+	gcc-rich-location.o \
 	gcse.o \
 	gcse-common.o \
 	ggc-common.o \
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4a5ccb7..c102bbd 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -1188,6 +1188,7 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
   bool nowarning = TREE_NO_WARNING (expr);
   bool unused_p;
+  source_range old_range;
 
   /* This function is not relevant to C++ because C++ folds while
      parsing, and may need changes to be correct for C++ when C++
@@ -1203,6 +1204,9 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       || code == SAVE_EXPR)
     return expr;
 
+  if (IS_EXPR_CODE_CLASS (kind))
+    old_range = EXPR_LOCATION_RANGE (expr);
+
   /* Operands of variable-length expressions (function calls) have
      already been folded, as have __builtin_* function calls, and such
      expressions cannot occur in constant expressions.  */
@@ -1627,7 +1631,11 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       TREE_NO_WARNING (ret) = 1;
     }
   if (ret != expr)
-    protected_set_expr_location (ret, loc);
+    {
+      protected_set_expr_location (ret, loc);
+      if (IS_EXPR_CODE_CLASS (kind))
+	set_source_range (ret, old_range.m_start, old_range.m_finish);
+    }
   return ret;
 }
 
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 2d24c21..00a8698 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -67,6 +67,23 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "c-family/c-indentation.h"
 
+void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish)
+{
+  expr->src_range.m_start = start;
+  expr->src_range.m_finish = finish;
+  set_source_range (expr->value, start, finish);
+}
+
+void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range)
+{
+  expr->src_range = src_range;
+  set_source_range (expr->value, src_range);
+}
+
 \f
 /* Initialization routine for this file.  */
 
@@ -172,6 +189,16 @@ struct GTY (()) c_token {
   location_t location;
   /* The value associated with this token, if any.  */
   tree value;
+
+  source_range get_range () const
+  {
+    return get_range_from_loc (line_table, location);
+  }
+
+  location_t get_finish () const
+  {
+    return get_range ().m_finish;
+  }
 };
 
 /* A parser structure recording information about the state and
@@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
   ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
 				 code, exp_location, rhs.value,
 				 rhs.original_type);
+  set_c_expr_source_range (&ret,
+			   lhs.get_start (),
+			   rhs.get_finish ());
   if (code == NOP_EXPR)
     ret.original_code = MODIFY_EXPR;
   else
@@ -6115,7 +6145,7 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 				 tree omp_atomic_lhs)
 {
   struct c_expr cond, exp1, exp2, ret;
-  location_t cond_loc, colon_loc, middle_loc;
+  location_t start, cond_loc, colon_loc, middle_loc;
 
   gcc_assert (!after || c_dialect_objc ());
 
@@ -6123,6 +6153,10 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 
   if (c_parser_next_token_is_not (parser, CPP_QUERY))
     return cond;
+  if (cond.value != error_mark_node)
+    start = cond.get_start ();
+  else
+    start = UNKNOWN_LOCATION;
   cond_loc = c_parser_peek_token (parser)->location;
   cond = convert_lvalue_to_rvalue (cond_loc, cond, true, true);
   c_parser_consume_token (parser);
@@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 			   ? t1
 			   : NULL);
     }
+  set_c_expr_source_range (&ret,
+			   start,
+			   exp2.get_finish ());
   return ret;
 }
 
@@ -6350,6 +6387,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
     {
       enum c_parser_prec oprec;
       enum tree_code ocode;
+      source_range src_range;
       if (parser->error)
 	goto out;
       switch (c_parser_peek_token (parser)->type)
@@ -6438,6 +6476,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
       switch (ocode)
 	{
 	case TRUTH_ANDIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6445,8 +6484,10 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_false_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	case TRUTH_ORIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6454,6 +6495,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_true_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	default:
 	  break;
@@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
 	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
       }
       ret.value = c_cast_expr (cast_loc, type_name, expr.value);
+      if (ret.value && expr.value)
+	set_c_expr_source_range (&ret,
+				 cast_loc,
+				 expr.get_finish ());
       ret.original_code = ERROR_MARK;
       ret.original_type = NULL;
       return ret;
@@ -6571,6 +6617,7 @@ c_parser_unary_expression (c_parser *parser)
   struct c_expr ret, op;
   location_t op_loc = c_parser_peek_token (parser)->location;
   location_t exp_loc;
+  location_t finish;
   ret.original_code = ERROR_MARK;
   ret.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
@@ -6610,8 +6657,10 @@ c_parser_unary_expression (c_parser *parser)
       c_parser_consume_token (parser);
       exp_loc = c_parser_peek_token (parser)->location;
       op = c_parser_cast_expression (parser, NULL);
+      finish = op.get_finish ();
       op = convert_lvalue_to_rvalue (exp_loc, op, true, true);
       ret.value = build_indirect_ref (op_loc, op.value, RO_UNARY_STAR);
+      set_c_expr_source_range (&ret, op_loc, finish);
       return ret;
     case CPP_PLUS:
       if (!c_dialect_objc () && !in_system_header_at (input_location))
@@ -6699,8 +6748,15 @@ static struct c_expr
 c_parser_sizeof_expression (c_parser *parser)
 {
   struct c_expr expr;
+  struct c_expr result;
   location_t expr_loc;
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_SIZEOF));
+
+  location_t start;
+  location_t finish = UNKNOWN_LOCATION;
+
+  start = c_parser_peek_token (parser)->location;
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   in_sizeof++;
@@ -6714,6 +6770,7 @@ c_parser_sizeof_expression (c_parser *parser)
       expr_loc = c_parser_peek_token (parser)->location;
       type_name = c_parser_type_name (parser);
       c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+      finish = parser->tokens_buf[0].location;
       if (type_name == NULL)
 	{
 	  struct c_expr ret;
@@ -6729,17 +6786,19 @@ c_parser_sizeof_expression (c_parser *parser)
 	  expr = c_parser_postfix_expression_after_paren_type (parser,
 							       type_name,
 							       expr_loc);
+	  finish = expr.get_finish ();
 	  goto sizeof_expr;
 	}
       /* sizeof ( type-name ).  */
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
-      return c_expr_sizeof_type (expr_loc, type_name);
+      result = c_expr_sizeof_type (expr_loc, type_name);
     }
   else
     {
       expr_loc = c_parser_peek_token (parser)->location;
       expr = c_parser_unary_expression (parser);
+      finish = expr.get_finish ();
     sizeof_expr:
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
@@ -6747,8 +6806,11 @@ c_parser_sizeof_expression (c_parser *parser)
       if (TREE_CODE (expr.value) == COMPONENT_REF
 	  && DECL_C_BIT_FIELD (TREE_OPERAND (expr.value, 1)))
 	error_at (expr_loc, "%<sizeof%> applied to a bit-field");
-      return c_expr_sizeof_expr (expr_loc, expr);
+      result = c_expr_sizeof_expr (expr_loc, expr);
     }
+  if (finish != UNKNOWN_LOCATION)
+    set_c_expr_source_range (&result, start, finish);
+  return result;
 }
 
 /* Parse an alignof expression.  */
@@ -7168,12 +7230,14 @@ c_parser_postfix_expression (c_parser *parser)
   struct c_expr expr, e1;
   struct c_type_name *t1, *t2;
   location_t loc = c_parser_peek_token (parser)->location;;
+  source_range tok_range = c_parser_peek_token (parser)->get_range ();
   expr.original_code = ERROR_MARK;
   expr.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
     {
     case CPP_NUMBER:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       loc = c_parser_peek_token (parser)->location;
       c_parser_consume_token (parser);
       if (TREE_CODE (expr.value) == FIXED_CST
@@ -7188,6 +7252,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_CHAR32:
     case CPP_WCHAR:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_STRING:
@@ -7196,6 +7261,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_WSTRING:
     case CPP_UTF8STRING:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       expr.original_code = STRING_CST;
       c_parser_consume_token (parser);
       break;
@@ -7203,6 +7269,7 @@ c_parser_postfix_expression (c_parser *parser)
       gcc_assert (c_dialect_objc ());
       expr.value
 	= objc_build_string_object (c_parser_peek_token (parser)->value);
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_NAME:
@@ -7216,6 +7283,7 @@ c_parser_postfix_expression (c_parser *parser)
 					     (c_parser_peek_token (parser)->type
 					      == CPP_OPEN_PAREN),
 					     &expr.original_type);
+	    set_c_expr_source_range (&expr, tok_range);
 	    break;
 	  }
 	case C_ID_CLASSNAME:
@@ -7304,6 +7372,7 @@ c_parser_postfix_expression (c_parser *parser)
       else
 	{
 	  /* A parenthesized expression.  */
+	  location_t loc_open_paren = c_parser_peek_token (parser)->location;
 	  c_parser_consume_token (parser);
 	  expr = c_parser_expression (parser);
 	  if (TREE_CODE (expr.value) == MODIFY_EXPR)
@@ -7311,6 +7380,8 @@ c_parser_postfix_expression (c_parser *parser)
 	  if (expr.original_code != C_MAYBE_CONST_EXPR)
 	    expr.original_code = ERROR_MARK;
 	  /* Don't change EXPR.ORIGINAL_TYPE.  */
+	  location_t loc_close_paren = c_parser_peek_token (parser)->location;
+	  set_c_expr_source_range (&expr, loc_open_paren, loc_close_paren);
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
 				     "expected %<)%>");
 	}
@@ -7901,6 +7972,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
   vec<tree, va_gc> *exprlist;
   vec<tree, va_gc> *origtypes = NULL;
   vec<location_t> arg_loc = vNULL;
+  location_t start;
+  location_t finish;
 
   while (true)
     {
@@ -7937,7 +8010,10 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 		{
 		  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
 					     "expected %<]%>");
+		  start = expr.get_start ();
+		  finish = parser->tokens_buf[0].location;
 		  expr.value = build_array_ref (op_loc, expr.value, idx);
+		  set_c_expr_source_range (&expr, start, finish);
 		}
 	    }
 	  expr.original_code = ERROR_MARK;
@@ -7980,9 +8056,13 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 			"%<memset%> used with constant zero length parameter; "
 			"this could be due to transposed parameters");
 
+	  start = expr.get_start ();
+	  finish = parser->tokens_buf[0].get_finish ();
 	  expr.value
 	    = c_build_function_call_vec (expr_loc, arg_loc, expr.value,
 					 exprlist, origtypes);
+	  set_c_expr_source_range (&expr, start, finish);
+
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) == INTEGER_CST
 	      && TREE_CODE (orig_expr.value) == FUNCTION_DECL
@@ -8011,8 +8091,11 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
               expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc, expr.value, ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8040,12 +8123,15 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc,
 					    build_indirect_ref (op_loc,
 								expr.value,
 								RO_ARROW),
 					    ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8061,6 +8147,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	  break;
 	case CPP_PLUS_PLUS:
 	  /* Postincrement.  */
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8072,11 +8160,14 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTINCREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
 	case CPP_MINUS_MINUS:
 	  /* Postdecrement.  */
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8088,6 +8179,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTDECREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 667529a..ffa0598 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -132,6 +132,17 @@ struct c_expr
      The type of an enum constant is a plain integer type, but this
      field will be the enum type.  */
   tree original_type;
+
+  /* The source range of this expression.  This is redundant
+     for node values that have locations, but not all node kinds
+     have locations (e.g. constants, and references to params, locals,
+     etc), so we stash a copy here.  */
+  source_range src_range;
+
+  /* Access to the first and last locations within the source spelling
+     of this expression.  */
+  location_t get_start () const { return src_range.m_start; }
+  location_t get_finish () const { return src_range.m_finish; }
 };
 
 /* Type alias for struct c_expr. This allows to use the structure
@@ -709,4 +720,12 @@ extern void pedwarn_c90 (location_t, int opt, const char *, ...)
 extern bool pedwarn_c99 (location_t, int opt, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
 
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish);
+
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range);
+
 #endif /* ! GCC_C_TREE_H */
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index df3245a..c2e16c6 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -3395,6 +3395,12 @@ parser_build_unary_op (location_t loc, enum tree_code code, struct c_expr arg)
     overflow_warning (loc, result.value);
     }
 
+  /* We are typically called when parsing a prefix token at LOC acting on
+     ARG.  Reflect this by updating the source range of the result to
+     start at LOC and end at the end of ARG.  */
+  set_c_expr_source_range (&result,
+			   loc, arg.get_finish ());
+
   return result;
 }
 
@@ -3432,6 +3438,10 @@ parser_build_binary_op (location_t location, enum tree_code code,
   if (location != UNKNOWN_LOCATION)
     protected_set_expr_location (result.value, location);
 
+  set_c_expr_source_range (&result,
+			   arg1.get_start (),
+			   arg2.get_finish ());
+
   /* Check for cases such as x+y<<z which users are likely
      to misinterpret.  */
   if (warn_parentheses)
diff --git a/gcc/gcc-rich-location.c b/gcc/gcc-rich-location.c
new file mode 100644
index 0000000..b0ec47b
--- /dev/null
+++ b/gcc/gcc-rich-location.c
@@ -0,0 +1,86 @@
+/* Implementation of gcc_rich_location class
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree-core.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+#include "pretty-print.h"
+#include "intl.h"
+#include "cpplib.h"
+#include "diagnostic.h"
+
+/* Extract any source range information from EXPR and write it
+   to *R.  */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR. */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* If T is an expression, add a range for it to the rich_location.  */
+
+void
+gcc_rich_location::maybe_add_expr (tree t)
+{
+  if (EXPR_P (t))
+    add_expr (t);
+}
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
new file mode 100644
index 0000000..c82cbf1
--- /dev/null
+++ b/gcc/gcc-rich-location.h
@@ -0,0 +1,47 @@
+/* Declarations relating to class gcc_rich_location
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_RICH_LOCATION_H
+#define GCC_RICH_LOCATION_H
+
+/* A gcc_rich_location is libcpp's rich_location with additional
+   helper methods for working with gcc's types.  */
+class gcc_rich_location : public rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  gcc_rich_location (source_location loc) :
+    rich_location (loc) {}
+
+  /* Constructing from a source_range.  */
+  gcc_rich_location (source_range src_range) :
+    rich_location (src_range) {}
+
+
+  /* Methods for adding ranges via gcc entities.  */
+  void
+  add_expr (tree expr);
+
+  void
+  maybe_add_expr (tree t);
+};
+
+#endif /* GCC_RICH_LOCATION_H */
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index ea50056..8b3794a 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -936,6 +936,27 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
       expanded_location xloc = expand_location (EXPR_LOCATION (node));
       indent_to (file, indent+4);
       fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column);
+
+      /* Print the range, if any */
+      source_range r = EXPR_LOCATION_RANGE (node);
+      if (r.m_start)
+	{
+	  xloc = expand_location (r.m_start);
+	  fprintf (file, " start: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " start: unknown");
+	}
+      if (r.m_finish)
+	{
+	  xloc = expand_location (r.m_finish);
+	  fprintf (file, " finish: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " finish: unknown");
+	}
     }
 
   fprintf (file, ">");
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
new file mode 100644
index 0000000..5485aaf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -0,0 +1,422 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests to verify that we're correctly
+   capturing the source code ranges of various kinds of expression.
+
+   It uses the various "diagnostic_test_*_expression_range_plugin"
+   plugins, which handle "__emit_expression_range" by generating a warning
+   at the given source range of the input argument.  Each of the
+   different plugins do this at a different phase of the internal
+   representation (tree, gimple, etc), so we can verify that the
+   source code range information is valid at each phase.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __emit_expression_range (int dummy, ...);
+
+int global;
+
+void test_parentheses (int a, int b)
+{
+  __emit_expression_range (0, (a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) );
+                               ~~~^~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, (a + b) * (a - b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) * (a - b) );
+                               ~~~~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, !(a && b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !(a && b) );
+                               ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Postfix expressions.  ************************************************/
+
+void test_array_reference (int *arr)
+{
+  __emit_expression_range (0, arr[100] ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, arr[100] );
+                               ~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_function_call (int p, int q, int r)
+{
+  __emit_expression_range (0, test_function_call (p, q, r) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, test_function_call (p, q, r) );
+                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+  return 0;
+}
+
+struct test_struct
+{
+  int field;
+};
+
+int test_structure_references (struct test_struct *ptr)
+{
+  struct test_struct local;
+  local.field = 42;
+
+  __emit_expression_range (0, local.field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, local.field );
+                               ~~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, ptr->field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ptr->field );
+                               ~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_postfix_incdec (int i)
+{
+  __emit_expression_range (0, i++ ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i++ );
+                               ~^~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, i-- ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i-- );
+                               ~^~
+   { dg-end-multiline-output "" } */
+}
+
+/* Unary operators.  ****************************************************/
+
+int test_prefix_incdec (int i)
+{
+  __emit_expression_range (0, ++i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ++i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, --i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, --i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_address_operator (void)
+{
+  __emit_expression_range (0, &global ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, &global );
+                               ^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_indirection (int *ptr)
+{
+  __emit_expression_range (0, *ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, *ptr );
+                               ^~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_unary_minus (int i)
+{
+  __emit_expression_range (0, -i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, -i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_ones_complement (int i)
+{
+  __emit_expression_range (0, ~i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ~i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_negation (int flag)
+{
+  __emit_expression_range (0, !flag ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !flag );
+                               ^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Casts.  ****************************************************/
+
+void test_cast (void *ptr)
+{
+  __emit_expression_range (0, (int *)ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (int *)ptr );
+                               ^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
+
+/* Binary operators.  *******************************************/
+
+void test_multiplicative_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs * rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs * rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs / rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs / rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs % rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs % rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_additive_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs + rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs + rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs - rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs - rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_shift_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs << rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs << rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >> rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >> rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_relational_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs < rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs < rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs > rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs > rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs <= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs <= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_equality_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs == rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs == rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs != rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs != rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_bitwise_binary_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs & rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs & rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs ^ rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs ^ rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs | rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs | rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs && rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs && rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs || rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs || rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Conditional operator.  *******************************************/
+
+void test_conditional_operators (int flag, int on_true, int on_false)
+{
+  __emit_expression_range (0, flag ? on_true : on_false ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, flag ? on_true : on_false );
+                               ~~~~~~~~~~~~~~~^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Assignment expressions.  *******************************************/
+
+void test_assignment_expressions (int dest, int other)
+{
+  __emit_expression_range (0, dest = other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest = other );
+                               ~~~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest *= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest *= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest /= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest /= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest %= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest %= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest += other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest += other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest -= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest -= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest <<= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest <<= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest >>= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest >>= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest &= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest &= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest ^= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest ^= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest |= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest |= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Comma operator.  *******************************************/
+
+void test_comma_operator (int a, int b)
+{
+  __emit_expression_range (0, (a++, a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a++, a + b) );
+                               ~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Examples of non-trivial expressions.  ****************************/
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __emit_expression_range (0, b * b - 4 * a * c ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, b * b - 4 * a * c );
+                               ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a)); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
new file mode 100644
index 0000000..46e97b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -0,0 +1,152 @@
+/* This plugin verifies the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+
+/*
+  Hack: fails with linker error:
+./diagnostic_plugin_test_tree_expression_range.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
+  since nothing in the tree is using gcc_rich_location::add_expr yet.
+
+  I've tried various workarounds (adding DEBUG_FUNCTION to the
+  method, taking its address), but can't seem to fix it that way.
+  So as a nasty workaround, the following material is copied&pasted
+  from gcc-rich-location.c: */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR. */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* FIXME: end of material taken from gcc-rich-location.c */
+
+
+int plugin_is_GPL_compatible;
+
+static void
+emit_warning (rich_location *richloc)
+{
+  if (richloc->get_num_locations () < 2)
+    {
+      error_at_rich_loc (richloc, "range not found");
+      return;
+    }
+
+  location_range *range = richloc->get_range (1);
+  warning_at_rich_loc (richloc, 0,
+		       "tree range %i:%i-%i:%i",
+		       range->m_start.line,
+		       range->m_start.column,
+		       range->m_finish.line,
+		       range->m_finish.column);
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__emit_expression_range"))
+    return NULL_TREE;
+
+  /* Get arg 1; print it! */
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+
+  gcc_rich_location richloc (EXPR_LOCATION (arg));
+  richloc.add_expr (arg);
+  emit_warning (&richloc);
+
+  return NULL_TREE;
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  tree fndecl = (tree)gcc_data;
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 941bccc..b7efcf5 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -66,6 +66,8 @@ set plugin_test_list [list \
     { diagnostic_plugin_test_show_locus.c \
 	  diagnostic-test-show-locus-bw.c \
 	  diagnostic-test-show-locus-color.c } \
+    { diagnostic_plugin_test_tree_expression_range.c \
+	  diagnostic-test-expressions-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/tree.c b/gcc/tree.c
index 426803c..a676352 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13660,4 +13660,26 @@ set_block (location_t loc, tree block)
   return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
 }
 
+void
+set_source_range (tree expr, location_t start, location_t finish)
+{
+  source_range src_range;
+  src_range.m_start = start;
+  src_range.m_finish = finish;
+  set_source_range (expr, src_range);
+}
+
+void
+set_source_range (tree expr, source_range src_range)
+{
+  if (!EXPR_P (expr))
+    return;
+
+  location_t adhoc = COMBINE_LOCATION_DATA (line_table,
+					    EXPR_LOCATION (expr),
+					    src_range,
+					    NULL);
+  SET_EXPR_LOCATION (expr, adhoc);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 92cc929..7d20f74 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1070,10 +1070,25 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define EXPR_FILENAME(NODE) LOCATION_FILE (EXPR_CHECK ((NODE))->exp.locus)
 #define EXPR_LINENO(NODE) LOCATION_LINE (EXPR_CHECK (NODE)->exp.locus)
 
+#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))
+#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK ((NODE))))
+
+#define EXPR_HAS_RANGE(NODE) \
+    (CAN_HAVE_RANGE_P (NODE) \
+     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
+     : false)
+
 /* True if a tree is an expression or statement that can have a
    location.  */
 #define CAN_HAVE_LOCATION_P(NODE) ((NODE) && EXPR_P (NODE))
 
+static inline source_range
+get_expr_source_range (tree expr)
+{
+  location_t loc = EXPR_LOCATION (expr);
+  return get_range_from_loc (line_table, loc);
+}
+
 extern void protected_set_expr_location (tree, location_t);
 
 /* In a TARGET_EXPR node.  */
@@ -2098,6 +2113,9 @@ extern machine_mode element_mode (const_tree t);
 #define DECL_IS_BUILTIN(DECL) \
   (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)
 
+#define DECL_LOCATION_RANGE(NODE) \
+  (get_decl_source_range (DECL_MINIMAL_CHECK (NODE)))
+
 /*  For FIELD_DECLs, this is the RECORD_TYPE, UNION_TYPE, or
     QUAL_UNION_TYPE node that the field is a member of.  For VAR_DECL,
     PARM_DECL, FUNCTION_DECL, LABEL_DECL, RESULT_DECL, and CONST_DECL
@@ -5148,4 +5166,17 @@ extern void gt_pch_nx (tree &, gt_pointer_operator, void *);
 
 extern bool nonnull_arg_p (const_tree);
 
+extern void
+set_source_range (tree expr, location_t start, location_t finish);
+
+extern void
+set_source_range (tree expr, source_range src_range);
+
+static inline source_range
+get_decl_source_range (tree decl)
+{
+  location_t loc = DECL_SOURCE_LOCATION (decl);
+  return get_range_from_loc (line_table, loc);
+}
+
 #endif  /* GCC_TREE_H  */
-- 
1.8.5.3

* [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
  2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
@ 2015-10-23 20:24   ` David Malcolm
  2015-10-23 21:10     ` Jeff Law
  2015-10-23 20:24   ` [PATCH 01/10] Improvements to description of source_location in line-map.h David Malcolm
                     ` (8 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:24 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

Later on in this patch kit, with token range underlining, the
libstdc++-v3 testsuite starts showing numerous failures of the form:

  FAIL: 17_intro/using_namespace_std_tr1_neg.cc (test for excess errors)

The excess errors turn out to be the source code and
caret/underlines emitted after an "error":

  using namespace std::tr1;  // { dg-error "is not a namespace-name" }
                  ^~~

However, looking at the results of a control build of r228618, I see
the testsuite emit code and carets (albeit without underlines):

  using namespace std::tr1;  // { dg-error "is not a namespace-name" }
                  ^

and for some reason this is treated by dg.exp as:

  PASS: 17_intro/using_namespace_std_tr1_neg.cc (test for excess errors)

It's not clear to me why the status quo isn't treating the lines of
dumped source code and caret as "excess errors", but the attached
patch explicitly disables carets and colorization.

libstdc++-v3/ChangeLog:
	* testsuite/lib/libstdc++.exp (v3_target_compile): Add
	-fno-diagnostics-show-caret -fdiagnostics-color=never to
	option's additional_flags.
---
 libstdc++-v3/testsuite/lib/libstdc++.exp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 88738b7..ac3654b 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -462,6 +462,8 @@ proc v3_target_compile { source dest type options } {
     global STATIC_LIBCXXFLAGS
     global tool
 
+    lappend options "additional_flags=-fno-diagnostics-show-caret -fdiagnostics-color=never"
+
     if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
         lappend options "libs=${gluefile}"
         lappend options "ldflags=${wrap_flags}"
-- 
1.8.5.3

* [PATCH 00/10] Overhaul of diagnostics (v5)
  2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
                   ` (5 preceding siblings ...)
  2015-09-23 13:36 ` [PATCH 0/5] RFC: Overhaul of diagnostics (v2) Michael Matz
@ 2015-10-23 20:25 ` David Malcolm
  2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
                     ` (10 more replies)
  6 siblings, 11 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This is a followup to:
  https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01696.html
(one of the individual patches has been revised since then, so
I'm calling the whole thing "v5" for the sake of clarity).


Patches 1-3 are a preamble:
  "Improvements to description of source_location in line-map.h"
  "Add stats on adhoc table to dump_line_table_statistics"
  "libstdc++v3: Explicitly disable carets and colorization within
    testsuite"

Patch 4:
 "Reimplement diagnostic_show_locus, introducing rich_location classes (v5)"
is an updated version of the rewrite of diagnostic_show_locus,
via the new rich_location class.  I believe this one is ready for trunk
and could be applied without needing the followup patches; I
have a followup patch that adds support for "fix it hints" on top
of this (PR/62314).
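
To illustrate the kind of output this produces, here is a minimal sketch
of how a single annotation line could be built from a range and a caret.
It is not the code in diagnostic-show-locus.c; render_underline is a
hypothetical helper, and it assumes 1-based columns and a range that
sits on one source line:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not GCC's implementation: build the annotation
// line for one range on one source line.  Columns are 1-based.  The
// caret column gets '^', every other column in the range gets '~'.
std::string
render_underline (int start_col, int caret_col, int finish_col)
{
  assert (1 <= start_col
	  && start_col <= caret_col
	  && caret_col <= finish_col);

  // Pad with spaces up to the first column of the range.
  std::string line (start_col - 1, ' ');
  for (int col = start_col; col <= finish_col; col++)
    line += (col == caret_col) ? '^' : '~';
  return line;
}
```

For a range covering columns 31-37 with the caret at column 34, this
yields 30 spaces followed by "~~~^~~~".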

Patch 5:
  "Add ranges to libcpp tokens (via ad-hoc data, unoptimized)"
implements token range tracking by adding range information to
the ad-hoc location table.  As noted in the patch, this generalizes
source_location (aka location_t) to be both a caret and a range,
letting us track them through our existing location-tracking
mechanisms, without having to add extra fields to core data structures.
The drawback is that it's inefficient. This is addressed by patch 10,
which implements a packing scheme to avoid the ad-hoc table for most
tokens.
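
As a rough model of the idea (a simplified sketch, not the libcpp
implementation), a location value can either be a plain position or an
index into a side table holding a caret plus a range.  All of the names
below (loc_t, src_range, adhoc_table, ADHOC_BIT) are hypothetical, and
treating a plain location as a one-point range is an assumption made for
the sketch:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef uint32_t loc_t;		// stands in for source_location

struct src_range { loc_t start; loc_t finish; };
struct adhoc_entry { loc_t caret; src_range range; };

// Simplified model: values with the top bit set are indices into a
// side table; all other values are plain locations.
class adhoc_table
{
public:
  static bool is_adhoc (loc_t loc) { return (loc & ADHOC_BIT) != 0; }

  // Store CARET and RANGE in the table; return a value that refers to
  // the new entry.
  loc_t combine (loc_t caret, src_range range)
  {
    assert (!is_adhoc (caret));
    m_entries.push_back (adhoc_entry {caret, range});
    return ADHOC_BIT | (loc_t) (m_entries.size () - 1);
  }

  loc_t get_caret (loc_t loc) const
  {
    return is_adhoc (loc) ? m_entries[loc & ~ADHOC_BIT].caret : loc;
  }

  // Assumption of this sketch: a plain location is its own range.
  src_range get_range (loc_t loc) const
  {
    if (is_adhoc (loc))
      return m_entries[loc & ~ADHOC_BIT].range;
    src_range r = {loc, loc};
    return r;
  }

private:
  static constexpr loc_t ADHOC_BIT = 0x80000000u;
  std::vector<adhoc_entry> m_entries;
};
```

Every token with a range costs a table entry in this model, which is
the inefficiency mentioned above.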

Patch 6:
  "Track expression ranges in C frontend"
is an updated version of the patch to add tracking of expression
ranges to the C frontend, using the above mechanism.

Patch 7:
  "Add plugin to recursively dump the source-ranges in a tree (v2)"
is the test plugin to demo dumping the ranges for all
sub-expressions of a complicated expression.  It's unchanged since
previous versions.

Patch 8:
  "Wire things up so that libcpp users get token underlines"
wires up the work from patches 4 and 5 so that most diagnostics
in frontends using libcpp will see some kind of underlining, for tokens
at least.

Patch 9:
  "Delay some resolution of ad-hoc locations, preserving ranges"
tweaks things to provide underlines for some places that patch 8
missed.

Patch 10:
  "Compress short ranges into source_location"
is the bit-packing optimization for patch 5.
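
A minimal sketch of the general idea, under assumptions that are not
taken from the patch itself: when the caret coincides with the start of
the range and the range is short, the length can live in a few low bits
of the value, so no table entry is needed.  RANGE_BITS, try_pack,
packed_start and packed_finish are hypothetical names, and the bit
layout is illustrative only:

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t loc_t;		// stands in for source_location

// Assumption: number of low bits reserved for the range length.
static const unsigned RANGE_BITS = 5;

// Try to pack a range into one value.  Succeeds only when the caret is
// at the start, the length fits in RANGE_BITS, and the shifted start
// leaves the top bit clear (kept free to flag table-based values).
bool
try_pack (loc_t start, loc_t caret, loc_t finish, loc_t *out)
{
  if (caret != start || finish < start)
    return false;
  loc_t len = finish - start;
  if (len >= (1u << RANGE_BITS))
    return false;
  if (start >= (1u << (31 - RANGE_BITS)))
    return false;
  *out = (start << RANGE_BITS) | len;
  return true;
}

loc_t
packed_start (loc_t packed)
{
  return packed >> RANGE_BITS;
}

loc_t
packed_finish (loc_t packed)
{
  return packed_start (packed) + (packed & ((1u << RANGE_BITS) - 1));
}
```

Anything that fails try_pack would fall back to the side table in this
model.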

Successfully bootstrapped&regrtested the net effect of the kit on
x86_64-pc-linux-gnu (with 186 new PASS results for gcc.sum).

Some benchmarks can be seen in this post:
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html

Are patches 1-4 OK for trunk?  (assuming they individually
bootstrap&regrtest?)

How do patches 5-10 look?  I'm about to do some more benchmarking.

(The patches are relative to r228618 plus the dg-begin-multiline-output
patch).

Dave

 gcc/Makefile.in                                    |   1 +
 gcc/ada/gcc-interface/trans.c                      |   3 +-
 gcc/c-family/c-common.c                            |  25 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c-family/c-opts.c                              |   2 +
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/c/c-parser.c                                   |  98 ++-
 gcc/c/c-tree.h                                     |  19 +
 gcc/c/c-typeck.c                                   |  10 +
 gcc/cp/error.c                                     |   5 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic-show-locus.c                        | 778 ++++++++++++++++++---
 gcc/diagnostic.c                                   | 196 +++++-
 gcc/diagnostic.h                                   |  55 +-
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                | 103 +--
 gcc/gcc-rich-location.c                            |  86 +++
 gcc/gcc-rich-location.h                            |  47 ++
 gcc/genmatch.c                                     |  27 +-
 gcc/gimple.h                                       |   6 +-
 gcc/input.c                                        |  41 +-
 gcc/pretty-print.c                                 |  21 +
 gcc/pretty-print.h                                 |  25 +-
 gcc/print-tree.c                                   |  21 +
 gcc/rtl-error.c                                    |   3 +-
 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c     | 120 ++++
 .../gcc.dg/diagnostic-tree-expr-ranges-2.c         |  23 +
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 422 +++++++++++
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 149 ++++
 .../plugin/diagnostic-test-show-locus-color.c      | 158 +++++
 .../gcc.dg/plugin/diagnostic-test-show-trees-1.c   |  65 ++
 .../gcc.dg/plugin/diagnostic_plugin_show_trees.c   | 174 +++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 322 +++++++++
 .../diagnostic_plugin_test_tree_expression_range.c |  98 +++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   7 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/toplev.c                                       |   1 +
 gcc/tree-cfg.c                                     |   9 +-
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-inline.c                                  |   5 +-
 gcc/tree-pretty-print.c                            |   2 +-
 gcc/tree.c                                         |  54 +-
 gcc/tree.h                                         |  34 +
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   7 +-
 libcpp/include/line-map.h                          | 354 +++++++++-
 libcpp/lex.c                                       |  13 +
 libcpp/line-map.c                                  | 415 ++++++++++-
 libcpp/location-example.txt                        | 188 ++---
 libstdc++-v3/testsuite/lib/libstdc++.exp           |   2 +
 53 files changed, 3772 insertions(+), 479 deletions(-)
 create mode 100644 gcc/gcc-rich-location.c
 create mode 100644 gcc/gcc-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c

-- 
1.8.5.3

* [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5)
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (2 preceding siblings ...)
  2015-10-23 20:24   ` [PATCH 01/10] Improvements to description of source_location in line-map.h David Malcolm
@ 2015-10-23 20:25   ` David Malcolm
  2015-10-27 23:12     ` Jeff Law
  2015-10-23 20:26   ` [PATCH 08/10] Wire things up so that libcpp users get token underlines David Malcolm
                     ` (6 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:25 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

The change since v4 can be seen at:
 https://dmalcolm.fedorapeople.org/gcc/2015-10-23/0001-Add-colorize_source_p-to-diagnostic_context.patch
It tweaks colorization to handle both frontends that provide ranges and
those that only provide carets, giving the latter a smoother transition
path.
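
For readers following the colorization side: the colorizer class added
in diagnostic-show-locus.c tracks which state is active and only emits
color codes when the state changes, so that layout::print_line can
request a state for every character it prints.  Below is a minimal
standalone sketch of that filtering idea.  It is not the code from the
patch: state_filter, put and the "<N>" / "</>" markers are hypothetical
stand-ins for the pretty_printer and the real SGR escape sequences.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for "class colorizer": records output in a
// string instead of writing to a pretty_printer.
class state_filter
{
 public:
  enum { NORMAL_TEXT = -1 };

  // Request STATE for the next character.  Markers are only emitted
  // when the requested state differs from the current one.
  void set_state (int new_state)
  {
    if (new_state == m_current)
      return;
    finish (m_current);
    m_current = new_state;
    begin (new_state);
  }

  // Append one character of "source text" in the current state.
  void put (char ch) { m_out += ch; }

  const std::string &output () const { return m_out; }

 private:
  // Emit an opening marker for a range; normal text needs none.
  void begin (int state)
  {
    if (state == NORMAL_TEXT)
      return;
    assert (state >= 0);
    m_out += "<" + std::to_string (state) + ">";
  }

  // Emit a closing marker for a range; normal text needs none.
  void finish (int state)
  {
    if (state == NORMAL_TEXT)
      return;
    assert (state >= 0);
    m_out += "</>";
  }

  int m_current = NORMAL_TEXT;
  std::string m_out;
};
```

Requesting the same state twice in a row adds nothing to the output,
which is what lets a caller set the state unconditionally per character.
Unlike the patch's colorizer, this sketch has no destructor to close a
state that is still open.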

gcc/ChangeLog:
	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
	and "range2".
	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
	* diagnostic-core.h (warning_at_rich_loc): New declaration.
	(error_at_rich_loc): New declaration.
	(permerror_at_rich_loc): New declaration.
	(inform_at_rich_loc): New declaration.
	* diagnostic-show-locus.c (adjust_line): Delete.
	(struct point_state): New struct.
	(class colorizer): New class.
	(class layout_point): New class.
	(class layout_range): New class.
	(class layout): New class.
	(colorizer::colorizer): New ctor.
	(colorizer::~colorizer): New dtor.
	(layout::layout): New ctor.
	(layout::print_line): New method.
	(layout::get_state_at_point): New method.
	(layout::get_x_bound_for_row): New method.
	(show_ruler): New function.
	(diagnostic_show_locus): Reimplement in terms of class layout.
	* diagnostic.c (diagnostic_initialize): Replace
	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
	(diagnostic_set_info_translated): Convert param from location_t
	to rich_location *.  Eliminate calls to set_location on the
	message in favor of storing the rich_location ptr there.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_build_prefix): Break out array into...
	(diagnostic_kind_color): New variable.
	(diagnostic_get_color_for_kind): New function.
	(diagnostic_report_diagnostic): Colorize the option_text
	using the color for the severity.
	(diagnostic_append_note): Update for change in signature of
	diagnostic_set_info.
	(diagnostic_append_note_at_rich_loc): New function.
	(emit_diagnostic): Update for change in signature of
	diagnostic_set_info.
	(inform): Likewise.
	(inform_at_rich_loc): New function.
	(inform_n): Update for change in signature of diagnostic_set_info.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_at_rich_loc): New function.
	(warning_n): Update for change in signature of diagnostic_set_info.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(permerror_at_rich_loc): New function.
	(error): Update for change in signature of diagnostic_set_info.
	(error_n): Likewise.
	(error_at): Likewise.
	(error_at_rich_loc): New function.
	(sorry): Update for change in signature of diagnostic_set_info.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(source_range::debug): New function.
	* diagnostic.h (struct diagnostic_info): Eliminate field
	"override_column".  Add field "richloc".
	(struct diagnostic_context): Add field "colorize_source_p".
	(diagnostic_override_column): Delete.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_set_info_translated): Likewise.
	(diagnostic_append_note_at_rich_loc): New function.
	(diagnostic_num_locations): New function.
	(diagnostic_expand_location): Get the location from the
	rich_location.
	(diagnostic_print_caret_line): Delete.
	(diagnostic_get_color_for_kind): New declaration.
	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
	(error_cb): Update for change in signature of "error" callback.
	(fatal_at): Likewise.
	(warning_at): Likewise.
	* input.c (linemap_client_expand_location_to_spelling_point): New.
	* pretty-print.c (text_info::set_range): New method.
	(text_info::get_location): New method.
	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
	(struct text_info): Eliminate "locations" array in favor of
	"m_richloc", a rich_location *.
	(textinfo::set_location): Add a "caret_p" param, and reimplement
	in terms of a call to set_range.
	(textinfo::get_location): Eliminate inline implementation in favor of
	an out-of-line reimplementation.
	(textinfo::set_range): New method.
	* rtl-error.c (diagnostic_for_asm): Update for change in signature
	of diagnostic_set_info.
	* tree-diagnostic.c (default_tree_printer): Update for new
	"caret_p" param for textinfo::set_location.
	* tree-pretty-print.c (percent_K_format): Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter and
	the call to diagnostic_override_column.
	Update the "done_lexing" clause to set range 0
	on the rich_location, rather than overwriting a location_t.
	* c-common.h (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Update for change in signature
	of diagnostic_set_info.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.
	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
	for textinfo::set_location.

gcc/cp/ChangeLog:
	* error.c (cp_printer): Update for new "caret_p" param for
	textinfo::set_location.
	(pedwarn_cxx98): Update for change in signature of
	diagnostic_set_info.

gcc/fortran/ChangeLog:
	* cpp.c (cb_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.
	* error.c (gfc_warning): Update for change in signature of
	diagnostic_set_info.
	(gfc_format_decoder): Update handling of %C/%L for changes
	to struct text_info.
	(gfc_diagnostic_starter): Use richloc when determining whether to
	print one locus or two.  When handling a location that will
	involve a call to diagnostic_show_locus, only attempt to print the
	locus for the primary location, and don't call into
	diagnostic_print_caret_line.
	(gfc_warning_now_at): Update for change in signature of
	diagnostic_set_info.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
	* lib/gcc-dg.exp: Load multiline.exp.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Update for change in signature
	of "error" callback.
	(cpp_diagnostic_with_line): Likewise, calling override_column
	on the rich_location.
	* include/cpplib.h (struct cpp_callbacks): Within "error"
	callback, convert param from source_location to rich_location *,
	and drop column_override param.
	* include/line-map.h (struct source_range): New struct.
	(struct location_range): New struct.
	(class rich_location): New class.
	(linemap_client_expand_location_to_spelling_point): New declaration.
	* line-map.c (rich_location::rich_location): New ctors.
	(rich_location::lazily_expand_location): New method.
	(rich_location::override_column): New method.
	(rich_location::add_range): New methods.
	(rich_location::set_range): New method.
---
 gcc/c-family/c-common.c                            |  15 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/cp/error.c                                     |   5 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic-show-locus.c                        | 778 ++++++++++++++++++---
 gcc/diagnostic.c                                   | 196 +++++-
 gcc/diagnostic.h                                   |  55 +-
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                | 103 +--
 gcc/genmatch.c                                     |  27 +-
 gcc/input.c                                        |   7 +
 gcc/pretty-print.c                                 |  21 +
 gcc/pretty-print.h                                 |  25 +-
 gcc/rtl-error.c                                    |   3 +-
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 149 ++++
 .../plugin/diagnostic-test-show-locus-color.c      | 158 +++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 326 +++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   3 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-pretty-print.c                            |   2 +-
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   4 +-
 libcpp/include/line-map.h                          | 207 ++++++
 libcpp/line-map.c                                  | 130 ++++
 29 files changed, 1983 insertions(+), 288 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
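
A note on the closed-range test used by layout_range::contains_point in
diagnostic-show-locus.c: both the start and the finish of a range are
within it, and for a multiline range the finish column may be smaller
than the start column.  Below is a condensed standalone sketch that is
meant to behave the same way as the method in the patch.  The names
point and closed_range are hypothetical stand-ins for layout_point and
layout_range, and the file filtering done elsewhere is assumed to have
happened already.

```cpp
#include <cassert>

// Hypothetical stand-in for layout_point.
struct point
{
  int line;
  int column;
};

// Hypothetical stand-in for layout_range: a closed range of source
// positions, where both START and FINISH are inside the range.
struct closed_range
{
  point start;
  point finish;

  bool contains_point (int row, int column) const
  {
    // Lines must be ordered; columns need not be (multiline ranges).
    assert (start.line <= finish.line);

    // Rows entirely before or after the range.
    if (row < start.line || row > finish.line)
      return false;

    // On the first line: reject columns before the start.
    if (row == start.line && column < start.column)
      return false;

    // On the last line: reject columns after the finish.
    if (row == finish.line && column > finish.column)
      return false;

    // Either strictly between the first and last lines, or within
    // the column bounds on a boundary line.
    return true;
  }
};
```

With the values from example A in the comment (start at line 2 column
22, finish at line 2 column 38), columns 22 and 38 are inside and
columns 21 and 39 are outside.  With example B (start at line 3 column
14, finish at line 5 column 8), every column of line 4 is inside.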

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b64a44..4a5ccb7 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10477,15 +10477,14 @@ c_option_controlling_cpp_error (int reason)
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION unless this is after lexing and the compiler's location
-   should be used instead, with column number possibly overridden by
-   COLUMN_OVERRIDE if not zero; MSG is the translated message and AP
+   RICHLOC unless this is after lexing and the compiler's location
+   should be used instead; MSG is the translated message and AP
    the arguments.  Returns true if a diagnostic was emitted, false
    otherwise.  */
 
 bool
 c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	     location_t location, unsigned int column_override,
+	     rich_location *richloc,
 	     const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -10526,11 +10525,11 @@ c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   if (done_lexing)
-    location = input_location;
+    richloc->set_range (0,
+			source_range::from_location (input_location),
+			true, true);
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   diagnostic_override_option_index (&diagnostic,
                                     c_option_controlling_cpp_error (reason));
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index d5fb499..b0a7661 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -995,9 +995,9 @@ extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);
 extern void c_cpp_builtins_optimize_pragma (cpp_reader *, tree, tree);
-extern bool c_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+extern bool c_cpp_error (cpp_reader *, int, int, rich_location *,
 			 const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 extern int c_common_has_attribute (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index ce8406a..732080a 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5297,9 +5297,10 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
                        flag_isoc99 ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   report_diagnostic (&diagnostic);
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index e5fbf05..0f8b933 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,13 +42,14 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
      than -pedantic.  */
   if (warn_c99_c11_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc11)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc99_c11_compat;
@@ -60,7 +61,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C11, issue a pedwarn.  */
   else if (pedantic && !flag_isoc11)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       warned = report_diagnostic (&diagnostic);
     }
@@ -80,6 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
@@ -90,7 +92,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
         goto out;
       else if (opt_var > 0)
 	{
-	  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+	  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			       (pedantic && !flag_isoc99)
 			       ? DK_PEDWARN : DK_WARNING);
 	  diagnostic.option_index = opt;
@@ -102,7 +104,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
      specific than -pedantic.  */
   if (warn_c90_c99_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc99)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc90_c99_compat;
@@ -114,7 +116,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C99, issue a pedwarn.  */
   else if (pedantic && !flag_isoc99)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       report_diagnostic (&diagnostic);
     }
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 47fd7de..1e601f9 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -101,7 +101,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     {
       t = va_arg (*text->args_ptr, tree);
       if (set_locus)
-	text->set_location (0, DECL_SOURCE_LOCATION (t));
+	text->set_location (0, DECL_SOURCE_LOCATION (t), true);
     }
 
   switch (*spec)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 17870b5..2e2ff10 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3562,7 +3562,7 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 
   pp_string (pp, result);
   if (set_locus && t != NULL)
-    text->set_location (0, location_of (t));
+    text->set_location (0, location_of (t), true);
   return true;
 #undef next_tree
 #undef next_tcode
@@ -3676,9 +3676,10 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 		       (cxx_dialect == cxx98) ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index 3fe49b2..d848dfc 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -164,7 +164,8 @@ static struct color_cap color_dict[] =
   { "warning", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_MAGENTA),
 	       7, false },
   { "note", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_CYAN), 4, false },
-  { "caret", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "range1", SGR_SEQ (COLOR_FG_GREEN), 6, false },
+  { "range2", SGR_SEQ (COLOR_FG_BLUE), 6, false },
   { "locus", SGR_SEQ (COLOR_BOLD), 5, false },
   { "quote", SGR_SEQ (COLOR_BOLD), 5, false },
   { NULL, NULL, 0, false }
@@ -195,7 +196,7 @@ colorize_stop (bool show_color)
 }
 
 /* Parse GCC_COLORS.  The default would look like:
-   GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
+   GCC_COLORS='error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:quote=01'
    No character escaping is needed or supported.  */
 static bool
 parse_gcc_colors (void)
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 66d2e42..a8a7c37 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -63,18 +63,26 @@ extern bool warning_n (location_t, int, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
+extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+    ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void error_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void error_at_rich_loc (rich_location *, const char *, ...)
+  ATTRIBUTE_GCC_DIAG(2,3);
 extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
      ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the second parameter.  */
 extern bool pedwarn (location_t, int, const char *, ...)
      ATTRIBUTE_GCC_DIAG(3,4);
 extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern bool permerror_at_rich_loc (rich_location *, const char *,
+				   ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform_at_rich_loc (rich_location *, const char *,
+				...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 147a2b8..6865209 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,131 +36,711 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
-/* If LINE is longer than MAX_WIDTH, and COLUMN is not smaller than
-   MAX_WIDTH by some margin, then adjust the start of the line such
-   that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
-   margin is either CARET_LINE_MARGIN characters or the difference
-   between the column and the length of the line, whatever is smaller.
-   The length of LINE is given by LINE_WIDTH.  */
-static const char *
-adjust_line (const char *line, int line_width,
-	     int max_width, int *column_p)
-{
-  int right_margin = CARET_LINE_MARGIN;
-  int column = *column_p;
-
-  gcc_checking_assert (line_width >= column);
-  right_margin = MIN (line_width - column, right_margin);
-  right_margin = max_width - right_margin;
-  if (line_width >= max_width && column > right_margin)
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset);
+
+/* Classes for rendering source code and diagnostics, within an
+   anonymous namespace.
+   The work is done by "class layout", which embeds and uses
+   "class colorizer" and "class layout_range" to get things done.  */
+
+namespace {
+
+/* The state at a given point of the source code, assuming that we're
+   in a range: which range are we in, and whether we should draw a caret at
+   this point.  */
+
+struct point_state
+{
+  int range_idx;
+  bool draw_caret_p;
+};
+
+/* A class to inject colorization codes when printing the diagnostic locus.
+
+   It has one kind of colorization for each of:
+     - normal text
+     - range 0 (the "primary location")
+     - range 1
+     - range 2
+
+   The class caches the lookup of the color codes for the above.
+
+   The class also has responsibility for tracking which of the above is
+   active, filtering out unnecessary changes.  This allows layout::print_line
+   to simply request a colorization code for *every* character it prints
+   through this class, and have the filtering be done for it here.  */
+
+class colorizer
+{
+ public:
+  colorizer (diagnostic_context *context,
+	     const diagnostic_info *diagnostic);
+  ~colorizer ();
+
+  void set_range (int range_idx) { set_state (range_idx); }
+  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }
+
+ private:
+  void set_state (int state);
+  void begin_state (int state);
+  void finish_state (int state);
+
+ private:
+  static const int STATE_NORMAL_TEXT = -1;
+
+  diagnostic_context *m_context;
+  const diagnostic_info *m_diagnostic;
+  int m_current_state;
+  const char *m_caret_cs;
+  const char *m_caret_ce;
+  const char *m_range1_cs;
+  const char *m_range2_cs;
+  const char *m_range_ce;
+};
+
+/* A point within a layout_range; similar to an expanded_location,
+   but after filtering on file.  */
+
+class layout_point
+{
+ public:
+  layout_point (const expanded_location &exploc)
+  : m_line (exploc.line),
+    m_column (exploc.column) {}
+
+  int m_line;
+  int m_column;
+};
+
+/* A class for use by "class layout" below: a filtered location_range.  */
+
+class layout_range
+{
+ public:
+  layout_range (const location_range *loc_range);
+
+  bool contains_point (int row, int column) const;
+
+  layout_point m_start;
+  layout_point m_finish;
+  bool m_show_caret_p;
+  layout_point m_caret;
+};
+
+/* A class to control the overall layout when printing a diagnostic.
+
+   The layout is determined within the constructor.
+   It is then printed by repeatedly calling the "print_line" method.
+   Each such call can print two lines: one for the source line itself,
+   and potentially an "annotation" line, containing carets/underlines.
+
+   We assume we have disjoint ranges.  */
+
+class layout
+{
+ public:
+  layout (diagnostic_context *context,
+	  const diagnostic_info *diagnostic);
+
+  int get_first_line () const { return m_first_line; }
+  int get_last_line () const { return m_last_line; }
+
+  void print_line (int row);
+
+ private:
+  bool
+  get_state_at_point (/* Inputs.  */
+		      int row, int column,
+		      int first_non_ws, int last_non_ws,
+		      /* Outputs.  */
+		      point_state *out_state);
+
+  int
+  get_x_bound_for_row (int row, int caret_column,
+		       int last_non_ws);
+
+ private:
+  diagnostic_context *m_context;
+  pretty_printer *m_pp;
+  diagnostic_t m_diagnostic_kind;
+  expanded_location m_exploc;
+  colorizer m_colorizer;
+  bool m_colorize_source_p;
+  auto_vec <layout_range> m_layout_ranges;
+  int m_first_line;
+  int m_last_line;
+  int m_x_offset;
+};
+
+/* Implementation of "class colorizer".  */
+
+/* The constructor for "colorizer".  Look up and store color codes for the
+   different kinds of things we might need to print.  */
+
+colorizer::colorizer (diagnostic_context *context,
+		      const diagnostic_info *diagnostic) :
+  m_context (context),
+  m_diagnostic (diagnostic),
+  m_current_state (STATE_NORMAL_TEXT)
+{
+  m_caret_ce = colorize_stop (pp_show_color (context->printer));
+  m_range1_cs = colorize_start (pp_show_color (context->printer), "range1");
+  m_range2_cs = colorize_start (pp_show_color (context->printer), "range2");
+  m_range_ce = colorize_stop (pp_show_color (context->printer));
+}
+
+/* The destructor for "colorizer".  If colorization is on, print a code to
+   turn it off.  */
+
+colorizer::~colorizer ()
+{
+  finish_state (m_current_state);
+}
+
+/* Update the current state, printing color codes only if the state
+   changes.  */
+
+void
+colorizer::set_state (int new_state)
+{
+  if (m_current_state != new_state)
     {
-      line += column - right_margin;
-      *column_p = right_margin;
+      finish_state (m_current_state);
+      m_current_state = new_state;
+      begin_state (new_state);
     }
-  return line;
 }
 
-/* Print the physical source line corresponding to the location of
-   this diagnostic, and a caret indicating the precise column.  This
-   function only prints two caret characters if the two locations
-   given by DIAGNOSTIC are on the same line according to
-   diagnostic_same_line().  */
+/* Turn on any colorization for STATE.  */
+
 void
-diagnostic_show_locus (diagnostic_context * context,
-		       const diagnostic_info *diagnostic)
+colorizer::begin_state (int state)
 {
-  if (!context->show_caret
-      || diagnostic_location (diagnostic, 0) <= BUILTINS_LOCATION
-      || diagnostic_location (diagnostic, 0) == context->last_location)
-    return;
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
 
-  context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by diagnostic_print_caret_line.  */
+    case 0:
+      /* Make range 0 be the same color as the "kind" text
+	 (error vs warning vs note).  */
+      pp_string
+	(m_context->printer,
+	 colorize_start (pp_show_color (m_context->printer),
+			 diagnostic_get_color_for_kind (m_diagnostic->kind)));
+      break;
+
+    case 1:
+      pp_string (m_context->printer, m_range1_cs);
+      break;
+
+    case 2:
+      pp_string (m_context->printer, m_range2_cs);
+      break;
+
+    default:
+      /* We don't expect more than 3 ranges per diagnostic.  */
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Turn off any colorization for STATE.  */
+
+void
+colorizer::finish_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      pp_string (m_context->printer, m_caret_ce);
+      break;
+
+    default:
+      /* Within a range.  */
+      gcc_assert (state > 0);
+      pp_string (m_context->printer, m_range_ce);
+      break;
+    }
+}
+
+/* Implementation of class layout_range.  */
+
+/* The constructor for class layout_range.
+   Initialize various layout_point fields from expanded_location
+   equivalents; we've already filtered on file.  */
+
+layout_range::layout_range (const location_range *loc_range)
+: m_start (loc_range->m_start),
+  m_finish (loc_range->m_finish),
+  m_show_caret_p (loc_range->m_show_caret_p),
+  m_caret (loc_range->m_caret)
+{
+}
+
+/* Is (column, row) within the given range?
+   We've already filtered on the file.
+
+   Ranges are closed (both limits are within the range).
+
+   Example A: a single-line range:
+     start:  (col=22, line=2)
+     finish: (col=38, line=2)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbSwwwwwwwwwwwwwwwFaaaaaaaaaaa
+03|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+   Example B: a multiline range with
+     start:  (col=14, line=3)
+     finish: (col=08, line=5)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+03|bbbbbbbbbbbSwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+04|wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+05|wwwwwFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+06|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+--+-----------------------------------------------
+
+   Legend:
+   - 'b' indicates a point *before* the range
+   - 'S' indicates the start of the range
+   - 'w' indicates a point within the range
+   - 'F' indicates the finish of the range (which is
+	 within it).
+   - 'a' indicates a subsequent point *after* the range.  */
+
+bool
+layout_range::contains_point (int row, int column) const
+{
+  gcc_assert (m_start.m_line <= m_finish.m_line);
+  /* ...but the equivalent isn't true for the columns;
+     consider example B in the comment above.  */
+
+  if (row < m_start.m_line)
+    /* Points before the first line of the range are
+       outside it (corresponding to line 01 in example A
+       and lines 01 and 02 in example B above).  */
+    return false;
+
+  if (row == m_start.m_line)
+    /* On same line as start of range (corresponding
+       to line 02 in example A and line 03 in example B).  */
+    {
+      if (column < m_start.m_column)
+	/* Points on the starting line of the range, but
+	   before the column in which it begins.  */
+	return false;
+
+      if (row < m_finish.m_line)
+	/* This is a multiline range; the point
+	   is within it (corresponds to line 03 in example B
+	   from column 14 onwards).  */
+	return true;
+      else
+	{
+	  /* This is a single-line range.  */
+	  gcc_assert (row == m_finish.m_line);
+	  return column <= m_finish.m_column;
+	}
+    }
+
+  /* The point is in a line beyond that containing the
+     start of the range: lines 03 onwards in example A,
+     and lines 04 onwards in example B.  */
+  gcc_assert (row > m_start.m_line);
+
+  if (row > m_finish.m_line)
+    /* The point is beyond the final line of the range
+       (lines 03 onwards in example A, and lines 06 onwards
+       in example B).  */
+    return false;
+
+  if (row < m_finish.m_line)
+    {
+      /* The point is in a line that's fully within a multiline
+	 range (e.g. line 04 in example B).  */
+      gcc_assert (m_start.m_line < m_finish.m_line);
+      return true;
+    }
+
+  gcc_assert (row == m_finish.m_line);
+
+  return column <= m_finish.m_column;
+}
+
+/* Given a source line LINE of length LINE_WIDTH, determine the width
+   without any trailing whitespace.  */
+
+static int
+get_line_width_without_trailing_whitespace (const char *line, int line_width)
+{
+  int result = line_width;
+  while (result > 0)
+    {
+      char ch = line[result - 1];
+      if (ch == ' ' || ch == '\t')
+	result--;
+      else
+	break;
+    }
+  gcc_assert (result >= 0);
+  gcc_assert (result <= line_width);
+  gcc_assert (result == 0 ||
+	      (line[result - 1] != ' '
+	       && line[result - 1] != '\t'));
+  return result;
+}
+
+/* Implementation of class layout.  */
 
-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+/* Constructor for class layout.
+
+   Filter the ranges from the rich_location to those that we can
+   sanely print, populating m_layout_ranges.
+   Determine the range of lines that we will print.
+   Determine m_x_offset, to ensure that the primary caret
+   will fit within the max_width provided by the diagnostic_context.  */
+
+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)
+: m_context (context),
+  m_pp (context->printer),
+  m_diagnostic_kind (diagnostic->kind),
+  m_exploc (diagnostic->richloc->lazily_expand_location ()),
+  m_colorizer (context, diagnostic),
+  m_colorize_source_p (context->colorize_source_p),
+  m_layout_ranges (rich_location::MAX_RANGES),
+  m_first_line (m_exploc.line),
+  m_last_line  (m_exploc.line),
+  m_x_offset (0)
+{
+  rich_location *richloc = diagnostic->richloc;
+  for (unsigned int idx = 0; idx < richloc->get_num_locations (); idx++)
+    {
+      /* This diagnostic printer can only cope with "sufficiently sane" ranges.
+	 Ignore any ranges that are awkward to handle.  */
+      location_range *loc_range = richloc->get_range (idx);
+
+      /* If any part of the range isn't in the same file as the primary
+	 location of this diagnostic, ignore the range.  */
+      if (loc_range->m_start.file != m_exploc.file)
+	continue;
+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_caret.file != m_exploc.file)
+	  continue;
+
+      /* Passed all the tests; add the range to m_layout_ranges so that
+	 it will be printed.  */
+      layout_range ri (loc_range);
+      m_layout_ranges.safe_push (ri);
+
+      /* Update m_first_line/m_last_line if necessary.  */
+      if (loc_range->m_start.line < m_first_line)
+	m_first_line = loc_range->m_start.line;
+      if (loc_range->m_finish.line > m_last_line)
+	m_last_line = loc_range->m_finish.line;
+    }
+
+  /* Adjust m_x_offset.
+     Center the primary caret to fit in max_width; all columns
+     will be adjusted accordingly.  */
+  int max_width = m_context->caret_max_width;
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, m_exploc.line,
+					       &line_width);
+  if (line && m_exploc.column <= line_width)
+    {
+      int right_margin = CARET_LINE_MARGIN;
+      int column = m_exploc.column;
+      right_margin = MIN (line_width - column, right_margin);
+      right_margin = max_width - right_margin;
+      if (line_width >= max_width && column > right_margin)
+	m_x_offset = column - right_margin;
+      gcc_assert (m_x_offset >= 0);
+    }
 
-  diagnostic_print_caret_line (context, s0, s1,
-			       context->caret_chars[0],
-			       context->caret_chars[1]);
+  if (0)
+    show_ruler (context, line_width, m_x_offset);
 }
 
-/* Print (part) of the source line given by xloc1 with caret1 pointing
-   at the column.  If xloc2.column != 0 and it fits within the same
-   line as xloc1 according to diagnostic_same_line (), then caret2 is
-   printed at xloc2.colum.  Otherwise, the caller has to set up things
-   to print a second caret line for xloc2.  */
+/* Print text describing a line of source code.
+   This typically prints two lines:
+
+   (1) the source code itself, potentially colorized at any ranges, and
+   (2) an annotation line containing any carets/underlines
+   describing the ranges.  */
+
 void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2)
-{
-  if (!diagnostic_same_line (context, xloc1, xloc2))
-    /* This will mean ignore xloc2.  */
-    xloc2.column = 0;
-  else if (xloc1.column == xloc2.column)
-    xloc2.column++;
-
-  int cmax = MAX (xloc1.column, xloc2.column);
+layout::print_line (int row)
+{
   int line_width;
-  const char *line = location_get_source_line (xloc1.file, xloc1.line,
+  const char *line = location_get_source_line (m_exploc.file, row,
 					       &line_width);
-  if (line == NULL || cmax > line_width)
+  if (!line)
     return;
 
-  /* Center the interesting part of the source line to fit in
-     max_width, and adjust all columns accordingly.  */
-  int max_width = context->caret_max_width;
-  int offset = (int) cmax;
-  line = adjust_line (line, line_width, max_width, &offset);
-  offset -= cmax;
-  cmax += offset;
-  xloc1.column += offset;
-  if (xloc2.column)
-    xloc2.column += offset;
-
-  /* Print the source line.  */
-  pp_newline (context->printer);
-  const char *saved_prefix = pp_get_prefix (context->printer);
-  pp_set_prefix (context->printer, NULL);
-  pp_space (context->printer);
-  while (max_width > 0 && line_width > 0)
+  m_colorizer.set_normal_text ();
+
+  /* Step 1: print the source code line.  */
+
+  /* We will stop printing at any trailing whitespace.  Measure this
+     before applying m_x_offset: LINE_WIDTH is relative to the line start.  */
+  line_width
+    = get_line_width_without_trailing_whitespace (line,
+						  line_width);
+  line += m_x_offset;
+  pp_space (m_pp);
+  int first_non_ws = INT_MAX;
+  int last_non_ws = 0;
+  int column;
+  for (column = 1 + m_x_offset; column <= line_width; column++)
     {
+      /* Assuming colorization is enabled for the caret and underline
+	 characters, we may also colorize the associated characters
+	 within the source line.
+
+	 For frontends that generate range information, we color the
+	 associated characters in the source line the same as the
+	 carets and underlines in the annotation line, to make it easier
+	 for the reader to see the pertinent code.
+
+	 For frontends that only generate carets, we don't colorize the
+	 characters above them, since this would look strange (e.g.
+	 colorizing just the first character in a token).  */
+      if (m_colorize_source_p)
+	{
+	  bool in_range_p;
+	  point_state state;
+	  in_range_p = get_state_at_point (row, column,
+					   0, INT_MAX,
+					   &state);
+	  if (in_range_p)
+	    m_colorizer.set_range (state.range_idx);
+	  else
+	    m_colorizer.set_normal_text ();
+	}
       char c = *line == '\t' ? ' ' : *line;
       if (c == '\0')
 	c = ' ';
-      pp_character (context->printer, c);
-      max_width--;
-      line_width--;
+      if (c != ' ')
+	{
+	  last_non_ws = column;
+	  if (first_non_ws == INT_MAX)
+	    first_non_ws = column;
+	}
+      pp_character (m_pp, c);
       line++;
     }
-  pp_newline (context->printer);
+  pp_newline (m_pp);
+
+  /* Step 2: print a line consisting of the caret/underlines for the
+     given source line.  */
+  int x_bound = get_x_bound_for_row (row, m_exploc.column,
+				     last_non_ws);
+
+  pp_space (m_pp);
+  for (int column = 1 + m_x_offset; column < x_bound; column++)
+    {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       first_non_ws, last_non_ws,
+				       &state);
+      if (in_range_p)
+	{
+	  /* Within a range.  Draw either the caret or an underline.  */
+	  m_colorizer.set_range (state.range_idx);
+	  if (state.draw_caret_p)
+	    /* Draw the caret.  */
+	    pp_character (m_pp, m_context->caret_chars[state.range_idx]);
+	  else
+	    pp_character (m_pp, '~');
+	}
+      else
+	{
+	  /* Not in a range.  */
+	  m_colorizer.set_normal_text ();
+	  pp_character (m_pp, ' ');
+	}
+    }
+  pp_newline (m_pp);
+}
+
+/* Return true if (ROW/COLUMN) is within a range of the layout.
+   If it returns true, OUT_STATE is written to, with the
+   range index, and whether we should draw the caret at
+   (ROW/COLUMN) (as opposed to an underline).  */
+
+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    point_state *out_state)
+{
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (0)
+	fprintf (stderr,
+		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
+		 range->m_start.m_line,
+		 range->m_start.m_column,
+		 range->m_finish.m_line,
+		 range->m_finish.m_column,
+		 row,
+		 column,
+		 range->contains_point (row, column) ? "true" : "false");
+
+      if (range->contains_point (row, column))
+	{
+	  out_state->range_idx = i;
+
+	  /* Are we at the range's caret?  Is it visible?  */
+	  out_state->draw_caret_p = false;
+	  if (row == range->m_caret.m_line
+	      && column == range->m_caret.m_column)
+	    out_state->draw_caret_p = range->m_show_caret_p;
+
+	  /* Within a multiline range, don't display any underline
+	     in any leading or trailing whitespace on a line.
+	     We do display carets, however.  */
+	  if (!out_state->draw_caret_p)
+	    if (column < first_non_ws || column > last_non_ws)
+	      return false;
+
+	  /* We are within a range.  */
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Helper function for use by layout::print_line when printing the
+   annotation line under the source line.
+   Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.
+   ROW is the source line within the given file.
+   CARET_COLUMN is the column of range 0's caret.
+   LAST_NON_WS_COLUMN is the last column containing a non-whitespace
+   character of source (as determined when printing the source line).  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws_column)
+{
+  int result = caret_column + 1;
 
-  /* Print the caret under the line.  */
-  const char *caret_cs, *caret_ce;
-  caret_cs = colorize_start (pp_show_color (context->printer), "caret");
-  caret_ce = colorize_stop (pp_show_color (context->printer));
-  int cmin = xloc2.column
-    ? MIN (xloc1.column, xloc2.column) : xloc1.column;
-  int caret_min = cmin == xloc1.column ? caret1 : caret2;
-  int caret_max = cmin == xloc1.column ? caret2 : caret1;
-
-  /* cmin is >= 1, but we indent with an extra space at the start like
-     we did above.  */
+  layout_range *range;
   int i;
-  for (i = 0; i < cmin; i++)
-    pp_space (context->printer);
-  pp_printf (context->printer, "%s%c%s", caret_cs, caret_min, caret_ce);
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (row >= range->m_start.m_line)
+	{
+	  if (range->m_finish.m_line == row)
+	    {
+	      /* On the final line within a range; ensure that
+		 we render up to the end of the range.  */
+	      if (result <= range->m_finish.m_column)
+		result = range->m_finish.m_column + 1;
+	    }
+	  else if (row < range->m_finish.m_line)
+	    {
+	      /* Within a multiline range; ensure that we render up to the
+		 last non-whitespace column.  */
+	      if (result <= last_non_ws_column)
+		result = last_non_ws_column + 1;
+	    }
+	}
+    }
+
+  return result;
+}
+
+} /* End of anonymous namespace.  */
+
+/* For debugging layout issues in diagnostic_show_locus and friends,
+   render a ruler giving column numbers (after the 1-column indent).  */
 
-  if (xloc2.column)
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset)
+{
+  /* Hundreds.  */
+  if (max_width > 99)
     {
-      for (i++; i < cmax; i++)
-	pp_space (context->printer);
-      pp_printf (context->printer, "%s%c%s", caret_cs, caret_max, caret_ce);
+      pp_space (context->printer);
+      for (int column = 1 + x_offset; column < max_width; column++)
+	if (0 == column % 10)
+	  pp_character (context->printer, '0' + (column / 100) % 10);
+	else
+	  pp_space (context->printer);
+      pp_newline (context->printer);
     }
+
+  /* Tens.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    if (0 == column % 10)
+      pp_character (context->printer, '0' + (column / 10) % 10);
+    else
+      pp_space (context->printer);
+  pp_newline (context->printer);
+
+  /* Units.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    pp_character (context->printer, '0' + (column % 10));
+  pp_newline (context->printer);
+}
+
+/* Print the physical source code corresponding to the location of
+   this diagnostic, with additional annotations.  */
+
+void
+diagnostic_show_locus (diagnostic_context * context,
+		       const diagnostic_info *diagnostic)
+{
+  if (!context->show_caret
+      || diagnostic_location (diagnostic, 0) <= BUILTINS_LOCATION
+      || diagnostic_location (diagnostic, 0) == context->last_location)
+    return;
+
+  context->last_location = diagnostic_location (diagnostic, 0);
+
+  pp_newline (context->printer);
+
+  const char *saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer, NULL);
+
+  {
+    layout layout (context, diagnostic);
+    int last_line = layout.get_last_line ();
+    for (int row = layout.get_first_line ();
+	 row <= last_line;
+	 row++)
+      layout.print_line (row);
+
+    /* The closing scope here leads to the dtor for layout and thus
+       colorizer being called here, which affects the precise
+       place where colorization is turned off in the unittest
+       for colorized output.  */
+  }
+
   pp_set_prefix (context->printer, saved_prefix);
-  pp_needs_newline (context->printer) = true;
 }
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 831859a..5fe6627 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -144,7 +144,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
     context->classify_diagnostic[i] = DK_UNSPECIFIED;
   context->show_caret = false;
   diagnostic_set_caret_max_width (context, pp_line_cutoff (context->printer));
-  for (i = 0; i < MAX_LOCATIONS_PER_MESSAGE; i++)
+  for (i = 0; i < rich_location::MAX_RANGES; i++)
     context->caret_chars[i] = '^';
   context->show_option_requested = false;
   context->abort_on_error = false;
@@ -234,16 +234,15 @@ diagnostic_finish (diagnostic_context *context)
    translated.  */
 void
 diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
-				va_list *args, location_t location,
+				va_list *args, rich_location *richloc,
 				diagnostic_t kind)
 {
+  gcc_assert (richloc);
   diagnostic->message.err_no = errno;
   diagnostic->message.args_ptr = args;
   diagnostic->message.format_spec = msg;
-  diagnostic->message.set_location (0, location);
-  for (int i = 1; i < MAX_LOCATIONS_PER_MESSAGE; i++)
-    diagnostic->message.set_location (i, UNKNOWN_LOCATION);
-  diagnostic->override_column = 0;
+  diagnostic->message.m_richloc = richloc;
+  diagnostic->richloc = richloc;
   diagnostic->kind = kind;
   diagnostic->option_index = 0;
 }
@@ -252,10 +251,27 @@ diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
    translated.  */
 void
 diagnostic_set_info (diagnostic_info *diagnostic, const char *gmsgid,
-		     va_list *args, location_t location,
+		     va_list *args, rich_location *richloc,
 		     diagnostic_t kind)
 {
-  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, location, kind);
+  gcc_assert (richloc);
+  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, richloc, kind);
+}
+
+static const char *const diagnostic_kind_color[] = {
+#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
+#include "diagnostic.def"
+#undef DEFINE_DIAGNOSTIC_KIND
+  NULL
+};
+
+/* Get a color name for diagnostics of type KIND.
+   Result could be NULL.  */
+
+const char *
+diagnostic_get_color_for_kind (diagnostic_t kind)
+{
+  return diagnostic_kind_color[kind];
 }
 
 /* Return a malloc'd string describing a location.  The caller is
@@ -270,12 +286,6 @@ diagnostic_build_prefix (diagnostic_context *context,
 #undef DEFINE_DIAGNOSTIC_KIND
     "must-not-happen"
   };
-  static const char *const diagnostic_kind_color[] = {
-#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
-#include "diagnostic.def"
-#undef DEFINE_DIAGNOSTIC_KIND
-    NULL
-  };
   gcc_assert (diagnostic->kind < DK_LAST_DIAGNOSTIC_KIND);
 
   const char *text = _(diagnostic_kind_text[diagnostic->kind]);
@@ -771,10 +781,14 @@ diagnostic_report_diagnostic (diagnostic_context *context,
 
       if (option_text)
 	{
+	  const char *cs
+	    = colorize_start (pp_show_color (context->printer),
+			      diagnostic_kind_color[diagnostic->kind]);
+	  const char *ce = colorize_stop (pp_show_color (context->printer));
 	  diagnostic->message.format_spec
 	    = ACONCAT ((diagnostic->message.format_spec,
 			" ", 
-			"[", option_text, "]",
+			"[", cs, option_text, ce, "]",
 			NULL));
 	  free (option_text);
 	}
@@ -854,9 +868,40 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  if (context->inhibit_notes_p)
+    {
+      va_end (ap);
+      return;
+    }
+  saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer,
+                 diagnostic_build_prefix (context, &diagnostic));
+  pp_newline (context->printer);
+  pp_format (context->printer, &diagnostic.message);
+  pp_output_formatted_text (context->printer);
+  pp_destroy_prefix (context->printer);
+  pp_set_prefix (context->printer, saved_prefix);
+  diagnostic_show_locus (context, &diagnostic);
+  va_end (ap);
+}
+
+/* Same as diagnostic_append_note, but at RICHLOC.  */
+
+void
+diagnostic_append_note_at_rich_loc (diagnostic_context *context,
+				    rich_location *richloc,
+				    const char * gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  const char *saved_prefix;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   if (context->inhibit_notes_p)
     {
       va_end (ap);
@@ -881,16 +926,17 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   permissive_error_kind (global_dc));
       diagnostic.option_index = permissive_error_option (global_dc);
     }
   else {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, kind);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, kind);
       if (kind == DK_WARNING || kind == DK_PEDWARN)
 	diagnostic.option_index = opt;
   }
@@ -907,9 +953,23 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as "inform", but at RICHLOC.  */
+void
+inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -922,11 +982,12 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_NOTE);
+                                  &ap, &richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -940,9 +1001,10 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
   diagnostic.option_index = opt;
 
   ret = report_diagnostic (&diagnostic);
@@ -960,9 +1022,27 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "warning_at", but at RICHLOC.  */
+
+bool
+warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -980,11 +1060,13 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_WARNING);
+                                  &ap, &richloc,
+                                  DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1010,9 +1092,10 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,  DK_PEDWARN);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1032,9 +1115,28 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
+                       permissive_error_kind (global_dc));
+  diagnostic.option_index = permissive_error_option (global_dc);
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "permerror", but at RICHLOC.  */
+
+bool
+permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc,
                        permissive_error_kind (global_dc));
   diagnostic.option_index = permissive_error_option (global_dc);
   ret = report_diagnostic (&diagnostic);
@@ -1049,9 +1151,10 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1064,11 +1167,12 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_ERROR);
+                                  &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1079,9 +1183,25 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as above, but use RICH_LOC.  */
+
+void
+error_at_rich_loc (rich_location *rich_loc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, rich_loc,
+		       DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1094,9 +1214,10 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_SORRY);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1117,9 +1238,10 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1135,9 +1257,10 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1152,9 +1275,10 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE_NOBT);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1218,3 +1342,11 @@ real_abort (void)
 {
   abort ();
 }
+
+void
+source_range::debug (const char *msg) const
+{
+  rich_location richloc (m_start);
+  richloc.add_range (m_start, m_finish);
+  inform_at_rich_loc (&richloc, "%s", msg);
+}
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 7fcb6a8..153e84c 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
    list in diagnostic.def.  */
 struct diagnostic_info
 {
-  /* Text to be formatted. It also contains the location(s) for this
-     diagnostic.  */
+  /* Text to be formatted.  */
   text_info message;
-  unsigned int override_column;
+
+  /* The location at which the diagnostic is to be reported.  */
+  rich_location *richloc;
+
   /* Auxiliary data for client.  */
   void *x_data;
   /* The kind of diagnostic it is about.  */
@@ -102,8 +104,8 @@ struct diagnostic_context
   /* Maximum width of the source line printed.  */
   int caret_max_width;
 
-  /* Characters used for caret diagnostics.  */
-  char caret_chars[MAX_LOCATIONS_PER_MESSAGE];
+  /* Characters used for caret diagnostics.  */
+  char caret_chars[rich_location::MAX_RANGES];
 
   /* True if we should print the command line option which controls
      each diagnostic, if known.  */
@@ -181,6 +183,15 @@ struct diagnostic_context
   int lock;
 
   bool inhibit_notes_p;
+
+  /* When printing source code, should the characters at carets and ranges
+     be colorized? (assuming colorization is on at all).
+     This should be true for frontends that generate range information
+     (so that the ranges of code are colorized),
+     and false for frontends that merely specify points within the
+     source code (to avoid e.g. colorizing just the first character in
+     a token, which would look strange).  */
+  bool colorize_source_p;
 };
 
 static inline void
@@ -252,10 +263,6 @@ extern diagnostic_context *global_dc;
 
 #define report_diagnostic(D) diagnostic_report_diagnostic (global_dc, D)
 
-/* Override the column number to be used for reporting a
-   diagnostic.  */
-#define diagnostic_override_column(DI, COL) (DI)->override_column = (COL)
-
 /* Override the option index to be used for reporting a
    diagnostic.  */
 #define diagnostic_override_option_index(DI, OPTIDX) \
@@ -279,13 +286,17 @@ extern bool diagnostic_report_diagnostic (diagnostic_context *,
 					  diagnostic_info *);
 #ifdef ATTRIBUTE_GCC_DIAG
 extern void diagnostic_set_info (diagnostic_info *, const char *, va_list *,
-				 location_t, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
+				 rich_location *, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_set_info_translated (diagnostic_info *, const char *,
-					    va_list *, location_t,
+					    va_list *, rich_location *,
 					    diagnostic_t)
      ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_append_note (diagnostic_context *, location_t,
                                     const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern void diagnostic_append_note_at_rich_loc (diagnostic_context *,
+						rich_location *,
+						const char *, ...)
+  ATTRIBUTE_GCC_DIAG(3,4);
 #endif
 extern char *diagnostic_build_prefix (diagnostic_context *, const diagnostic_info *);
 void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
@@ -306,6 +317,14 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
   return diagnostic->message.get_location (which);
 }
 
+/* Return the number of locations to be printed in DIAGNOSTIC.  */
+
+static inline unsigned int
+diagnostic_num_locations (const diagnostic_info * diagnostic)
+{
+  return diagnostic->message.m_richloc->get_num_locations ();
+}
+
 /* Expand the location of this diagnostic. Use this function for
    consistency.  Parameter WHICH specifies which location. By default,
    expand the first one.  */
@@ -313,12 +332,7 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
 static inline expanded_location
 diagnostic_expand_location (const diagnostic_info * diagnostic, int which = 0)
 {
-  expanded_location s
-    = expand_location_to_spelling_point (diagnostic_location (diagnostic,
-							      which));
-  if (which == 0 && diagnostic->override_column)
-    s.column = diagnostic->override_column;
-  return s;
+  return diagnostic->richloc->get_range (which)->m_caret;
 }
 
 /* This is somehow the right-side margin of a caret line, that is, we
@@ -338,11 +352,8 @@ diagnostic_same_line (const diagnostic_context *context,
     && context->caret_max_width - CARET_LINE_MARGIN > abs (s1.column - s2.column);
 }
 
-void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2);
+extern const char *
+diagnostic_get_color_for_kind (diagnostic_t kind);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index daffc20..92dc584 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -149,9 +149,9 @@ static void cb_include (cpp_reader *, source_location, const unsigned char *,
 static void cb_ident (cpp_reader *, source_location, const cpp_string *);
 static void cb_used_define (cpp_reader *, source_location, cpp_hashnode *);
 static void cb_used_undef (cpp_reader *, source_location, cpp_hashnode *);
-static bool cb_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+static bool cb_cpp_error (cpp_reader *, int, int, rich_location *,
 			  const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 void pp_dir_change (cpp_reader *, const char *);
 
 static int dump_macro (cpp_reader *, cpp_hashnode *, void *);
@@ -1026,13 +1026,12 @@ cb_used_define (cpp_reader *pfile, source_location line ATTRIBUTE_UNUSED,
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION, with column number possibly overridden by COLUMN_OVERRIDE
-   if not zero; MSG is the translated message and AP the arguments.
+   RICHLOC; MSG is the translated message and AP the arguments.
    Returns true if a diagnostic was emitted, false otherwise.  */
 
 static bool
 cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	      location_t location, unsigned int column_override,
+	      rich_location *richloc,
 	      const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -1067,9 +1066,7 @@ cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   if (reason == CPP_W_WARNING_DIRECTIVE)
     diagnostic_override_option_index (&diagnostic, OPT_Wcpp);
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 3825751..4b3d31c 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,6 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -787,7 +788,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
       --werrorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   bool ret = report_diagnostic (&diagnostic);
@@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
 	/* If location[0] != UNKNOWN_LOCATION means that we already
 	   processed one of %C/%L.  */
 	int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
-	text->set_location (loc_num,
-			    linemap_position_for_loc_and_offset (line_table,
-								 loc->lb->location,
-								 offset));
+	source_range range
+	  = source_range::from_location (
+	      linemap_position_for_loc_and_offset (line_table,
+						   loc->lb->location,
+						   offset));
+	text->set_range (loc_num, range, true);
 	pp_string (pp, result[loc_num]);
 	return true;
       }
@@ -1024,48 +1027,21 @@ gfc_diagnostic_build_locus_prefix (diagnostic_context *context,
 }
 
 /* This function prints the locus (file:line:column), the diagnostic kind
-   (Error, Warning) and (optionally) the caret line (a source line
-   with '1' and/or '2' below it).
+   (Error, Warning) and (optionally) the relevant lines of code with
+   annotation lines with '1' and/or '2' below them.
 
-   With -fdiagnostic-show-caret (the default) and for valid locations,
-   it prints for one location:
+   With -fdiagnostics-show-caret (the default) it prints:
 
-       [locus]:
+       [locus of primary range]:
        
           some code
                  1
        Error: Some error at (1)
         
-   for two locations that fit in the same locus line:
+  With -fno-diagnostics-show-caret or if the primary range is not
+  valid, it prints:
 
-       [locus]:
-       
-         some code and some more code
-                1       2
-       Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [locus]:
-       
-         some code
-                1
-       [locus2]:
-       
-         some other code
-           2
-       Error: Some error at (1) and (2)
-       
-  With -fno-diagnostic-show-caret or if one of the locations is not
-  valid, it prints for one location (or for two locations that fit in
-  the same locus line):
-
-       [locus]: Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [name]:[locus]: Error: (1)
-       [name]:[locus2]: Error: Some error at (1) and (2)
+       [locus of primary range]: Error: Some error at (1) and (2)
 */
 static void 
 gfc_diagnostic_starter (diagnostic_context *context,
@@ -1075,7 +1051,7 @@ gfc_diagnostic_starter (diagnostic_context *context,
 
   expanded_location s1 = diagnostic_expand_location (diagnostic);
   expanded_location s2;
-  bool one_locus = diagnostic_location (diagnostic, 1) == UNKNOWN_LOCATION;
+  bool one_locus = diagnostic->richloc->get_num_locations () < 2;
   bool same_locus = false;
 
   if (!one_locus) 
@@ -1125,35 +1101,6 @@ gfc_diagnostic_starter (diagnostic_context *context,
       /* If the caret line was shown, the prefix does not contain the
 	 locus.  */
       pp_set_prefix (context->printer, kind_prefix);
-
-      if (one_locus || same_locus)
-	  return;
-
-      locus_prefix = gfc_diagnostic_build_locus_prefix (context, s2);
-      if (diagnostic_location (diagnostic, 1) <= BUILTINS_LOCATION)
-	{
-	  /* No caret line for the second location. Override the previous
-	     prefix with [locus2]:[prefix].  */
-	  pp_set_prefix (context->printer,
-			 concat (locus_prefix, " ", kind_prefix, NULL));
-	  free (kind_prefix);
-	  free (locus_prefix);
-	}
-      else
-	{
-	  /* We print the caret for the second location.  */
-	  pp_verbatim (context->printer, locus_prefix);
-	  free (locus_prefix);
-	  /* Fortran uses an empty line between locus and caret line.  */
-	  pp_newline (context->printer);
-	  s1.column = 0; /* Print only a caret line for s2.  */
-	  diagnostic_print_caret_line (context, s2, s1,
-				       context->caret_chars[1], '\0');
-	  pp_newline (context->printer);
-	  /* If the caret line was shown, the prefix does not contain the
-	     locus.  */
-	  pp_set_prefix (context->printer, kind_prefix);
-	}
     }
 }
 
@@ -1173,10 +1120,11 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (loc);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, loc, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (argp);
@@ -1190,10 +1138,11 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
@@ -1209,11 +1158,12 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (argp);
 }
@@ -1226,9 +1176,10 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1291,6 +1242,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
+  rich_location richloc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1306,7 +1258,7 @@ gfc_error (const char *gmsgid, va_list ap)
       --errorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
 
   if (buffered_p)
@@ -1336,9 +1288,10 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 102a635..6bfde06 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -53,14 +53,23 @@ unsigned verbose;
 
 static struct line_maps *line_table;
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  const struct line_map_ordinary *map;
+  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
+  return linemap_expand_location (line_table, map, loc);
+}
+
 static bool
 #if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
+__attribute__((format (printf, 5, 0)))
 #endif
-error_cb (cpp_reader *, int errtype, int, source_location location,
-	  unsigned int, const char *msg, va_list *ap)
+error_cb (cpp_reader *, int errtype, int, rich_location *richloc,
+	  const char *msg, va_list *ap)
 {
   const line_map_ordinary *map;
+  source_location location = richloc->get_loc ();
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
   fprintf (stderr, "%s:%d:%d %s: ", loc.file, loc.line, loc.column,
@@ -102,9 +111,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -114,9 +124,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -126,9 +137,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -138,9 +150,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
diff --git a/gcc/input.c b/gcc/input.c
index ff80dd9..baf8e7e 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
   return expand_location_1 (loc, /*expansion_point_p=*/false);
 }
 
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_to_spelling_point (loc);
+}
+
+
 /* If LOCATION is in a system header and if it is a virtual location for
    a token coming from the expansion of a macro, unwind it to the
    location of the expansion point of the macro.  Otherwise, just return
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 5889015..aee4172 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -31,6 +31,27 @@ along with GCC; see the file COPYING3.  If not see
 #include <iconv.h>
 #endif
 
+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */
+
+void
+text_info::set_range (unsigned int idx, source_range range, bool caret_p)
+{
+  gcc_checking_assert (m_richloc);
+  m_richloc->set_range (idx, range, caret_p, true);
+}
+
+location_t
+text_info::get_location (unsigned int index_of_location) const
+{
+  gcc_checking_assert (m_richloc);
+
+  if (index_of_location == 0)
+    return m_richloc->get_loc ();
+  else
+    return UNKNOWN_LOCATION;
+}
+
 // Default construct an output buffer.
 
 output_buffer::output_buffer ()
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 2654b0f..cdee253 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -27,11 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Maximum number of format string arguments.  */
 #define PP_NL_ARGMAX   30
 
-/* Maximum number of locations associated to each message.  If
-   location 'i' is UNKNOWN_LOCATION, then location 'i+1' is not
-   valid.  */
-#define MAX_LOCATIONS_PER_MESSAGE 2
-
 /* The type of a text to be formatted according a format specification
    along with a list of things.  */
 struct text_info
@@ -40,21 +35,17 @@ struct text_info
   va_list *args_ptr;
   int err_no;  /* for %m */
   void **x_data;
+  rich_location *m_richloc;
 
-  inline void set_location (unsigned int index_of_location, location_t loc)
+  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
   {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    this->locations[index_of_location] = loc;
+    source_range src_range;
+    src_range.m_start = loc;
+    src_range.m_finish = loc;
+    set_range (idx, src_range, caret_p);
   }
-
-  inline location_t get_location (unsigned int index_of_location) const
-  {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    return this->locations[index_of_location];
-  }
-
-private:
-  location_t locations[MAX_LOCATIONS_PER_MESSAGE];
+  void set_range (unsigned int idx, source_range range, bool caret_p);
+  location_t get_location (unsigned int index_of_location) const;
 };
 
 /* How often diagnostics are prefixed by their locations:
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 8b9b391..d28be1d 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,9 +69,10 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
+  rich_location richloc (location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
-		       location_for_asm (insn), kind);
+		       &richloc, kind);
   report_diagnostic (&diagnostic);
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
new file mode 100644
index 0000000..a4b16da
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
@@ -0,0 +1,149 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = myvar.x;
+           ~~~~~^~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = first_function () + second_function ();
+       ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function ()
+        ~~~~~~~~~~~~~~~~~
+        + second_function ());
+        ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             consectetur, adipiscing, elit,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             sed, eiusmod, tempor,
+                                             ~~~~~~~~~~~~~~~~~~~~~
+                                             incididunt, ut, labore, et,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             dolore, magna, aliqua)
+                                             ~~~~~~~~~~~~~~~~~~~~~~
+        + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 amet, consectetur,
+                                                 ~~~~~~~~~~~~~~~~~~
+                                                 adipiscing, elit, sed,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~
+                                                 eiusmod, tempor, incididunt,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 ut, labore, et, dolore,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~
+                                                 magna, aliqua));
+                                                 ~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = 98.6f;
+             ^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = foo * bar;
+             ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = foo * bar;
+                                                        ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = x + y
+        A   B
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y => x)
+                    2
+       y = 5
+      1
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
new file mode 100644
index 0000000..47639b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
@@ -0,0 +1,158 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret -fplugin-arg-diagnostic_plugin_test_show_locus-color" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = ^[[32m^[[Kmyvar^[[m^[[K^[[01;35m^[[K.^[[m^[[K^[[34m^[[Kx^[[m^[[K;
+           ^[[32m^[[K~~~~~^[[m^[[K^[[01;35m^[[K^^[[m^[[K^[[34m^[[K~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = ^[[32m^[[Kfirst_function ()^[[m^[[K ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K;
+       ^[[32m^[[K~~~~~~~~~~~~~~~~~^[[m^[[K ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function ()
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K);
+        ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            consectetur, adipiscing, elit,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            sed, eiusmod, tempor,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            incididunt, ut, labore, et,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            dolore, magna, aliqua)
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+ ^[[m^[[K       ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                amet, consectetur,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                adipiscing, elit, sed,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                eiusmod, tempor, incididunt,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                ut, labore, et, dolore,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                magna, aliqua)^[[m^[[K);
+                                                 ^[[34m^[[K~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[K98.6f^[[m^[[K;
+             ^[[01;35m^[[K^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+             ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+                                                        ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = ^[[01;35m^[[Kx^[[m^[[K + ^[[32m^[[Ky^[[m^[[K
+        ^[[01;35m^[[KA^[[m^[[K   ^[[32m^[[KB
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y =>^[[32m^[[K ^[[m^[[Kx)
+                    ^[[32m^[[K2
+^[[m^[[K      ^[[01;35m^[[K ^[[m^[[Ky = 5
+      ^[[01;35m^[[K1
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
new file mode 100644
index 0000000..3471a4e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -0,0 +1,326 @@
+/* { dg-options "-O" } */
+
+/* This plugin exercises the diagnostics-printing code.
+
+   The goal is to unit-test the range-printing code without needing any
+   correct range data within the compiler's IR.  We can't use any real
+   diagnostics for this, so we have to fake it, hence this plugin.
+
+   There are two test files used with this code:
+
+     diagnostic-test-show-locus-bw.c
+     diagnostic-test-show-locus-color.c
+
+   to exercise uncolored vs colored output by supplying plugin arguments
+   to hack in the desired behavior:
+
+     -fplugin-arg-diagnostic_plugin_test_show_locus-color
+
+   The test files contain functions, but the body of each
+   function is disabled using the preprocessor.  The plugin detects
+   the functions by name, and injects diagnostics within them, using
+   hard-coded locations relative to the top of each function.
+
+   The plugin uses a function "get_loc" below to map from line/column
+   numbers to source_location, and this relies on input_location being in
+   the same ordinary line_map as the locations in question.  The plugin
+   runs after parsing, so input_location will be at the end of the file.
+
+   This need for all of the test code to be in a single ordinary line map
+   means that each test file needs to have a very long line near the top
+   (potentially to cover the extra byte-count of colorized data),
+   to ensure that further very long lines don't start a new linemap.
+   This also means that we can't use macros in the test files.  */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+const pass_data pass_data_test_show_locus =
+{
+  GIMPLE_PASS, /* type */
+  "test_show_locus", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_test_show_locus : public gimple_opt_pass
+{
+public:
+  pass_test_show_locus(gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_test_show_locus, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_test_show_locus
+
+/* Given LINE_NUM and COL_NUM, generate a source_location in the
+   current file, relative to input_location.  This relies on the
+   location being expressible in the same ordinary line_map as
+   input_location (which is typically at the end of the source file
+   when this is called).  Hence the test files we compile with this
+   plugin must have an initial very long line (to avoid long lines
+   starting a new line map), and must not use macros.
+
+   COL_NUM uses the Emacs convention of 0-based column numbers.  */
+
+static source_location
+get_loc (unsigned int line_num, unsigned int col_num)
+{
+  /* Use input_location to get the relevant line_map */
+  const struct line_map_ordinary *line_map
+    = (const line_map_ordinary *)(linemap_lookup (line_table,
+						  input_location));
+
+  /* Convert from 0-based column numbers to 1-based column numbers.  */
+  source_location loc
+    = linemap_position_for_line_and_column (line_map,
+					    line_num, col_num + 1);
+
+  return loc;
+}
+
+/* Was "color" passed in as a plugin argument?  */
+static bool force_show_locus_color = false;
+
+/* We want to verify the colorized output of diagnostic_show_locus,
+   but turning on colorization for everything confuses "dg-warning" etc.
+   Hence we special-case it within this plugin by using this modified
+   version of default_diagnostic_finalizer, which, if "color" is
+   passed in as a plugin argument, turns on colorization, but just
+   for diagnostic_show_locus.  */
+
+static void
+custom_diagnostic_finalizer (diagnostic_context *context,
+			     diagnostic_info *diagnostic)
+{
+  bool old_show_color = pp_show_color (context->printer);
+  if (force_show_locus_color)
+    pp_show_color (context->printer) = true;
+  diagnostic_show_locus (context, diagnostic);
+  pp_show_color (context->printer) = old_show_color;
+
+  pp_destroy_prefix (context->printer);
+  pp_newline_and_flush (context->printer);
+}
+
+/* Exercise the diagnostic machinery to emit various warnings,
+   for use by diagnostic-test-show-locus-*.c.
+
+   We inject each warning relative to the start of a function,
+   which avoids lots of hardcoded absolute locations.  */
+
+static void
+test_show_locus (function *fun)
+{
+  tree fndecl = fun->decl;
+  tree identifier = DECL_NAME (fndecl);
+  const char *fnname = IDENTIFIER_POINTER (identifier);
+  location_t fnstart = fun->function_start_locus;
+  int fnstart_line = LOCATION_LINE (fnstart);
+
+  diagnostic_finalizer (global_dc) = custom_diagnostic_finalizer;
+
+  /* Hardcode the "terminal width", to verify the behavior of
+     very wide lines.  */
+  global_dc->caret_max_width = 70;
+
+  if (0 == strcmp (fnname, "test_simple"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 15));
+      richloc.add_range (get_loc (line, 10), get_loc (line, 14));
+      richloc.add_range (get_loc (line, 16), get_loc (line, 16));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_simple_2"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 24));
+      richloc.add_range (get_loc (line, 6),
+			 get_loc (line, 22));
+      richloc.add_range (get_loc (line, 26),
+			 get_loc (line, 43));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_multiline"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 1, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line, 23));
+      richloc.add_range (get_loc (line + 1, 9),
+			 get_loc (line + 1, 26));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_many_lines"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 5, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line + 4, 65));
+      richloc.add_range (get_loc (line + 5, 9),
+			 get_loc (line + 10, 61));
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a rich_location constructed directly from a
+     source_range where the range is larger than one character.  */
+  if (0 == strcmp (fnname, "test_richloc_from_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 16);
+      rich_location richloc (src_range);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a single-range location where the range starts
+     before the caret.  */
+  if (0 == strcmp (fnname, "test_caret_within_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 16);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 20);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a very wide line, where the information of interest
+     is beyond the width of the terminal (hardcoded above).  */
+  if (0 == strcmp (fnname, "test_very_wide_line"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 94);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 90);
+      src_range.m_finish = get_loc (line, 98);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of multiple carets.  */
+  if (0 == strcmp (fnname, "test_multiple_carets"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret_a = get_loc (line, 7);
+      location_t caret_b = get_loc (line, 11);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = 'A';
+      global_dc->caret_chars[1] = 'B';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+
+  /* Example of two carets where both carets appear to have an off-by-one
+     error appearing one column early.
+     Seen with gfortran.dg/associate_5.f03.
+     In an earlier version of the printer, the printing of caret 0 aka
+     "1" was suppressed due to it appearing within the leading whitespace
+     before the text in its line.  Ensure that we at least faithfully
+     print both carets, at the given (erroneous) locations.  */
+  if (0 == strcmp (fnname, "test_caret_on_leading_whitespace"))
+    {
+      const int line = fnstart_line + 3;
+      location_t caret_a = get_loc (line, 5);
+      location_t caret_b = get_loc (line - 1, 19);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = '1';
+      global_dc->caret_chars[1] = '2';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+}
+
+unsigned int
+pass_test_show_locus::execute (function *fun)
+{
+  test_show_locus (fun);
+  return 0;
+}
+
+static gimple_opt_pass *
+make_pass_test_show_locus (gcc::context *ctxt)
+{
+  return new pass_test_show_locus (ctxt);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  /* For now, tell the dc to expect ranges and thus to colorize the source
+     lines, not just the carets/underlines.  This will be redundant
+     once the C frontend generates ranges.  */
+  global_dc->colorize_source_p = true;
+
+  for (int i = 0; i < argc; i++)
+    {
+      if (0 == strcmp (argv[i].key, "color"))
+	force_show_locus_color = true;
+    }
+
+  pass_info.pass = make_pass_test_show_locus (g);
+  pass_info.reference_pass_name = "ssa";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+		     &pass_info);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 39fab6e..941bccc 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -63,6 +63,9 @@ set plugin_test_list [list \
     { start_unit_plugin.c start_unit-test-1.c } \
     { finish_unit_plugin.c finish_unit-test-1.c } \
     { wide-int_plugin.c wide-int-test-1.c } \
+    { diagnostic_plugin_test_show_locus.c \
+	  diagnostic-test-show-locus-bw.c \
+	  diagnostic-test-show-locus-color.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7c1ab85..8cc1d87 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -29,6 +29,7 @@ load_lib libgloss.exp
 load_lib target-libpath.exp
 load_lib torture-options.exp
 load_lib fortran-modules.exp
+load_lib multiline.exp
 
 # We set LC_ALL and LANG to C so that we get the same error messages as expected.
 setenv LC_ALL C
diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 135f142..02009d8 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -289,7 +289,7 @@ default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     }
 
   if (set_locus)
-    text->set_location (0, DECL_SOURCE_LOCATION (t));
+    text->set_location (0, DECL_SOURCE_LOCATION (t), true);
 
   if (DECL_P (t))
     {
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index ce3f6a8..29bc48a 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3592,7 +3592,7 @@ void
 percent_K_format (text_info *text)
 {
   tree t = va_arg (*text->args_ptr, tree), block;
-  text->set_location (0, EXPR_LOCATION (t));
+  text->set_location (0, EXPR_LOCATION (t), true);
   gcc_assert (pp_ti_abstract_origin (text) != NULL);
   block = TREE_BLOCK (t);
   *pp_ti_abstract_origin (text) = NULL;
diff --git a/libcpp/errors.c b/libcpp/errors.c
index a33196e..c351c11 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,8 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, 0, _(msgid), ap);
+  rich_location richloc (src_loc);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
@@ -139,7 +140,9 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, column, _(msgid), ap);
+  rich_location richloc (src_loc);
+  richloc.override_column (column);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5eaea6b..a2bdfa0 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -573,9 +573,9 @@ struct cpp_callbacks
 
   /* Called to emit a diagnostic.  This callback receives the
      translated message.  */
-  bool (*error) (cpp_reader *, int, int, source_location, unsigned int,
+  bool (*error) (cpp_reader *, int, int, rich_location *,
 		 const char *, va_list *)
-       ATTRIBUTE_FPTR_PRINTF(6,0);
+       ATTRIBUTE_FPTR_PRINTF(5,0);
 
   /* Callbacks for when a macro is expanded, or tested (whether
      defined or not at the time) in #ifdef, #ifndef or "defined".  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 09378f9..84a5ab7 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -131,6 +131,35 @@ typedef unsigned int linenum_type;
   libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+
+  /* Make a source_range from a source_location.  */
+  static source_range from_location (source_location loc)
+  {
+    source_range result;
+    result.m_start = loc;
+    result.m_finish = loc;
+    return result;
+  }
+};
+
 /* Memory allocation function typedef.  Works like xrealloc.  */
 typedef void *(*line_map_realloc) (void *, size_t);
 
@@ -1028,6 +1057,175 @@ typedef struct
   bool sysp;
 } expanded_location;
 
+/* Both gcc and emacs number source *lines* starting at 1, but
+   they have differing conventions for *columns*.
+
+   GCC uses a 1-based convention for source columns,
+   whereas Emacs's M-x column-number-mode uses a 0-based convention.
+
+   For example, an error in the initial, left-hand
+   column of source line 3 is reported by GCC as:
+
+      some-file.c:3:1: error: ...etc...
+
+   On navigating to the location of that error in Emacs
+   (e.g. via "next-error"),
+   the locus is reported in the Mode Line
+   (assuming M-x column-number-mode) as:
+
+     some-file.c   10%   (3, 0)
+
+   i.e. "3:1:" in GCC corresponds to "(3, 0)" in Emacs.  */
+
+/* Ranges are closed:
+   m_start is the first location within the range, and
+   m_finish is the last location within the range.  */
+struct location_range
+{
+  expanded_location m_start;
+  expanded_location m_finish;
+
+  /* Should a caret be drawn for this range?  Typically this is
+     true for the 0th range, and false for subsequent ranges,
+     but the Fortran frontend overrides this for rendering things like:
+
+       x = x + y
+           1   2
+       Error: Shapes for operands at (1) and (2) are not conformable
+
+     where "1" and "2" are notionally carets.  */
+  bool m_show_caret_p;
+  expanded_location m_caret;
+};
+
+/* A "rich" source code location, for use when printing diagnostics.
+   A rich_location has one or more ranges, each optionally with
+   a caret.  Typically the zeroth range has a caret; other ranges
+   sometimes have carets.
+
+   The "primary" location of a rich_location is the caret of range 0,
+   used for determining the line/column when printing diagnostic
+   text, such as:
+
+      some-file.c:3:1: error: ...etc...
+
+   Additional ranges may be added to help the user identify other
+   pertinent clauses in a diagnostic.
+
+   rich_location instances are intended to be allocated on the stack
+   when generating diagnostics, and to be short-lived.
+
+   Examples of rich locations
+   --------------------------
+
+   Example A
+   *********
+      int i = "foo";
+              ^
+   This "rich" location is simply a single range (range 0), with
+   caret = start = finish at the given point.
+
+   Example B
+   *********
+      a = (foo && bar)
+          ~~~~~^~~~~~~
+   This rich location has a single range (range 0), with the caret
+   at the first "&", and the start/finish at the parentheses.
+   Compare with example C below.
+
+   Example C
+   *********
+      a = (foo && bar)
+           ~~~ ^~ ~~~
+   This rich location has three ranges:
+   - Range 0 has its caret and start location at the first "&" and
+     finish at the second "&".
+   - Range 1 has its start and finish at the "f" and "o" of "foo";
+     the caret is not flagged for display, but is perhaps at the "f"
+     of "foo".
+   - Similarly, range 2 has its start and finish at the "b" and "r" of
+     "bar"; the caret is not flagged for display, but is perhaps at the
+     "b" of "bar".
+   Compare with example B above.
+
+   Example D (Fortran frontend)
+   ****************************
+       x = x + y
+           1   2
+   This rich location has range 0 at "1", and range 1 at "2".
+   Both are flagged for caret display.  Both ranges have start/finish
+   equal to their caret point.  The frontend overrides the diagnostic
+   context's default caret character for these ranges.
+
+   Example E
+   *********
+      printf ("arg0: %i  arg1: %s arg2: %i",
+                               ^~
+              100, 101, 102);
+                   ~~~
+   This rich location has two ranges:
+   - range 0 is at the "%s" with start = caret = "%" and finish at
+     the "s".
+   - range 1 has start/finish covering the "101" and is not flagged for
+     caret printing; it is perhaps at the start of "101".  */
+
+class rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  rich_location (source_location loc);
+
+  /* Constructing from a source_range.  */
+  rich_location (source_range src_range);
+
+  /* Accessors.  */
+  source_location get_loc () const { return m_loc; }
+
+  source_location *get_loc_addr () { return &m_loc; }
+
+  void
+  add_range (source_location start, source_location finish,
+	     bool show_caret_p = false);
+
+  void
+  add_range (source_range src_range,
+	     bool show_caret_p = false);
+
+  void
+  add_range (location_range *src_range);
+
+  void
+  set_range (unsigned int idx, source_range src_range,
+	     bool show_caret_p, bool overwrite_loc_p);
+
+  unsigned int get_num_locations () const { return m_num_ranges; }
+
+  location_range *get_range (unsigned int idx)
+  {
+    linemap_assert (idx < m_num_ranges);
+    return &m_ranges[idx];
+  }
+
+  expanded_location lazily_expand_location ();
+
+  void
+  override_column (int column);
+
+public:
+  static const int MAX_RANGES = 3;
+
+protected:
+  source_location m_loc;
+
+  unsigned int m_num_ranges;
+  location_range m_ranges[MAX_RANGES];
+
+  bool m_have_expanded_location;
+  expanded_location m_expanded_location;
+};
+
 /* This is enum is used by the function linemap_resolve_location
    below.  The meaning of the values is explained in the comment of
    that function.  */
@@ -1173,4 +1371,13 @@ void linemap_dump (FILE *, struct line_maps *, unsigned, bool);
    specifies how many macro maps to dump.  */
 void line_table_dump (FILE *, struct line_maps *, unsigned int, unsigned int);
 
+/* The rich_location class requires a way to expand source_location instances.
+   We would directly use expand_location_to_spelling_point, which is
+   implemented in gcc/input.c, but we also need to use it for rich_location
+   within genmatch.c.
+   Hence we require client code of libcpp to implement the following
+   symbol.  */
+extern expanded_location
+linemap_client_expand_location_to_spelling_point (source_location);
+
 #endif /* !LIBCPP_LINE_MAP_H  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 84403de..3c19f93 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1755,3 +1755,133 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
       fprintf (stream, "\n");
     }
 }
+
+/* class rich_location.  */
+
+/* Construct a rich_location with location LOC as its initial range.  */
+
+rich_location::rich_location (source_location loc) :
+  m_loc (loc),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (loc, loc, true);
+  m_ranges[0].m_caret = lazily_expand_location ();
+}
+
+/* Construct a rich_location with source_range SRC_RANGE as its
+   initial range.  */
+
+rich_location::rich_location (source_range src_range)
+: m_loc (src_range.m_start),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (src_range, true);
+}
+
+/* Get an expanded_location for this rich_location's primary
+   location.  */
+
+expanded_location
+rich_location::lazily_expand_location ()
+{
+  if (!m_have_expanded_location)
+    {
+      m_expanded_location
+	= linemap_client_expand_location_to_spelling_point (m_loc);
+      m_have_expanded_location = true;
+    }
+
+  return m_expanded_location;
+}
+
+/* Set the column of the primary location.  */
+
+void
+rich_location::override_column (int column)
+{
+  lazily_expand_location ();
+  m_expanded_location.column = column;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_location start, source_location finish,
+			  bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  location_range *range = &m_ranges[m_num_ranges++];
+  range->m_start = linemap_client_expand_location_to_spelling_point (start);
+  range->m_finish = linemap_client_expand_location_to_spelling_point (finish);
+  range->m_caret = range->m_start;
+  range->m_show_caret_p = show_caret_p;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_range src_range, bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  add_range (src_range.m_start, src_range.m_finish, show_caret_p);
+}
+
+void
+rich_location::add_range (location_range *src_range)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  m_ranges[m_num_ranges++] = *src_range;
+}
+
+/* Add or overwrite the range given by IDX.  It must either
+   overwrite an existing range, or add one *exactly* on the end of
+   the array.
+
+   This is primarily for use by gcc when implementing diagnostic
+   format decoders e.g. the "+" in the C/C++ frontends, for handling
+   format codes like "%q+D" (which writes the source location of a
+   tree back into range 0 of the rich_location).
+
+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+			  bool show_caret_p, bool overwrite_loc_p)
+{
+  linemap_assert (idx < MAX_RANGES);
+
+  /* We can either overwrite an existing range, or add one exactly
+     on the end of the array.  */
+  linemap_assert (idx <= m_num_ranges);
+
+  location_range *locrange = &m_ranges[idx];
+  locrange->m_start
+    = linemap_client_expand_location_to_spelling_point (src_range.m_start);
+  locrange->m_finish
+    = linemap_client_expand_location_to_spelling_point (src_range.m_finish);
+
+  locrange->m_show_caret_p = show_caret_p;
+  if (overwrite_loc_p)
+    locrange->m_caret = locrange->m_start;
+
+  /* Are we adding a range onto the end?  */
+  if (idx == m_num_ranges)
+    m_num_ranges = idx + 1;
+
+  if (idx == 0 && overwrite_loc_p)
+    {
+      m_loc = src_range.m_start;
+      /* Mark any cached value here as dirty.  */
+      m_have_expanded_location = false;
+    }
+}
-- 
1.8.5.3


* [PATCH 08/10] Wire things up so that libcpp users get token underlines
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (3 preceding siblings ...)
  2015-10-23 20:25   ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) David Malcolm
@ 2015-10-23 20:26   ` David Malcolm
  2015-10-30  6:15     ` Jeff Law
  2015-10-23 20:26   ` [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized) David Malcolm
                     ` (5 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

A previous patch introduced the ability to print one or more ranges
for a diagnostic via a rich_location class.

Another patch generalized source_location (aka location_t) to be both
a caret and a range, and generated range information for all tokens
coming out of libcpp's lexer.

The attached patch combines these efforts by updating the
rich_location constructor for a single source_location so that it
makes use of the range within the source_location.  Doing so requires
passing the line_table to the ctor, so that it can extract the range
from there.

The effect of this is that all of the various "warning", "warning_at",
"error", "error_at" diagnostics now emit underlines showing the range
of the token associated with the location_t (or input_location), for
those frontends using libcpp.  Similar things should happen for
expressions in the C frontend for diagnostics using EXPR_LOCATION.

A test case is added showing various token-based warnings that now
have underlines (without having to go through and add range information
to them).  For example:

diagnostic-token-ranges.c: In function ‘wide_string_literal_in_asm’:
diagnostic-token-ranges.c:68:8: error: wide string literal in ‘asm’
   asm (L"nop");
        ^~~~~~
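
To make the constructor change easier to picture, here is a minimal
standalone sketch of the idea.  It is not the code in this patch: the
names toy_location, toy_range, toy_line_table and toy_rich_location are
hypothetical stand-ins, and the real line_maps and source_location
encoding are more involved.  It assumes only what is described above:
the table can map a location to a closed range of 1-based columns, and
the single-location ctor looks that range up so that callers get the
underline without adding ranges by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

/* Hypothetical stand-in for source_location; not GCC's encoding.  */
typedef unsigned int toy_location;

/* A closed range of 1-based columns on one line, plus a caret column.
   START is the first column in the range, FINISH the last.  */
struct toy_range
{
  unsigned int caret;
  unsigned int start;
  unsigned int finish;
  bool show_caret_p;
};

/* Hypothetical stand-in for the line_table: it remembers the range of
   the token behind each location.  */
struct toy_line_table
{
  std::map<toy_location, toy_range> token_ranges;
};

class toy_rich_location
{
public:
  static const unsigned int MAX_RANGES = 3;

  /* Range 0 is extracted from the table, which is why the ctor needs
     the table as well as the location.  */
  toy_rich_location (const toy_line_table *set, toy_location loc)
  {
    std::map<toy_location, toy_range>::const_iterator it
      = set->token_ranges.find (loc);
    assert (it != set->token_ranges.end ());
    add_range (it->second);
  }

  void add_range (const toy_range &r)
  {
    assert (m_ranges.size () < MAX_RANGES);
    assert (r.start >= 1 && r.start <= r.finish);
    m_ranges.push_back (r);
  }

  /* Build the caret/underline line: '^' at a displayed caret, '~' for
     the other columns of each range, spaces elsewhere.  */
  std::string render () const
  {
    std::string out;
    for (std::size_t i = 0; i < m_ranges.size (); i++)
      {
	const toy_range &r = m_ranges[i];
	if (out.size () < r.finish)
	  out.resize (r.finish, ' ');
	for (unsigned int col = r.start; col <= r.finish; col++)
	  out[col - 1] = (r.show_caret_p && col == r.caret) ? '^' : '~';
      }
    return out;
  }

private:
  std::vector<toy_range> m_ranges;
};
```

With a table entry { 8, 8, 13, true } for the L"nop" token (columns 8
through 13 of the source line), constructing toy_rich_location from
just the table and the location and calling render () yields seven
spaces followed by "^~~~~~", mirroring the underline in the example
above.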

gcc/c-family/ChangeLog:
	* c-opts.c (c_common_init_options): Set
	global_dc->colorize_source_p.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Pass line_table to
	rich_location ctor.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.

gcc/cp/ChangeLog:
	* error.c (pedwarn_cxx98): Pass line_table to
	rich_location ctor.

gcc/ChangeLog:
	* diagnostic.c (diagnostic_append_note): Pass line_table to
	rich_location ctor.
	(emit_diagnostic): Likewise.
	(inform): Likewise.
	(inform_n): Likewise.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_n): Likewise.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(error): Likewise.
	(error_n): Likewise.
	(error_at): Likewise.
	(sorry): Likewise.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(real_abort): Likewise.
	* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
	Likewise.
	* genmatch.c (fatal_at): Likewise.
	(warning_at): Likewise.
	* rtl-error.c (diagnostic_for_asm): Likewise.

gcc/fortran/ChangeLog:
	* error.c (gfc_warning): Pass line_table to rich_location ctor.
	(gfc_warning_now_at): Likewise.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/diagnostic-token-ranges.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
	(test_show_locus): Pass line_table to rich_location ctors.
	(plugin_init): Remove setting of global_dc->colorize_source_p.
	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
	Remove include of gcc-rich-location.h.
	(get_range_for_expr): Delete.
	(gcc_rich_location::add_expr): Delete.
	(emit_warning): Change param from rich_location * to location_t.
	Require an ad-hoc location, and extract range from it.
	Use warning_at directly, without using a rich_location.
	(cb_walk_tree_fn): Pass EXPR_LOCATION (arg) directly to
	emit_warning, without creating a rich_location.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Pass pfile->line_table to
	rich_location ctor.
	(cpp_diagnostic_with_line): Likewise.
	* include/line-map.h (rich_location::rich_location): Add
	line_maps * param.
	* line-map.c (rich_location::rich_location): Likewise; use
	it to extract the range from the source_location.
---
 gcc/c-family/c-opts.c                              |   2 +
 gcc/c/c-decl.c                                     |   2 +-
 gcc/c/c-errors.c                                   |   4 +-
 gcc/cp/error.c                                     |   2 +-
 gcc/diagnostic.c                                   |  34 +++---
 gcc/fortran/error.c                                |  14 +--
 gcc/gcc-rich-location.h                            |   2 +-
 gcc/genmatch.c                                     |   8 +-
 gcc/rtl-error.c                                    |   2 +-
 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c     | 120 +++++++++++++++++++++
 .../plugin/diagnostic_plugin_test_show_locus.c     |  21 ++--
 .../diagnostic_plugin_test_tree_expression_range.c |  70 ++----------
 libcpp/errors.c                                    |   4 +-
 libcpp/include/line-map.h                          |   2 +-
 libcpp/line-map.c                                  |   7 +-
 15 files changed, 182 insertions(+), 112 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c

diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index bf2e6b0..681a542 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -247,6 +247,8 @@ c_common_init_options (unsigned int decoded_options_count,
 	    break;
 	  }
     }
+
+  global_dc->colorize_source_p = true;
 }
 
 /* Handle switch SCODE with argument ARG.  VALUE is true, unless no-
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 732080a..39c961e 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5297,7 +5297,7 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index 0f8b933..647423d 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,7 +42,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
@@ -81,7 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 2e2ff10..82d18e3 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3676,7 +3676,7 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 5fe6627..f1c5a96 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -868,7 +868,7 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
@@ -926,7 +926,7 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
@@ -953,7 +953,7 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
@@ -982,7 +982,7 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1001,7 +1001,7 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
@@ -1022,7 +1022,7 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
@@ -1060,7 +1060,7 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1092,7 +1092,7 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
@@ -1115,7 +1115,7 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
@@ -1151,7 +1151,7 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
@@ -1167,7 +1167,7 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1183,7 +1183,7 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
@@ -1214,7 +1214,7 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
@@ -1238,7 +1238,7 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
@@ -1257,7 +1257,7 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
@@ -1275,7 +1275,7 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
@@ -1346,7 +1346,7 @@ real_abort (void)
 void
 source_range::debug (const char *msg) const
 {
-  rich_location richloc (m_start);
+  rich_location richloc (line_table, m_start);
   richloc.add_range (m_start, m_finish);
   inform_at_rich_loc (&richloc, "%s", msg);
 }
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 4b3d31c..b4f7020 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,7 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1120,7 +1120,7 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (loc);
+  rich_location rich_loc (line_table, loc);
   bool ret;
 
   va_start (argp, gmsgid);
@@ -1138,7 +1138,7 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
@@ -1158,7 +1158,7 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
@@ -1176,7 +1176,7 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
@@ -1242,7 +1242,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
-  rich_location richloc (UNKNOWN_LOCATION);
+  rich_location richloc (line_table, UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1288,7 +1288,7 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
index c82cbf1..2f9291d 100644
--- a/gcc/gcc-rich-location.h
+++ b/gcc/gcc-rich-location.h
@@ -29,7 +29,7 @@ class gcc_rich_location : public rich_location
 
   /* Constructing from a location.  */
   gcc_rich_location (source_location loc) :
-    rich_location (loc) {}
+    rich_location (line_table, loc) {}
 
   /* Constructing from a source_range.  */
   gcc_rich_location (source_range src_range) :
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 6bfde06..24edc99 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -111,7 +111,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
-  rich_location richloc (tk->src_loc);
+  rich_location richloc (line_table, tk->src_loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
@@ -124,7 +124,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
@@ -137,7 +137,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
-  rich_location richloc (tk->src_loc);
+  rich_location richloc (line_table, tk->src_loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
@@ -150,7 +150,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index d28be1d..17c5a58 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,7 +69,7 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
-  rich_location richloc (location_for_asm (insn));
+  rich_location richloc (line_table, location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
 		       &richloc, kind);
diff --git a/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c b/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
new file mode 100644
index 0000000..ac969e3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
@@ -0,0 +1,120 @@
+/* { dg-options "-fdiagnostics-show-caret -Wc++-compat" } */
+
+/* Verify that various diagnostics show source code ranges.  */
+
+/* These ones merely use token ranges; they don't use tree ranges.  */
+
+void undeclared_identifier (void)
+{
+  name; /* { dg-error "'name' undeclared" } */
+/*
+{ dg-begin-multiline-output "" }
+   name;
+   ^~~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void unknown_type_name (void)
+{
+  foo bar; /* { dg-error "unknown type name 'foo'" } */
+/*
+{ dg-begin-multiline-output "" }
+   foo bar;
+   ^~~
+{ dg-end-multiline-output "" }
+*/
+
+  qux *baz; /* { dg-error "unknown type name 'qux'" } */
+/*
+{ dg-begin-multiline-output "" }
+   qux *baz;
+   ^~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void test_identifier_conflicts_with_cplusplus (void)
+{
+  int new; /* { dg-warning "identifier 'new' conflicts with" } */
+/*
+{ dg-begin-multiline-output "" }
+   int new;
+       ^~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+extern void
+bogus_varargs (...); /* { dg-error "ISO C requires a named argument before '...'" } */
+/*
+{ dg-begin-multiline-output "" }
+ bogus_varargs (...);
+                ^~~
+{ dg-end-multiline-output "" }
+*/
+
+extern void
+foo (unknown_type param); /* { dg-error "unknown type name 'unknown_type'" } */
+/*
+{ dg-begin-multiline-output "" }
+ foo (unknown_type param);
+      ^~~~~~~~~~~~
+{ dg-end-multiline-output "" }
+*/
+
+void wide_string_literal_in_asm (void)
+{
+  asm (L"nop"); /* { dg-error "wide string literal in 'asm'" } */
+/*
+{ dg-begin-multiline-output "" }
+   asm (L"nop");
+        ^~~~~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void break_and_continue_in_wrong_places (void)
+{
+  if (0)
+    break; /* { dg-error "break statement not within loop or switch" } */
+/* { dg-begin-multiline-output "" }
+     break;
+     ^~~~~
+   { dg-end-multiline-output "" } */
+
+  if (1)
+    ;
+  else
+    continue; /* { dg-error "continue statement not within a loop" } */
+/* { dg-begin-multiline-output "" }
+     continue;
+     ^~~~~~~~
+    { dg-end-multiline-output "" } */
+}
+
+/* Various examples of bad type decls.  */
+
+int float bogus; /* { dg-error "two or more data types in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ int float bogus;
+     ^~~~~
+    { dg-end-multiline-output "" } */
+
+long long long bogus2; /* { dg-error "'long long long' is too long for GCC" } */
+/* { dg-begin-multiline-output "" }
+ long long long bogus2;
+           ^~~~
+    { dg-end-multiline-output "" } */
+
+long short bogus3; /* { dg-error "both 'long' and 'short' in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ long short bogus3;
+      ^~~~~
+    { dg-end-multiline-output "" } */
+
+signed unsigned bogus4; /* { dg-error "both 'signed' and 'unsigned' in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ signed unsigned bogus4;
+        ^~~~~~~~
+    { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index 3471a4e..4c6120d 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -163,7 +163,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_simple"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line, 15));
+      rich_location richloc (line_table, get_loc (line, 15));
       richloc.add_range (get_loc (line, 10), get_loc (line, 14));
       richloc.add_range (get_loc (line, 16), get_loc (line, 16));
       warning_at_rich_loc (&richloc, 0, "test");
@@ -172,7 +172,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_simple_2"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line, 24));
+      rich_location richloc (line_table, get_loc (line, 24));
       richloc.add_range (get_loc (line, 6),
 			 get_loc (line, 22));
       richloc.add_range (get_loc (line, 26),
@@ -183,7 +183,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_multiline"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line + 1, 7));
+      rich_location richloc (line_table, get_loc (line + 1, 7));
       richloc.add_range (get_loc (line, 7),
 			 get_loc (line, 23));
       richloc.add_range (get_loc (line + 1, 9),
@@ -194,7 +194,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_many_lines"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line + 5, 7));
+      rich_location richloc (line_table, get_loc (line + 5, 7));
       richloc.add_range (get_loc (line, 7),
 			 get_loc (line + 4, 65));
       richloc.add_range (get_loc (line + 5, 9),
@@ -223,7 +223,7 @@ test_show_locus (function *fun)
       source_range src_range;
       src_range.m_start = get_loc (line, 12);
       src_range.m_finish = get_loc (line, 20);
-      rich_location richloc (caret);
+      rich_location richloc (line_table, caret);
       richloc.set_range (0, src_range, true, false);
       warning_at_rich_loc (&richloc, 0, "test");
     }
@@ -237,7 +237,7 @@ test_show_locus (function *fun)
       source_range src_range;
       src_range.m_start = get_loc (line, 90);
       src_range.m_finish = get_loc (line, 98);
-      rich_location richloc (caret);
+      rich_location richloc (line_table, caret);
       richloc.set_range (0, src_range, true, false);
       warning_at_rich_loc (&richloc, 0, "test");
     }
@@ -248,7 +248,7 @@ test_show_locus (function *fun)
       const int line = fnstart_line + 2;
       location_t caret_a = get_loc (line, 7);
       location_t caret_b = get_loc (line, 11);
-      rich_location richloc (caret_a);
+      rich_location richloc (line_table, caret_a);
       richloc.add_range (caret_b, caret_b, true);
       global_dc->caret_chars[0] = 'A';
       global_dc->caret_chars[1] = 'B';
@@ -269,7 +269,7 @@ test_show_locus (function *fun)
       const int line = fnstart_line + 3;
       location_t caret_a = get_loc (line, 5);
       location_t caret_b = get_loc (line - 1, 19);
-      rich_location richloc (caret_a);
+      rich_location richloc (line_table, caret_a);
       richloc.add_range (caret_b, caret_b, true);
       global_dc->caret_chars[0] = '1';
       global_dc->caret_chars[1] = '2';
@@ -304,11 +304,6 @@ plugin_init (struct plugin_name_args *plugin_info,
   if (!plugin_default_version_check (version, &gcc_version))
     return 1;
 
-  /* For now, tell the dc to expect ranges and thus to colorize the source
-     lines, not just the carets/underlines.  This will be redundant
-     once the C frontend generates ranges.  */
-  global_dc->colorize_source_p = true;
-
   for (int i = 0; i < argc; i++)
     {
       if (0 == strcmp (argv[i].key, "color"))
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
index 46e97b7..ca54278 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -29,72 +29,26 @@
 #include "plugin-version.h"
 #include "diagnostic.h"
 #include "context.h"
-#include "gcc-rich-location.h"
 #include "print-tree.h"
 
-/*
-  Hack: fails with linker error:
-./diagnostic_plugin_test_tree_expression_range.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
-  since nothing in the tree is using gcc_rich_location::add_expr yet.
-
-  I've tried various workarounds (adding DEBUG_FUNCTION to the
-  method, taking its address), but can't seem to fix it that way.
-  So as a nasty workaround, the following material is copied&pasted
-  from gcc-rich-location.c: */
-
-static bool
-get_range_for_expr (tree expr, location_range *r)
-{
-  if (EXPR_HAS_RANGE (expr))
-    {
-      source_range sr = EXPR_LOCATION_RANGE (expr);
-
-      /* Do we have meaningful data?  */
-      if (sr.m_start && sr.m_finish)
-	{
-	  r->m_start = expand_location (sr.m_start);
-	  r->m_finish = expand_location (sr.m_finish);
-	  return true;
-	}
-    }
-
-  return false;
-}
-
-/* Add a range to the rich_location, covering expression EXPR. */
-
-void
-gcc_rich_location::add_expr (tree expr)
-{
-  gcc_assert (expr);
-
-  location_range r;
-  r.m_show_caret_p = false;
-  if (get_range_for_expr (expr, &r))
-    add_range (&r);
-}
-
-/* FIXME: end of material taken from gcc-rich-location.c */
-
-
 int plugin_is_GPL_compatible;
 
 static void
-emit_warning (rich_location *richloc)
+emit_warning (location_t loc)
 {
-  if (richloc->get_num_locations () < 2)
+  if (!IS_ADHOC_LOC (loc))
     {
-      error_at_rich_loc (richloc, "range not found");
+      error_at (loc, "ad-hoc location not found");
       return;
     }
 
-  location_range *range = richloc->get_range (1);
-  warning_at_rich_loc (richloc, 0,
-		       "tree range %i:%i-%i:%i",
-		       range->m_start.line,
-		       range->m_start.column,
-		       range->m_finish.line,
-		       range->m_finish.column);
+  source_range src_range = get_range_from_adhoc_loc (line_table, loc);
+  warning_at (loc, 0,
+	      "tree range %i:%i-%i:%i",
+	      LOCATION_LINE (src_range.m_start),
+	      LOCATION_COLUMN (src_range.m_start),
+	      LOCATION_LINE (src_range.m_finish),
+	      LOCATION_COLUMN (src_range.m_finish));
 }
 
 tree
@@ -117,9 +71,7 @@ cb_walk_tree_fn (tree * tp, int * walk_subtrees,
   /* Get arg 1; print it! */
   tree arg = CALL_EXPR_ARG (call_expr, 1);
 
-  gcc_rich_location richloc (EXPR_LOCATION (arg));
-  richloc.add_expr (arg);
-  emit_warning (&richloc);
+  emit_warning (EXPR_LOCATION (arg));
 
   return NULL_TREE;
 }
diff --git a/libcpp/errors.c b/libcpp/errors.c
index c351c11..8790e10 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,7 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  rich_location richloc (src_loc);
+  rich_location richloc (pfile->line_table, src_loc);
   ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
@@ -140,7 +140,7 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  rich_location richloc (src_loc);
+  rich_location richloc (pfile->line_table, src_loc);
   richloc.override_column (column);
   ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index de1c55c..0ef29d9 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -1192,7 +1192,7 @@ class rich_location
   /* Constructors.  */
 
   /* Constructing from a location.  */
-  rich_location (source_location loc);
+  rich_location (line_maps *set, source_location loc);
 
   /* Constructing from a source_range.  */
   rich_location (source_range src_range);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3810c88..2cbd56a 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1774,13 +1774,14 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
 
 /* Construct a rich_location with location LOC as its initial range.  */
 
-rich_location::rich_location (source_location loc) :
+rich_location::rich_location (line_maps *set, source_location loc) :
   m_loc (loc),
   m_num_ranges (0),
   m_have_expanded_location (false)
 {
-  /* Set up the 0th range: */
-  add_range (loc, loc, true);
+  /* Set up the 0th range, extracting any range from LOC.  */
+  source_range src_range = get_range_from_loc (set, loc);
+  add_range (src_range, true);
   m_ranges[0].m_caret = lazily_expand_location ();
 }
 
-- 
1.8.5.3


* [PATCH 10/10] Compress short ranges into source_location
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (5 preceding siblings ...)
  2015-10-23 20:26   ` [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized) David Malcolm
@ 2015-10-23 20:26   ` David Malcolm
  2015-10-30  6:07     ` Jeff Law
  2015-11-04 20:42     ` Dodji Seketeli
  2015-10-23 20:26   ` [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics David Malcolm
                     ` (3 subsequent siblings)
  10 siblings, 2 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

The attached patch implements a bit-packing scheme so that short ranges
can be stored directly within a 32-bit source_location (aka location_t)
without needing to use the ad-hoc table.  The intent is to mitigate the
overhead introduced in the earlier patch that added ranges for all tokens
in libcpp: every token up to 2**N characters long can be stored without
needing the ad-hoc table.  Other short ranges for expressions can be
stored compactly, provided that caret==start.
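
As a rough illustration of the condition described above, the following
sketch tests whether a range is short enough to be packed inline.  It is
not the patch's can_be_stored_compactly_p: the helper name toy_fits_inline
and its plain-integer arguments are hypothetical, and it assumes that one
column step is worth (1 << range_bits) location values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not part of libcpp.
// Locations are modelled as plain integers in which moving one column
// to the right adds (1 << range_bits).
static bool
toy_fits_inline (uint32_t caret, uint32_t start, uint32_t finish,
                 unsigned range_bits)
{
  assert (range_bits < 16);

  // Only ranges whose caret coincides with their start are packed.
  if (caret != start)
    return false;

  // A finish before the start is not a usable range.
  if (finish < start)
    return false;

  // Distance from start to finish, measured in columns.
  uint32_t col_diff = (finish - start) >> range_bits;

  // The distance must fit in the low range_bits of the location.
  return col_diff < (1u << range_bits);
}
```

With range_bits == 5 this accepts ranges spanning up to 32 columns; anything
longer, or anything with caret != start, would fall back to the ad-hoc table.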

N is currently 5.  This is somewhat arbitrary, but seems to work.
The default number of column bits remains 7, meaning that the low 12 bits
of ordinary location_t values hold the column and any packed range.
More details of the packing scheme can be seen in the patch's change
to line-map.h.
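
To make the layout concrete, here is a minimal sketch of how a location
could be split into line, column and packed-range fields under the defaults
above (7 column bits plus 5 range bits).  The struct toy_map and the toy_*
helpers are illustrative names only; in particular, treating the low bits
as "extra columns past the start" is an assumption about the encoding, not
a quote from line-map.c.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an ordinary map; the field names are illustrative.
struct toy_map
{
  uint32_t start_location;         // low range_bits are assumed to be zero
  unsigned column_and_range_bits;  // 12 with the defaults above
  unsigned range_bits;             // 5 with the defaults above
};

// Encode a line offset within the map, a column, and the number of
// extra columns by which the range extends past its start.
static uint32_t
toy_pack (const toy_map &m, unsigned line_offset, unsigned column,
          unsigned extra_columns)
{
  assert (column < (1u << (m.column_and_range_bits - m.range_bits)));
  assert (extra_columns < (1u << m.range_bits));
  return m.start_location
         + (line_offset << m.column_and_range_bits)
         + (column << m.range_bits)
         + extra_columns;
}

// Line offset: everything above the column-and-range bits.
static unsigned
toy_line_offset (const toy_map &m, uint32_t loc)
{
  return (loc - m.start_location) >> m.column_and_range_bits;
}

// Column: the middle field, shifted down past the range bits.
static unsigned
toy_column (const toy_map &m, uint32_t loc)
{
  uint32_t mask = (1u << m.column_and_range_bits) - 1;
  return ((loc - m.start_location) & mask) >> m.range_bits;
}

// Packed range: the low range_bits.
static unsigned
toy_extra_columns (const toy_map &m, uint32_t loc)
{
  return (loc - m.start_location) & ((1u << m.range_bits) - 1);
}

// "Pure" location: the same position with the range bits cleared.
static uint32_t
toy_pure (const toy_map &m, uint32_t loc)
{
  return loc - toy_extra_columns (m, loc);
}
```

For a map starting at 8192, toy_pack (m, 0, 1, 0) is 8192 + 32 and
toy_pack (m, 1, 0, 0) is 8192 + 4096, which lines up with the "+32" and
"+4096" rows in the updated line-map.h comment.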

default_range_bits == 5 entails a 32-fold reduction in the size of the
code we can compile before the various fallbacks take effect (stopping
tracking columns then stopping tracking locations altogether).
In the former case, when we stop tracking columns, we also stop packing
ranges.
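
The 32-fold figure follows from simple arithmetic: each source line now
consumes 1 << (7 + 5) location values rather than 1 << 7.  A small sketch,
in which toy_lines_before_limit is a hypothetical helper and the limit is a
placeholder rather than the real threshold used by line-map.c:

```cpp
#include <cassert>
#include <cstdint>

// Number of source lines that fit below LIMIT when every line consumes
// (1 << bits_per_line) location values.  Illustrative only.
static constexpr uint64_t
toy_lines_before_limit (uint64_t limit, unsigned bits_per_line)
{
  return limit >> bits_per_line;
}

// 7 column bits alone versus 7 column bits plus 5 range bits:
// the budget of lines shrinks by a factor of 1 << 5 == 32.
static_assert (toy_lines_before_limit (1u << 30, 7)
               == 32 * toy_lines_before_limit (1u << 30, 7 + 5),
               "range_bits == 5 costs a factor of 32");
```

The ratio does not depend on the limit chosen, as long as it is a multiple
of 1 << 12.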

The range_bits needs to be per-ordinary_map, to cope with the case where
an ordinary map's range_and_column bits could be zero (e.g. due to a very
long line).
This requires figuring out which ordinary map a source_location is in
when generating compact ranges.  Luckily, this seems to hit the cached
map when tokenizing, avoiding the somewhat expensive binary search
through the ordinary maps.

Some benchmarks can be seen in this post:
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html

gcc/ada/ChangeLog:
	* gcc-interface/trans.c (Sloc_to_locus): Add line_table param when
	calling linemap_position_for_line_and_column.

gcc/ChangeLog:
	* input.c (dump_line_table_statistics): Dump stats on how many
	ranges were optimized vs how many needed ad-hoc table.
	(write_digit_row): Add "map" param; use its range_bits
	to calculate the per-character offset.
	(dump_location_info): Print the range and column bits for each
	ordinary map.  Use the range bits to calculate the per-character
	offset.  Pass the map as a new param to the various calls to
	write_digit_row.  Eliminate uses of
	ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
	* toplev.c (general_init): Initialize line_table's
	default_range_bits.
	* tree.c (get_pure_location): New function.
	(set_block): Use the pure form of the location for the
	caret in the combined location.
	(set_source_range): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (get_loc): Add
	line_table param when calling
	linemap_position_for_line_and_column.
	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
	(emit_warning): Remove restriction that "loc" must be ad-hoc.

libcpp/ChangeLog:
	* include/line-map.h (source_location): Update the descriptive
	comment to reflect the packing scheme for short ranges.
	(struct line_map_ordinary): Drop field "column_bits" in favor
	of field "m_column_and_range_bits"; add field "m_range_bits".
	(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
	(struct line_maps): Add fields "default_range_bits",
	"num_optimized_ranges" and "num_unoptimized_ranges".
	(get_range_from_adhoc_loc): Delete prototype.
	(get_range_from_loc): Convert from an inline function to a
	prototype.
	(pure_location_p): New prototype.
	(SOURCE_LINE): Update for renaming of column_bits.
	(SOURCE_COLUMN): Likewise.  Shift the column right by the map's
	range_bits.
	(LAST_SOURCE_LINE_LOCATION): Update for renaming of column_bits.
	(linemap_position_for_line_and_column): Add line_maps * params.
	* lex.c (_cpp_lex_direct): Don't attempt to record token ranges
	for UNKNOWN_LOCATION and BUILTINS_LOCATION.
	* line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): Reduce from 1U << 17 to
	1U << 9.
	(can_be_stored_compactly_p): New function.
	(get_combined_adhoc_loc): Implement bit-packing scheme for short
	ranges.
	(get_range_from_adhoc_loc): Make static.
	(get_range_from_loc): New function.
	(pure_location_p): New function.
	(linemap_add): Ensure that start_location has zero for the
	range_bits, unless we're past LINE_MAP_MAX_LOCATION_WITH_COLS.
	Initialize range_bits to zero.  Assert that the start_location
	is "pure".
	(linemap_line_start): Assert that the
	column_and_range_bits >= range_bits.
	Update determination of whether we need to start a new map
	using the effective column bits, without the range bits.
	Use the set's default_range_bits in new maps, apart from
	those with column_bits == 0, which should also have 0 range_bits.
	Increase the column bits for new maps by the range bits.
	When adding lines to an existing map, use set->highest_line
	directly rather than offsetting highest by SOURCE_COLUMN.
	Add assertions to sanity-check the return value.
	(linemap_position_for_column): Offset to_column by range_bits.
	Update set->highest_location if necessary.
	(linemap_position_for_line_and_column): Add line_maps * param.
	Update the calculation to offset the column by range_bits, and
	conditionalize it on being <= LINE_MAP_MAX_LOCATION_WITH_COLS.
	Bound it by LINEMAPS_MACRO_LOWEST_LOCATION.  Update
	set->highest_location if necessary.
	(linemap_position_for_loc_and_offset): Pass "set" to
	linemap_position_for_line_and_column.
	* location-example.txt: Regenerate, showing new representation.
---
 gcc/ada/gcc-interface/trans.c                      |   3 +-
 gcc/input.c                                        |  28 ++-
 .../plugin/diagnostic_plugin_test_show_locus.c     |   3 +-
 .../diagnostic_plugin_test_tree_expression_range.c |   8 +-
 gcc/toplev.c                                       |   1 +
 gcc/tree.c                                         |  25 ++-
 libcpp/include/line-map.h                          | 121 +++++++----
 libcpp/lex.c                                       |   9 +-
 libcpp/line-map.c                                  | 229 +++++++++++++++++++--
 libcpp/location-example.txt                        | 188 +++++++++--------
 10 files changed, 450 insertions(+), 165 deletions(-)

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index f1e2dcb..c3ff66a 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -9618,7 +9618,8 @@ Sloc_to_locus (Source_Ptr Sloc, location_t *locus, bool clear_column)
     line = 1;
 
   /* Translate the location.  */
-  *locus = linemap_position_for_line_and_column (map, line, column);
+  *locus = linemap_position_for_line_and_column (line_table, map,
+						 line, column);
 
   return true;
 }
diff --git a/gcc/input.c b/gcc/input.c
index baf8e7e..6aae857 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -878,6 +878,10 @@ dump_line_table_statistics (void)
 	   STAT_LABEL (s.adhoc_table_size));
   fprintf (stderr, "Ad-hoc table entries used:           %5ld\n",
 	   s.adhoc_table_entries_used);
+  fprintf (stderr, "optimized_ranges: %i\n",
+	   line_table->num_optimized_ranges);
+  fprintf (stderr, "unoptimized_ranges: %i\n",
+	   line_table->num_unoptimized_ranges);
 
   fprintf (stderr, "\n");
 }
@@ -908,13 +912,14 @@ write_digit (FILE *stream, int digit)
 
 static void
 write_digit_row (FILE *stream, int indent,
+		 const line_map_ordinary *map,
 		 source_location loc, int max_col, int divisor)
 {
   fprintf (stream, "%*c", indent, ' ');
   fprintf (stream, "|");
   for (int column = 1; column < max_col; column++)
     {
-      source_location column_loc = loc + column;
+      source_location column_loc = loc + (column << map->m_range_bits);
       write_digit (stream, column_loc / divisor);
     }
   fprintf (stream, "\n");
@@ -968,14 +973,20 @@ dump_location_info (FILE *stream)
       fprintf (stream, "  file: %s\n", ORDINARY_MAP_FILE_NAME (map));
       fprintf (stream, "  starting at line: %i\n",
 	       ORDINARY_MAP_STARTING_LINE_NUMBER (map));
+      fprintf (stream, "  column and range bits: %i\n",
+	       map->m_column_and_range_bits);
       fprintf (stream, "  column bits: %i\n",
-	       ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+	       map->m_column_and_range_bits - map->m_range_bits);
+      fprintf (stream, "  range bits: %i\n",
+	       map->m_range_bits);
 
       /* Render the span of source lines that this "map" covers.  */
       for (source_location loc = MAP_START_LOCATION (map);
 	   loc < end_location;
-	   loc++)
+	   loc += (1 << map->m_range_bits) )
 	{
+	  gcc_assert (pure_location_p (line_table, loc) );
+
 	  expanded_location exploc
 	    = linemap_expand_location (line_table, map, loc);
 
@@ -999,8 +1010,7 @@ dump_location_info (FILE *stream)
 		 Render the locations *within* the line, by underlining
 		 it, showing the source_location numeric values
 		 at each column.  */
-	      int max_col
-		= (1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)) - 1;
+	      int max_col = (1 << map->m_column_and_range_bits) - 1;
 	      if (max_col > line_size)
 		max_col = line_size + 1;
 
@@ -1008,17 +1018,17 @@ dump_location_info (FILE *stream)
 
 	      /* Thousands.  */
 	      if (end_location > 999)
-		write_digit_row (stream, indent, loc, max_col, 1000);
+		write_digit_row (stream, indent, map, loc, max_col, 1000);
 
 	      /* Hundreds.  */
 	      if (end_location > 99)
-		write_digit_row (stream, indent, loc, max_col, 100);
+		write_digit_row (stream, indent, map, loc, max_col, 100);
 
 	      /* Tens.  */
-	      write_digit_row (stream, indent, loc, max_col, 10);
+	      write_digit_row (stream, indent, map, loc, max_col, 10);
 
 	      /* Units.  */
-	      write_digit_row (stream, indent, loc, max_col, 1);
+	      write_digit_row (stream, indent, map, loc, max_col, 1);
 	    }
 	}
       fprintf (stream, "\n");
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index 4c6120d..14a8d91 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -109,7 +109,8 @@ get_loc (unsigned int line_num, unsigned int col_num)
 
   /* Convert from 0-based column numbers to 1-based column numbers.  */
   source_location loc
-    = linemap_position_for_line_and_column (line_map,
+    = linemap_position_for_line_and_column (line_table,
+					    line_map,
 					    line_num, col_num + 1);
 
   return loc;
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
index ca54278..89cc95a 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -36,13 +36,7 @@ int plugin_is_GPL_compatible;
 static void
 emit_warning (location_t loc)
 {
-  if (!IS_ADHOC_LOC (loc))
-    {
-      error_at (loc, "ad-hoc location not found");
-      return;
-    }
-
-  source_range src_range = get_range_from_adhoc_loc (line_table, loc);
+  source_range src_range = get_range_from_loc (line_table, loc);
   warning_at (loc, 0,
 	      "tree range %i:%i-%i:%i",
 	      LOCATION_LINE (src_range.m_start),
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 6d740d4..7067d96 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1147,6 +1147,7 @@ general_init (const char *argv0, bool init_signals)
   linemap_init (line_table, BUILTINS_LOCATION);
   line_table->reallocator = realloc_for_line_map;
   line_table->round_alloc_size = ggc_round_alloc_size;
+  line_table->default_range_bits = 5;
   init_ttree ();
 
   /* Initialize register usage now so switches may override.  */
diff --git a/gcc/tree.c b/gcc/tree.c
index a676352..4ec4a38 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13653,11 +13653,31 @@ nonnull_arg_p (const_tree arg)
   return false;
 }
 
+static location_t
+get_pure_location (location_t loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    loc
+      = line_table->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+
+  if (loc >= LINEMAPS_MACRO_LOWEST_LOCATION (line_table))
+    return loc;
+
+  if (loc < RESERVED_LOCATION_COUNT)
+    return loc;
+
+  const line_map *map = linemap_lookup (line_table, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  return loc & ~((1 << ordmap->m_range_bits) - 1);
+}
+
 location_t
 set_block (location_t loc, tree block)
 {
+  location_t pure_loc = get_pure_location (loc);
   source_range src_range = get_range_from_loc (line_table, loc);
-  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
+  return COMBINE_LOCATION_DATA (line_table, pure_loc, src_range, block);
 }
 
 void
@@ -13675,8 +13695,9 @@ set_source_range (tree expr, source_range src_range)
   if (!EXPR_P (expr))
     return;
 
+  location_t pure_loc = get_pure_location (EXPR_LOCATION (expr));
   location_t adhoc = COMBINE_LOCATION_DATA (line_table,
-					    EXPR_LOCATION (expr),
+					    pure_loc,
 					    src_range,
 					    NULL);
   SET_EXPR_LOCATION (expr, adhoc);
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 0ef29d9..1a2dab8 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -47,7 +47,8 @@ enum lc_reason
 typedef unsigned int linenum_type;
 
 /* The typedef "source_location" is a key within the location database,
-   identifying a source location or macro expansion.
+   identifying a source location or macro expansion, along with range
+   information, and (optionally) a pointer for use by gcc.
 
    This key only has meaning in relation to a line_maps instance.  Within
    gcc there is a single line_maps instance: "line_table", declared in
@@ -69,13 +70,48 @@ typedef unsigned int linenum_type;
              |  ordmap[0]->start_location)   | first line in ordmap 0
   -----------+-------------------------------+-------------------------------
              | ordmap[1]->start_location     | First line in ordmap 1
-             | ordmap[1]->start_location+1   | First column in that line
-             | ordmap[1]->start_location+2   | 2nd column in that line
-             |                               | Subsequent lines are offset by
-             |                               | (1 << column_bits),
-             |                               | e.g. 128 for 7 bits, with a
-             |                               | column value of 0 representing
-             |                               | "the whole line".
+             | ordmap[1]->start_location+32  | First column in that line
+             |   (assuming range_bits == 5)  |
+             | ordmap[1]->start_location+64  | 2nd column in that line
+             | ordmap[1]->start_location+4096| Second line in ordmap 1
+             |   (assuming column_bits == 12)
+             |
+             |   Subsequent lines are offset by (1 << column_bits),
+             |   e.g. 4096 for 12 bits, with a column value of 0 representing
+             |   "the whole line".
+             |
+             |   Within a line, the low "range_bits" (typically 5) are used for
+             |   storing short ranges, so that there's an offset of
+             |     (1 << range_bits) between individual columns within a line,
+             |   typically 32.
+             |   The low range_bits store the offset of the end point from the
+             |   start point, and the start point is found by masking away
+             |   the range bits.
+             |
+             |   For example:
+             |      ordmap[1]->start_location+64    "2nd column in that line"
+             |   above means a caret at that location, with a range
+             |   starting and finishing at the same place (the range bits
+             |   are 0), a range of length 1.
+             |
+             |   By contrast:
+             |      ordmap[1]->start_location+68
+             |   has range bits 0x4, meaning a caret with a range starting at
+             |   that location, but with endpoint 4 columns further on: a range
+             |   of length 5.
+             |
+             |   Ranges that have caret != start, or have an endpoint too
+             |   far away to fit in range_bits are instead stored as ad-hoc
+             |   locations.  Hence for range_bits == 5 we can compactly store
+             |   tokens of length <= 32 without needing to use the ad-hoc
+             |   table.
+             |
+             |   This packing scheme means we effectively have
+             |     (column_bits - range_bits)
+             |   of bits for the columns, typically (12 - 5) = 7, for 128
+             |   columns; longer line widths are accommodated by starting a
+             |   new ordmap with a higher column_bits.
+             |
              | ordmap[2]->start_location-1   | Final location in ordmap 1
   -----------+-------------------------------+-------------------------------
              | ordmap[2]->start_location     | First line in ordmap 2
@@ -205,8 +241,9 @@ struct GTY((tag ("0"), desc ("%h.reason == LC_ENTER_MACRO ? 2 : 1"))) line_map {
    
    Physical source file TO_FILE at line TO_LINE at column 0 is represented
    by the logical START_LOCATION.  TO_LINE+L at column C is represented by
-   START_LOCATION+(L*(1<<column_bits))+C, as long as C<(1<<column_bits),
-   and the result_location is less than the next line_map's start_location.
+   START_LOCATION+(L*(1<<m_column_and_range_bits))+(C*(1<<m_range_bits)), as
+   long as C<(1<<effective column bits), and the result_location is less than
+   the next line_map's start_location.
    (The top line is line 1 and the leftmost column is column 1; line/column 0
    means "entire file/line" or "unknown line/column" or "not applicable".)
 
@@ -226,8 +263,24 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
      cpp_buffer.  */
   unsigned char sysp;
 
-  /* Number of the low-order source_location bits used for a column number.  */
-  unsigned int column_bits : 8;
+  /* Number of the low-order source_location bits used for column numbers
+     and ranges.  */
+  unsigned int m_column_and_range_bits : 8;
+
+  /* Number of the low-order "column" bits used for storing short ranges
+     inline, rather than in the ad-hoc table.
+     MSB                                                                 LSB
+     31                                                                    0
+     +-------------------------+-------------------------------------------+
+     |                         |<---map->column_and_range_bits (e.g. 12)-->|
+     +-------------------------+-----------------------+-------------------+
+     |                         | column_and_range_bits | map->range_bits   |
+     |                         |   - range_bits        |                   |
+     +-------------------------+-----------------------+-------------------+
+     | row bits                | effective column bits | short range bits  |
+     |                         |    (e.g. 7)           |   (e.g. 5)        |
+     +-------------------------+-----------------------+-------------------+ */
+  unsigned int m_range_bits : 8;
 };
 
 /* This is the highest possible source location encoded within an
@@ -423,15 +476,6 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
-/* Get the number of the low-order source_location bits used for a
-   column number within ordinary map MAP.  */
-
-inline unsigned char
-ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (const line_map_ordinary *ord_map)
-{
-  return ord_map->column_bits;
-}
-
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -578,6 +622,12 @@ struct GTY(()) line_maps {
 
   /* True if we've seen a #line or # 44 "file" directive.  */
   bool seen_line_directive;
+
+  /* The default value of range_bits in ordinary line maps.  */
+  unsigned int default_range_bits;
+
+  unsigned int num_optimized_ranges;
+  unsigned int num_unoptimized_ranges;
 };
 
 /* Returns the number of allocated maps so far. MAP_KIND shall be TRUE
@@ -821,8 +871,10 @@ extern source_location get_combined_adhoc_loc (struct line_maps *,
 extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
 extern source_location get_location_from_adhoc_loc (struct line_maps *,
 						    source_location);
-extern source_range get_range_from_adhoc_loc (struct line_maps *,
-					      source_location);
+
+extern source_range
+get_range_from_loc (line_maps *set,
+		    source_location loc);
 
 /* Get whether location LOC is an ad-hoc location.  */
 
@@ -832,15 +884,11 @@ IS_ADHOC_LOC (source_location loc)
   return (loc & MAX_SOURCE_LOCATION) != loc;
 }
 
-inline source_range
-get_range_from_loc (struct line_maps *set,
-		    source_location loc)
-{
-  if (IS_ADHOC_LOC (loc))
-    return get_range_from_adhoc_loc (set, loc);
-  else
-    return source_range::from_location (loc);
-}
+/* Get whether location LOC is a "pure" location, or
+   whether it is an ad-hoc location, or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc);
 
 /* Combine LOC and BLOCK, giving a combined adhoc location.  */
 
@@ -936,7 +984,7 @@ inline linenum_type
 SOURCE_LINE (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-	  >> ord_map->column_bits) + ord_map->to_line;
+	  >> ord_map->m_column_and_range_bits) + ord_map->to_line;
 }
 
 /* Convert a map and source_location to source column number.  */
@@ -944,7 +992,7 @@ inline linenum_type
 SOURCE_COLUMN (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-	  & ((1 << ord_map->column_bits) - 1));
+	  & ((1 << ord_map->m_column_and_range_bits) - 1)) >> ord_map->m_range_bits;
 }
 
 /* Return the location of the last source line within an ordinary
@@ -954,7 +1002,7 @@ LAST_SOURCE_LINE_LOCATION (const line_map_ordinary *map)
 {
   return (((map[1].start_location - 1
 	    - map->start_location)
-	   & ~((1 << map->column_bits) - 1))
+	   & ~((1 << map->m_column_and_range_bits) - 1))
 	  + map->start_location);
 }
 
@@ -1004,7 +1052,8 @@ linemap_position_for_column (struct line_maps *, unsigned int);
 /* Encode and return a source location from a given line and
    column.  */
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *,
+linemap_position_for_line_and_column (line_maps *set,
+				      const line_map_ordinary *,
 				      linenum_type, unsigned int);
 
 /* Encode and return a source_location starting from location LOC and
diff --git a/libcpp/lex.c b/libcpp/lex.c
index f4c964f..6f46a7f 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -2725,9 +2725,12 @@ _cpp_lex_direct (cpp_reader *pfile)
 
   source_range tok_range;
   tok_range.m_start = result->src_loc;
-  tok_range.m_finish =
-    linemap_position_for_column (pfile->line_table,
-				 CPP_BUF_COLUMN (buffer, buffer->cur));
+  if (result->src_loc >= RESERVED_LOCATION_COUNT)
+    tok_range.m_finish =
+      linemap_position_for_column (pfile->line_table,
+				   CPP_BUF_COLUMN (buffer, buffer->cur));
+  else
+    tok_range.m_finish = tok_range.m_start;
 
   result->src_loc = COMBINE_LOCATION_DATA (pfile->line_table,
 					   result->src_loc,
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 6385fdf..fe8d784 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -29,7 +29,7 @@ along with this program; see the file COPYING3.  If not see
 /* Do not track column numbers higher than this one.  As a result, the
    range of column_bits is [7, 18] (or 0 if column numbers are
    disabled).  */
-const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17);
+const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 9);
 
 /* Do not track column numbers if locations get higher than this.  */
 const source_location LINE_MAP_MAX_LOCATION_WITH_COLS = 0x60000000;
@@ -112,6 +112,49 @@ rebuild_location_adhoc_htab (struct line_maps *set)
 		    set->location_adhoc_data_map.data + i, INSERT);
 }
 
+/* Helper function for get_combined_adhoc_loc.
+   Can the given LOCUS + SRC_RANGE and DATA pointer be stored compactly
+   within a source_location, without needing to use an ad-hoc location.  */
+
+static bool
+can_be_stored_compactly_p (struct line_maps *set,
+			   source_location locus,
+			   source_range src_range,
+			   void *data)
+{
+  /* If there's an ad-hoc pointer, we can't store it directly in the
+     source_location, we need the lookaside.  */
+  if (data)
+    return false;
+
+  /* We only store ranges that begin at the locus and that are sufficiently
+     "sane".  */
+  if (src_range.m_start != locus)
+    return false;
+
+  if (src_range.m_finish < src_range.m_start)
+    return false;
+
+  if (src_range.m_start < RESERVED_LOCATION_COUNT)
+    return false;
+
+  if (locus >= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return false;
+
+  /* All 3 locations must be within ordinary maps (typically the same
+     ordinary map).  */
+  source_location lowest_macro_loc = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (locus >= lowest_macro_loc)
+    return false;
+  if (src_range.m_start >= lowest_macro_loc)
+    return false;
+  if (src_range.m_finish >= lowest_macro_loc)
+    return false;
+
+  /* Passed all tests.  */
+  return true;
+}
+
 /* Combine LOCUS and DATA to a combined adhoc loc.  */
 
 source_location
@@ -128,6 +171,60 @@ get_combined_adhoc_loc (struct line_maps *set,
       = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
     return 0;
+
+  /* Any ordinary locations ought to be "pure" at this point: no
+     compressed ranges.  */
+  linemap_assert (locus < RESERVED_LOCATION_COUNT
+		  || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
+		  || locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)
+		  || pure_location_p (set, locus));
+
+#define DEBUG_PACKING 0
+
+#if DEBUG_PACKING
+  fprintf (stderr, "get_combined_adhoc_loc: %x %x %x\n",
+	   locus, src_range.m_start, src_range.m_finish);
+#endif
+
+  /* Consider short-range optimization.  */
+  if (can_be_stored_compactly_p (set, locus, src_range, data))
+    {
+      /* The low bits ought to be clear.  */
+      linemap_assert (pure_location_p (set, locus));
+      const line_map *map = linemap_lookup (set, locus);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      unsigned int int_diff = src_range.m_finish - src_range.m_start;
+      unsigned int col_diff = (int_diff >> ordmap->m_range_bits);
+      if (col_diff < (1U << ordmap->m_range_bits))
+	{
+	  source_location packed = locus | col_diff;
+	  set->num_optimized_ranges++;
+#if DEBUG_PACKING
+	  fprintf (stderr, "  optimized to %x\n", packed);
+#endif
+	  return packed;
+	}
+    }
+
+  /* We can also compactly store the reserved locations
+     when locus == start == finish (and data is NULL).  */
+  if (locus < RESERVED_LOCATION_COUNT
+      && locus == src_range.m_start
+      && locus == src_range.m_finish
+      && !data)
+    {
+#if DEBUG_PACKING
+      fprintf (stderr, "  using reserved location: %x\n", locus);
+#endif
+      return locus;
+    }
+
+#if DEBUG_PACKING
+  fprintf (stderr, "  unoptimized\n");
+#endif
+  if (!data)
+    set->num_unoptimized_ranges++;
+
   lb.locus = locus;
   lb.src_range = src_range;
   lb.data = data;
@@ -184,13 +281,58 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 }
 
-source_range
+static source_range
 get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
 {
   linemap_assert (IS_ADHOC_LOC (loc));
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
 }
 
+/* Get the source_range of location LOC, either from the ad-hoc
+   lookaside table, or embedded inside LOC itself.  */
+
+source_range
+get_range_from_loc (struct line_maps *set,
+		    source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (set, loc);
+
+  /* For ordinary maps, extract packed range.  */
+  if (loc >= RESERVED_LOCATION_COUNT
+      && loc < LINEMAPS_MACRO_LOWEST_LOCATION (set)
+      && loc <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      const line_map *map = linemap_lookup (set, loc);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      source_range result;
+      int offset = loc & ((1 << ordmap->m_range_bits) - 1);
+      result.m_start = loc - offset;
+      result.m_finish = result.m_start + (offset << ordmap->m_range_bits);
+      return result;
+    }
+
+  return source_range::from_location (loc);
+}
+
+/* Get whether location LOC is a "pure" location, or
+   whether it is an ad-hoc location, or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return false;
+
+  const line_map *map = linemap_lookup (set, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  if (loc & ((1U << ordmap->m_range_bits) - 1))
+    return false;
+
+  return true;
+}
+
 /* Finalize the location_adhoc_data structure.  */
 void
 location_adhoc_data_fini (struct line_maps *set)
@@ -333,7 +475,19 @@ const struct line_map *
 linemap_add (struct line_maps *set, enum lc_reason reason,
 	     unsigned int sysp, const char *to_file, linenum_type to_line)
 {
-  source_location start_location = set->highest_location + 1;
+  /* Generate a start_location above the current highest_location.
+     If possible, make the low range bits be zero.  */
+  source_location start_location;
+  if (set->highest_location < LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      start_location = set->highest_location + (1 << set->default_range_bits);
+      if (set->default_range_bits)
+	start_location &= ~((1 << set->default_range_bits) - 1);
+      linemap_assert (0 == (start_location
+			    & ((1 << set->default_range_bits) - 1)));
+    }
+  else
+    start_location = set->highest_location + 1;
 
   linemap_assert (!(LINEMAPS_ORDINARY_USED (set)
 		    && (start_location
@@ -412,11 +566,18 @@ linemap_add (struct line_maps *set, enum lc_reason reason,
   map->to_file = to_file;
   map->to_line = to_line;
   LINEMAPS_ORDINARY_CACHE (set) = LINEMAPS_ORDINARY_USED (set) - 1;
-  map->column_bits = 0;
+  map->m_column_and_range_bits = 0;
+  map->m_range_bits = 0;
   set->highest_location = start_location;
   set->highest_line = start_location;
   set->max_column_hint = 0;
 
+  /* This assertion is placed after set->highest_location has
+     been updated, since the latter affects
+     linemap_location_from_macro_expansion_p, which ultimately affects
+     pure_location_p.  */
+  linemap_assert (pure_location_p (set, start_location));
+
   if (reason == LC_ENTER)
     {
       map->included_from =
@@ -563,13 +724,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
     SOURCE_LINE (map, set->highest_line);
   int line_delta = to_line - last_line;
   bool add_map = false;
+  linemap_assert (map->m_column_and_range_bits >= map->m_range_bits);
+  int effective_column_bits = map->m_column_and_range_bits - map->m_range_bits;
 
   if (line_delta < 0
       || (line_delta > 10
-	  && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
-      || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
-      || (max_column_hint <= 80
-	  && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
+	  && line_delta * map->m_column_and_range_bits > 1000)
+      || (max_column_hint >= (1U << effective_column_bits))
+      || (max_column_hint <= 80 && effective_column_bits >= 10)
       || (highest > LINE_MAP_MAX_LOCATION_WITH_COLS
 	  && (set->max_column_hint || highest >= LINE_MAP_MAX_SOURCE_LOCATION)))
     add_map = true;
@@ -578,22 +740,27 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (add_map)
     {
       int column_bits;
+      int range_bits;
       if (max_column_hint > LINE_MAP_MAX_COLUMN_NUMBER
 	  || highest > LINE_MAP_MAX_LOCATION_WITH_COLS)
 	{
 	  /* If the column number is ridiculous or we've allocated a huge
-	     number of source_locations, give up on column numbers. */
+	     number of source_locations, give up on column numbers
+	     (and on packed ranges).  */
 	  max_column_hint = 0;
 	  column_bits = 0;
+	  range_bits = 0;
 	  if (highest > LINE_MAP_MAX_SOURCE_LOCATION)
 	    return 0;
 	}
       else
 	{
 	  column_bits = 7;
+	  range_bits = set->default_range_bits;
 	  while (max_column_hint >= (1U << column_bits))
 	    column_bits++;
 	  max_column_hint = 1U << column_bits;
+	  column_bits += range_bits;
 	}
       /* Allocate the new line_map.  However, if the current map only has a
 	 single line we can sometimes just increase its column_bits instead. */
@@ -606,14 +773,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
 				ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
 				ORDINARY_MAP_FILE_NAME (map),
 				to_line)));
-      map->column_bits = column_bits;
+      map->m_column_and_range_bits = column_bits;
+      map->m_range_bits = range_bits;
       r = (MAP_START_LOCATION (map)
 	   + ((to_line - ORDINARY_MAP_STARTING_LINE_NUMBER (map))
 	      << column_bits));
     }
   else
-    r = highest - SOURCE_COLUMN (map, highest)
-      + (line_delta << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+    r = set->highest_line + (line_delta << map->m_column_and_range_bits);
 
   /* Locations of ordinary tokens are always lower than locations of
      macro tokens.  */
@@ -624,6 +791,18 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (r > set->highest_location)
     set->highest_location = r;
   set->max_column_hint = max_column_hint;
+
+  /* At this point, we expect one of:
+     (a) the normal case: a "pure" location with 0 range bits, or
+     (b) we've gone past LINE_MAP_MAX_LOCATION_WITH_COLS so can't track
+        columns anymore (or ranges), or
+     (c) we're in a region with a column hint exceeding
+        LINE_MAP_MAX_COLUMN_NUMBER, so column-tracking is off,
+	with column_bits == 0.  */
+  linemap_assert (pure_location_p (set, r)
+		  || r >= LINE_MAP_MAX_LOCATION_WITH_COLS
+		  || map->m_column_and_range_bits == 0);
+  linemap_assert (SOURCE_LINE (map, r) == to_line);
   return r;
 }
 
@@ -654,7 +833,8 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
 	  r = linemap_line_start (set, SOURCE_LINE (map, r), to_column + 50);
 	}
     }
-  r = r + to_column;
+  line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (set);
+  r = r + (to_column << map->m_range_bits);
   if (r >= set->highest_location)
     set->highest_location = r;
   return r;
@@ -664,16 +844,25 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
    column.  */
 
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *ord_map,
+linemap_position_for_line_and_column (line_maps *set,
+				      const line_map_ordinary *ord_map,
 				      linenum_type line,
 				      unsigned column)
 {
   linemap_assert (ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map) <= line);
 
-  return (MAP_START_LOCATION (ord_map)
-	  + ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
-	     << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map))
-	  + (column & ((1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map)) - 1)));
+  source_location r = MAP_START_LOCATION (ord_map);
+  r += ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
+	<< ord_map->m_column_and_range_bits);
+  if (r <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    r += ((column & ((1 << ord_map->m_column_and_range_bits) - 1))
+	  << ord_map->m_range_bits);
+  source_location upper_limit = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (r >= upper_limit)
+    r = upper_limit - 1;
+  if (r > set->highest_location)
+    set->highest_location = r;
+  return r;
 }
 
 /* Encode and return a source_location starting from location LOC and
@@ -728,11 +917,11 @@ linemap_position_for_loc_and_offset (struct line_maps *set,
     }
 
   offset += column;
-  if (linemap_assert_fails (offset < (1u << map->column_bits)))
+  if (linemap_assert_fails (offset < (1u << map->m_column_and_range_bits)))
     return loc;
 
   source_location r = 
-    linemap_position_for_line_and_column (map, line, offset);
+    linemap_position_for_line_and_column (set, map, line, offset);
   if (linemap_assert_fails (r <= set->highest_location)
       || linemap_assert_fails (map == linemap_lookup (set, r)))
     return loc;
diff --git a/libcpp/location-example.txt b/libcpp/location-example.txt
index a5f95b2..14b5c2e 100644
--- a/libcpp/location-example.txt
+++ b/libcpp/location-example.txt
@@ -30,142 +30,154 @@ RESERVED LOCATIONS
   source_location interval: 0 <= loc < 2
 
 ORDINARY MAP: 0
-  source_location interval: 2 <= loc < 3
+  source_location interval: 32 <= loc < 64
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc:    2|#include "test.h"
-                    |00000001111111111
-                    |34567890123456789
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:   32|#include "test.h"
+                    |69269258258148147
+                    |46802468024680246
 
 ORDINARY MAP: 1
-  source_location interval: 3 <= loc < 4
+  source_location interval: 64 <= loc < 96
   file: <built-in>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 2
-  source_location interval: 4 <= loc < 5
+  source_location interval: 96 <= loc < 128
   file: <command-line>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 3
-  source_location interval: 5 <= loc < 5005
+  source_location interval: 128 <= loc < 160128
   file: /usr/include/stdc-predef.h
   starting at line: 1
-  column bits: 7
+  column bits: 12
+  range bits: 5
 (contents of /usr/include/stdc-predef.h snipped for brevity)
 
 ORDINARY MAP: 4
-  source_location interval: 5005 <= loc < 5006
+  source_location interval: 160128 <= loc < 160160
   file: <command-line>
-  starting at line: 1
-  column bits: 7
+  starting at line: 32
+  column bits: 12
+  range bits: 5
 
 ORDINARY MAP: 5
-  source_location interval: 5006 <= loc < 5134
+  source_location interval: 160160 <= loc < 164256
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc: 5006|#include "test.h"
-                    |55555555555555555
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:160160|#include "test.h"
                     |00000000000000000
-                    |00011111111112222
-                    |78901234567890123
+                    |12223334445556667
+                    |92582581481470470
+                    |24680246802468024
 
 ORDINARY MAP: 6
-  source_location interval: 5134 <= loc < 5416
+  source_location interval: 164256 <= loc < 173280
   file: test.h
   starting at line: 1
-  column bits: 7
-test.h:  1|loc: 5134|extern int foo ();
-                    |555555555555555555
-                    |111111111111111111
-                    |333334444444444555
-                    |567890123456789012
-test.h:  2|loc: 5262|
+  column bits: 12
+  range bits: 5
+test.h:  1|loc:164256|extern int foo ();
+                    |444444444444444444
+                    |233344455566677788
+                    |825814814704703603
+                    |802468024680246802
+test.h:  2|loc:168352|
                     |
                     |
                     |
                     |
-test.h:  3|loc: 5390|#define PLUS(A, B) A + B
-                    |555555555555555555555555
-                    |333333333444444444444444
-                    |999999999000000000011111
-                    |123456789012345678901234
+test.h:  3|loc:172448|#define PLUS(A, B) A + B
+                    |222222222222222223333333
+                    |455566677788889990001112
+                    |814704703603692692582581
+                    |024680246802468024680246
 
 ORDINARY MAP: 7
-  source_location interval: 5416 <= loc < 6314
+  source_location interval: 173280 <= loc < 202016
   file: test.c
   starting at line: 2
-  column bits: 7
-test.c:  2|loc: 5416|
+  column bits: 12
+  range bits: 5
+test.c:  2|loc:173280|
                     |
                     |
                     |
                     |
-test.c:  3|loc: 5544|int
-                    |555
-                    |555
+test.c:  3|loc:177376|int
+                    |777
                     |444
-                    |567
-test.c:  4|loc: 5672|main (int argc, char **argv)
-                    |5555555555555555555555555555
-                    |6666666666666666666666666667
-                    |7777777888888888899999999990
-                    |3456789012345678901234567890
-test.c:  5|loc: 5800|{
+                    |047
+                    |802
+test.c:  4|loc:181472|main (int argc, char **argv)
+                    |1111111111111111222222222222
+                    |5556666777888999000111222333
+                    |0360369269258258148147047036
+                    |4680246802468024680246802468
+test.c:  5|loc:185568|{
                     |5
-                    |8
-                    |0
-                    |1
-test.c:  6|loc: 5928|  int a = PLUS (1,2);
-                    |555555555555555555555
-                    |999999999999999999999
-                    |233333333334444444444
-                    |901234567890123456789
-test.c:  7|loc: 6056|  int b = PLUS (3,4);
-                    |666666666666666666666
-                    |000000000000000000000
-                    |555666666666677777777
-                    |789012345678901234567
-test.c:  8|loc: 6184|  return 0;
-                    |66666666666
-                    |11111111111
-                    |88888999999
-                    |56789012345
-test.c:  9|loc: 6312|}
                     |6
-                    |3
+                    |0
+                    |0
+test.c:  6|loc:189664|  int a = PLUS (1,2);
+                    |999999999900000000000
+                    |677788899900011122233
+                    |926925825814814704703
+                    |680246802468024680246
+test.c:  7|loc:193760|  int b = PLUS (3,4);
+                    |333333344444444444444
+                    |788899900011122233344
+                    |925825814814704703603
+                    |246802468024680246802
+test.c:  8|loc:197856|  return 0;
+                    |77778888888
+                    |89990001112
+                    |82581481470
+                    |80246802468
+test.c:  9|loc:201952|}
                     |1
-                    |3
+                    |9
+                    |8
+                    |4
 
 UNALLOCATED LOCATIONS
-  source_location interval: 6314 <= loc < 2147483633
+  source_location interval: 202016 <= loc < 2147483633
 
 MACRO 1: PLUS (7 tokens)
   source_location interval: 2147483633 <= loc < 2147483640
-test.c:7:11: note: expansion point is location 6067
+test.c:7:11: note: expansion point is location 194115
    int b = PLUS (3,4);
-           ^
+           ^~~~
+
   map->start_location: 2147483633
   macro_locations:
-    0: 6073, 5410
-test.c:7:17: note: token 0 has x-location == 6073
+    0: 194304, 173088
+test.c:7:17: note: token 0 has x-location == 194304
    int b = PLUS (3,4);
                  ^
-test.c:7:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:7:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 6075, 5414
-test.c:7:19: note: token 2 has x-location == 6075
+
+    2: 194368, 173216
+test.c:7:19: note: token 2 has x-location == 194368
    int b = PLUS (3,4);
                    ^
-test.c:7:19: note: token 2 has y-location == 5414
+
+test.c:7:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
@@ -178,26 +190,30 @@ x-location == y-location == 2947526575 encodes token # 800042942
 
 MACRO 0: PLUS (7 tokens)
   source_location interval: 2147483640 <= loc < 2147483647
-test.c:6:11: note: expansion point is location 5939
+test.c:6:11: note: expansion point is location 190019
    int a = PLUS (1,2);
-           ^
+           ^~~~
+
   map->start_location: 2147483640
   macro_locations:
-    0: 5945, 5410
-test.c:6:17: note: token 0 has x-location == 5945
+    0: 190208, 173088
+test.c:6:17: note: token 0 has x-location == 190208
    int a = PLUS (1,2);
                  ^
-test.c:6:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:6:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 5947, 5414
-test.c:6:19: note: token 2 has x-location == 5947
+
+    2: 190272, 173216
+test.c:6:19: note: token 2 has x-location == 190272
    int a = PLUS (1,2);
                    ^
-test.c:6:19: note: token 2 has y-location == 5414
+
+test.c:6:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
-- 
1.8.5.3


* [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized)
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (4 preceding siblings ...)
  2015-10-23 20:26   ` [PATCH 08/10] Wire things up so that libcpp users get token underlines David Malcolm
@ 2015-10-23 20:26   ` David Malcolm
  2015-10-27 21:29     ` Jeff Law
  2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This patch:

  - generalizes the meaning of the source_location (aka location_t) type
    from being a caret within the source code to being a caret plus
    a source_range.  The latter data is stored purely in the ad-hoc data
    lookaside for now.

  - captures ranges for libcpp tokens by generating source_location
    values with caret == start and finish == the last character of the
    token

This is elegant, since we can store caret+range data in location_t as
before, without having to track ranges everywhere: all location_t
values (such as input_location) become the ranges of tokens; a followup
patch fixes the diagnostic machinery to automatically extract the
ranges when building rich_location instances, which means that all
calls to warning, warning_at, error, error_at etc get underlines
showing the range of the pertinent token "for free".

However, it's inefficient, since it means generating an ad-hoc location
for every token in libcpp.  A followup patch optimizes this by packing
short ranges into location_t, without needing to use ad-hoc locations,
which covers most tokens efficiently.
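
To make the lookaside scheme above concrete, here is a minimal sketch of
the idea, not the libcpp implementation.  The names toy_line_table,
adhoc_entry, combine_location_data and make_token_location are
hypothetical; the real table deduplicates entries through a hash table,
which is omitted here.  It also assumes that consecutive columns on a
line map to consecutive location values, and that a non-ad-hoc location
has a degenerate range at its caret.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for libcpp's types.  The names echo the patch, but
// the layout is an assumption made purely for illustration.
typedef std::uint32_t source_location;

struct source_range
{
  source_location m_start;
  source_location m_finish;
};

struct adhoc_entry
{
  source_location locus;   // the caret
  source_range src_range;  // first and last character covered
  void *data;              // e.g. the enclosing block; may be null
};

// Values with the top bit set are indices into the lookaside table.
const source_location MAX_SOURCE_LOCATION = 0x7FFFFFFF;
const source_location ADHOC_BIT = 0x80000000u;

struct toy_line_table
{
  std::vector<adhoc_entry> adhoc;
};

inline bool
is_adhoc_loc (source_location loc)
{
  return (loc & MAX_SOURCE_LOCATION) != loc;
}

// Record caret + range (+ data) and hand back an ad-hoc location.
// Unlike the real code, every call appends a fresh entry.
inline source_location
combine_location_data (toy_line_table &table, source_location caret,
                       source_range range, void *data)
{
  if (is_adhoc_loc (caret))
    caret = table.adhoc[caret & MAX_SOURCE_LOCATION].locus;
  adhoc_entry e = { caret, range, data };
  table.adhoc.push_back (e);
  return ADHOC_BIT | (source_location) (table.adhoc.size () - 1);
}

// Recover the range: from the table for ad-hoc locations, otherwise a
// degenerate range at the location itself.
inline source_range
get_range_from_loc (const toy_line_table &table, source_location loc)
{
  if (is_adhoc_loc (loc))
    return table.adhoc[loc & MAX_SOURCE_LOCATION].src_range;
  source_range r = { loc, loc };
  return r;
}

// Token capture: caret == start, finish == last character of the token.
inline source_location
make_token_location (toy_line_table &table, source_location start,
                     unsigned length)
{
  source_range r = { start, start + length - 1 };
  return combine_location_data (table, start, r, nullptr);
}
```

Every token costs one table entry in this form, which is the
inefficiency described above.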

gcc/ChangeLog:
	* gimple.h (gimple_set_block): Use set_block function.
	* tree-cfg.c (move_block_to_fn): Likewise.
	(move_block_to_fn): Likewise.
	* tree-inline.c (copy_phis_for_bb): Likewise.
	* tree.c (tree_set_block): Likewise.
	(set_block): New function.
	* tree.h (set_block): New declaration.

libcpp/ChangeLog:
	* include/cpplib.h (struct cpp_token): Update comment for src_loc
	to indicate that the range of the token is "baked into" the
	source_location.
	* include/line-map.h (location_adhoc_data): Add source_range
	field.
	(get_combined_adhoc_loc): Add source_range param.
	(get_range_from_adhoc_loc): New declaration.
	(get_range_from_loc): New inline function.
	(COMBINE_LOCATION_DATA):  Add source_range param.
	* lex.c (_cpp_lex_direct): Capture the range of the token, baking
	it into token->src_loc via a call to COMBINE_LOCATION_DATA.
	* line-map.c (location_adhoc_data_hash): Add the src_range into
	the hash value.
	(location_adhoc_data_eq): Require equality of the src_range
	values.
	(get_combined_adhoc_loc): Add src_range param, and store it
	within the lookaside table.  Remove the requirement that data
	is non-NULL.
	(get_range_from_adhoc_loc): New function.
	(linemap_expand_location): Extract the data pointer before
	extracting the location.
---
 gcc/gimple.h              |  6 +-----
 gcc/tree-cfg.c            |  9 ++-------
 gcc/tree-inline.c         |  5 +----
 gcc/tree.c                | 11 +++++++----
 gcc/tree.h                |  3 +++
 libcpp/include/cpplib.h   |  3 ++-
 libcpp/include/line-map.h | 23 ++++++++++++++++++++---
 libcpp/lex.c              | 10 ++++++++++
 libcpp/line-map.c         | 26 ++++++++++++++++++++------
 9 files changed, 66 insertions(+), 30 deletions(-)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index a456f54..ba66931 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1709,11 +1709,7 @@ gimple_block (const gimple *g)
 static inline void
 gimple_set_block (gimple *g, tree block)
 {
-  if (block)
-    g->location =
-	COMBINE_LOCATION_DATA (line_table, g->location, block);
-  else
-    g->location = LOCATION_LOCUS (g->location);
+  g->location = set_block (g->location, block);
 }
 
 
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 735ac46..8ef2443 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6739,10 +6739,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	    continue;
 	  if (d->orig_block == NULL_TREE || block == d->orig_block)
 	    {
-	      if (d->new_block == NULL_TREE)
-		locus = LOCATION_LOCUS (locus);
-	      else
-		locus = COMBINE_LOCATION_DATA (line_table, locus, d->new_block);
+	      locus = set_block (locus, d->new_block);
 	      gimple_phi_arg_set_location (phi, i, locus);
 	    }
 	}
@@ -6802,9 +6799,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	tree block = LOCATION_BLOCK (e->goto_locus);
 	if (d->orig_block == NULL_TREE
 	    || block == d->orig_block)
-	  e->goto_locus = d->new_block ?
-	      COMBINE_LOCATION_DATA (line_table, e->goto_locus, d->new_block) :
-	      LOCATION_LOCUS (e->goto_locus);
+	  e->goto_locus = set_block (e->goto_locus, d->new_block);
       }
 }
 
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 9b525f3..58eca90 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2352,10 +2352,7 @@ copy_phis_for_bb (basic_block bb, copy_body_data *id)
 		  tree *n;
 		  n = id->decl_map->get (LOCATION_BLOCK (locus));
 		  gcc_assert (n);
-		  if (*n)
-		    locus = COMBINE_LOCATION_DATA (line_table, locus, *n);
-		  else
-		    locus = LOCATION_LOCUS (locus);
+		  locus = set_block (locus, *n);
 		}
 	      else
 		locus = LOCATION_LOCUS (locus);
diff --git a/gcc/tree.c b/gcc/tree.c
index f78a2c2..426803c 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -11669,10 +11669,7 @@ tree_set_block (tree t, tree b)
 
   if (IS_EXPR_CODE_CLASS (c))
     {
-      if (b)
-	t->exp.locus = COMBINE_LOCATION_DATA (line_table, t->exp.locus, b);
-      else
-	t->exp.locus = LOCATION_LOCUS (t->exp.locus);
+      t->exp.locus = set_block (t->exp.locus, b);
     }
   else
     gcc_unreachable ();
@@ -13656,5 +13653,11 @@ nonnull_arg_p (const_tree arg)
   return false;
 }
 
+location_t
+set_block (location_t loc, tree block)
+{
+  source_range src_range = get_range_from_loc (line_table, loc);
+  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
+}
 
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 4c803f4..92cc929 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5139,6 +5139,9 @@ type_with_alias_set_p (const_tree t)
   return false;
 }
 
+extern location_t
+set_block (location_t loc, tree block);
+
 extern void gt_ggc_mx (tree &);
 extern void gt_pch_nx (tree &);
 extern void gt_pch_nx (tree &, gt_pointer_operator, void *);
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index a2bdfa0..f5c2a21 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -237,7 +237,8 @@ struct GTY(()) cpp_identifier {
 /* A preprocessing token.  This has been carefully packed and should
    occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
 struct GTY(()) cpp_token {
-  source_location src_loc;	/* Location of first char of token.  */
+  source_location src_loc;	/* Location of first char of token,
+				   together with range of full token.  */
   ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
   unsigned short flags;		/* flags - see above */
 
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 84a5ab7..de1c55c 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -512,9 +512,11 @@ struct GTY(()) maps_info_macro {
   unsigned int cache;
 };
 
-/* Data structure to associate an arbitrary data to a source location.  */
+/* Data structure to associate a source_range together with an arbitrary
+   data pointer with a source location.  */
 struct GTY(()) location_adhoc_data {
   source_location locus;
+  source_range src_range;
   void * GTY((skip)) data;
 };
 
@@ -813,10 +815,14 @@ LINEMAPS_LAST_ALLOCATED_MACRO_MAP (const line_maps *set)
 
 extern void location_adhoc_data_fini (struct line_maps *);
 extern source_location get_combined_adhoc_loc (struct line_maps *,
-					       source_location, void *);
+					       source_location,
+					       source_range,
+					       void *);
 extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
 extern source_location get_location_from_adhoc_loc (struct line_maps *,
 						    source_location);
+extern source_range get_range_from_adhoc_loc (struct line_maps *,
+					      source_location);
 
 /* Get whether location LOC is an ad-hoc location.  */
 
@@ -826,14 +832,25 @@ IS_ADHOC_LOC (source_location loc)
   return (loc & MAX_SOURCE_LOCATION) != loc;
 }
 
+inline source_range
+get_range_from_loc (struct line_maps *set,
+		    source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (set, loc);
+  else
+    return source_range::from_location (loc);
+}
+
 /* Combine LOC and BLOCK, giving a combined adhoc location.  */
 
 inline source_location
 COMBINE_LOCATION_DATA (struct line_maps *set,
 		       source_location loc,
+		       source_range src_range,
 		       void *block)
 {
-  return get_combined_adhoc_loc (set, loc, block);
+  return get_combined_adhoc_loc (set, loc, src_range, block);
 }
 
 extern void rebuild_location_adhoc_htab (struct line_maps *);
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 0aa1090..f4c964f 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -2723,6 +2723,16 @@ _cpp_lex_direct (cpp_reader *pfile)
       break;
     }
 
+  source_range tok_range;
+  tok_range.m_start = result->src_loc;
+  tok_range.m_finish =
+    linemap_position_for_column (pfile->line_table,
+				 CPP_BUF_COLUMN (buffer, buffer->cur));
+
+  result->src_loc = COMBINE_LOCATION_DATA (pfile->line_table,
+					   result->src_loc,
+					   tok_range, NULL);
+
   return result;
 }
 
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3c19f93..3810c88 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -69,7 +69,10 @@ location_adhoc_data_hash (const void *l)
 {
   const struct location_adhoc_data *lb =
       (const struct location_adhoc_data *) l;
-  return (hashval_t) lb->locus + (size_t) lb->data;
+  return ((hashval_t) lb->locus
+	  + (hashval_t) lb->src_range.m_start
+	  + (hashval_t) lb->src_range.m_finish
+	  + (size_t) lb->data);
 }
 
 /* Compare function for location_adhoc_data hashtable.  */
@@ -81,7 +84,10 @@ location_adhoc_data_eq (const void *l1, const void *l2)
       (const struct location_adhoc_data *) l1;
   const struct location_adhoc_data *lb2 =
       (const struct location_adhoc_data *) l2;
-  return lb1->locus == lb2->locus && lb1->data == lb2->data;
+  return (lb1->locus == lb2->locus
+	  && lb1->src_range.m_start == lb2->src_range.m_start
+	  && lb1->src_range.m_finish == lb2->src_range.m_finish
+	  && lb1->data == lb2->data);
 }
 
 /* Update the hashtable when location_adhoc_data is reallocated.  */
@@ -110,19 +116,20 @@ rebuild_location_adhoc_htab (struct line_maps *set)
 
 source_location
 get_combined_adhoc_loc (struct line_maps *set,
-			source_location locus, void *data)
+			source_location locus,
+			source_range src_range,
+			void *data)
 {
   struct location_adhoc_data lb;
   struct location_adhoc_data **slot;
 
-  linemap_assert (data);
-
   if (IS_ADHOC_LOC (locus))
     locus
       = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
     return 0;
   lb.locus = locus;
+  lb.src_range = src_range;
   lb.data = data;
   slot = (struct location_adhoc_data **)
       htab_find_slot (set->location_adhoc_data_map.htab, &lb, INSERT);
@@ -177,6 +184,13 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 }
 
+source_range
+get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
+{
+  linemap_assert (IS_ADHOC_LOC (loc));
+  return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
+}
+
 /* Finalize the location_adhoc_data structure.  */
 void
 location_adhoc_data_fini (struct line_maps *set)
@@ -1478,9 +1492,9 @@ linemap_expand_location (struct line_maps *set,
   memset (&xloc, 0, sizeof (xloc));
   if (IS_ADHOC_LOC (loc))
     {
-      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
       xloc.data
 	= set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].data;
+      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
     }
 
   if (loc < RESERVED_LOCATION_COUNT)
-- 
1.8.5.3


* [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2)
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (7 preceding siblings ...)
  2015-10-23 20:26   ` [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics David Malcolm
@ 2015-10-23 20:26   ` David Malcolm
  2015-10-27 21:32     ` Jeff Law
  2015-10-23 20:29   ` [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges David Malcolm
  2015-10-23 21:25   ` [PATCH 00/10] Overhaul of diagnostics (v5) Jeff Law
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

This patch adds a test plugin that recurses down an expression tree,
printing diagnostics showing the ranges of each node in the tree.

It corresponds to:
  [PATCH 15/22] Add plugin to recursively dump the source-ranges in a tree
    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00741.html
from v1 of the patch kit.

Changes in v2:
  * the output no longer contains the PARAM_DECL and INTEGER_CST
    leaves since we no longer have range data for them; updated
    the expected output accordingly.
  * slightly updated to eliminate use of SOURCE_RANGE

Updated screenshot:
  https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-show-trees-1.html

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic_plugin_show_trees.c and diagnostic-test-show-trees-1.c.
---
 .../gcc.dg/plugin/diagnostic-test-show-trees-1.c   |  65 ++++++++
 .../gcc.dg/plugin/diagnostic_plugin_show_trees.c   | 174 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   2 +
 3 files changed, 241 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c

diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
new file mode 100644
index 0000000..7473a07
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is an example file for use with
+   diagnostic_plugin_show_trees.c.
+
+   The plugin handles "__show_tree" by recursively dumping
+   the internal structure of the second input argument.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __show_tree (int dummy, ...);
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __show_tree (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a));
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+            ^~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~~~~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      / (2 * a));
+        ~~~^~~~
+   { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
new file mode 100644
index 0000000..5a911c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
@@ -0,0 +1,174 @@
+/* This plugin recursively dumps the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+
+/*
+  Hack: fails with linker error:
+./diagnostic_plugin_show_trees.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
+  since nothing in the tree is using gcc_rich_location::add_expr yet.
+
+  I've tried various workarounds (adding DEBUG_FUNCTION to the
+  method, taking its address), but can't seem to fix it that way.
+  So as a nasty workaround, the following material is copied&pasted
+  from gcc-rich-location.c: */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR. */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* FIXME: end of material taken from gcc-rich-location.c */
+
+int plugin_is_GPL_compatible;
+
+static void
+show_tree (tree node)
+{
+  if (!CAN_HAVE_RANGE_P (node))
+    return;
+
+  gcc_rich_location richloc (EXPR_LOCATION (node));
+  richloc.add_expr (node);
+
+  if (richloc.get_num_locations () < 2)
+    {
+      error_at_rich_loc (&richloc, "range not found");
+      return;
+    }
+
+  enum tree_code code = TREE_CODE (node);
+
+  location_range *range = richloc.get_range (1);
+  inform_at_rich_loc (&richloc,
+		      "%s at range %i:%i-%i:%i",
+		      get_tree_code_name (code),
+		      range->m_start.line,
+		      range->m_start.column,
+		      range->m_finish.line,
+		      range->m_finish.column);
+
+  /* Recurse.  */
+  int min_idx = 0;
+  int max_idx = TREE_OPERAND_LENGTH (node);
+  switch (code)
+    {
+    case CALL_EXPR:
+      min_idx = 3;
+      break;
+
+    default:
+      break;
+    }
+
+  for (int i = min_idx; i < max_idx; i++)
+    show_tree (TREE_OPERAND (node, i));
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__show_tree"))
+    return NULL_TREE;
+
+  /* Get arg 1; print it! */
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+
+  show_tree (arg);
+
+  return NULL_TREE;
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  tree fndecl = (tree)gcc_data;
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index b7efcf5..f1155ee 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -68,6 +68,8 @@ set plugin_test_list [list \
 	  diagnostic-test-show-locus-color.c } \
     { diagnostic_plugin_test_tree_expression_range.c \
 	  diagnostic-test-expressions-1.c } \
+    { diagnostic_plugin_show_trees.c \
+	  diagnostic-test-show-trees-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
-- 
1.8.5.3


* [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (6 preceding siblings ...)
  2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
@ 2015-10-23 20:26   ` David Malcolm
  2015-10-23 21:07     ` Jeff Law
  2015-10-23 20:26   ` [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
                     ` (2 subsequent siblings)
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

The stats on line-table memory usage emitted via -fmem-report
from input.c's dump_line_table_statistics don't include
information on the ad-hoc data table.

This patch adds lines like this:
 Ad-hoc table size:                     192k
 Ad-hoc table entries used:            4336
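
For reference, a rough standalone sketch of how such lines could be
computed: the table size is the number of allocated slots times the
entry size, and the count is the number of slots in use.  The
toy_adhoc_map type and the scale/stat_label helpers are hypothetical;
their thresholds are an assumption modelled on how -fmem-report scales
its output, not copied from input.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-ins for the ad-hoc table and its entries.
struct toy_adhoc_entry
{
  unsigned locus;
  void *data;
};

struct toy_adhoc_map
{
  std::size_t allocated;  // slots reserved
  std::size_t curr_loc;   // slots in use
};

// Bytes reserved for the table: allocated slots times the entry size.
inline long
adhoc_table_size (const toy_adhoc_map &m)
{
  return (long) (m.allocated * sizeof (toy_adhoc_entry));
}

// Scale a byte count for display (assumed thresholds: 10k and 10M).
inline long
scale (long x)
{
  if (x < 1024 * 10)
    return x;
  if (x < 1024 * 1024 * 10)
    return x / 1024;
  return x / (1024 * 1024);
}

// Unit suffix matching scale ().
inline char
stat_label (long x)
{
  if (x < 1024 * 10)
    return ' ';
  if (x < 1024 * 1024 * 10)
    return 'k';
  return 'M';
}

// Print the two statistics lines to OUT.
inline void
dump_adhoc_stats (const toy_adhoc_map &m, std::FILE *out)
{
  long size = adhoc_table_size (m);
  std::fprintf (out, "Ad-hoc table size:                   %5ld%c\n",
                scale (size), stat_label (size));
  std::fprintf (out, "Ad-hoc table entries used:           %5ld\n",
                (long) m.curr_loc);
}
```

Under these assumed thresholds a 196608-byte table is reported as 192k,
in line with the example above.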

OK for trunk?

gcc/ChangeLog:
	* input.c (dump_line_table_statistics): Dump stats on adhoc table.

libcpp/ChangeLog:
	* include/line-map.h (struct linemap_stats): Add fields
	"adhoc_table_size" and "adhoc_table_entries_used".
	* line-map.c (linemap_get_statistics): Populate above fields.
---
 gcc/input.c               | 6 ++++++
 libcpp/include/line-map.h | 2 ++
 libcpp/line-map.c         | 3 +++
 3 files changed, 11 insertions(+)

diff --git a/gcc/input.c b/gcc/input.c
index e7302a4..ff80dd9 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -866,6 +866,12 @@ dump_line_table_statistics (void)
   fprintf (stderr, "Total used maps size:                %5ld%c\n",
            SCALE (total_used_map_size),
            STAT_LABEL (total_used_map_size));
+  fprintf (stderr, "Ad-hoc table size:                   %5ld%c\n",
+	   SCALE (s.adhoc_table_size),
+	   STAT_LABEL (s.adhoc_table_size));
+  fprintf (stderr, "Ad-hoc table entries used:           %5ld\n",
+	   s.adhoc_table_entries_used);
+
   fprintf (stderr, "\n");
 }
 
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 30bad87..09378f9 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -1143,6 +1143,8 @@ struct linemap_stats
   long macro_maps_used_size;
   long macro_maps_locations_size;
   long duplicated_macro_maps_locations_size;
+  long adhoc_table_size;
+  long adhoc_table_entries_used;
 };
 
 /* Return the highest location emitted for a given file for which
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3d82e9b..84403de 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1712,6 +1712,9 @@ linemap_get_statistics (struct line_maps *set,
   s->macro_maps_used_size = macro_maps_used_size;
   s->duplicated_macro_maps_locations_size =
     duplicated_macro_maps_locations_size;
+  s->adhoc_table_size = (set->location_adhoc_data_map.allocated
+			 * sizeof (struct location_adhoc_data));
+  s->adhoc_table_entries_used = set->location_adhoc_data_map.curr_loc;
 }
 
 
-- 
1.8.5.3


* [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (8 preceding siblings ...)
  2015-10-23 20:26   ` [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
@ 2015-10-23 20:29   ` David Malcolm
  2015-10-27 22:15     ` Jeff Law
  2015-10-23 21:25   ` [PATCH 00/10] Overhaul of diagnostics (v5) Jeff Law
  10 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-23 20:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: David Malcolm

Some diagnostics, e.g. -Wmaybe-uninitialized, weren't showing
underlines, despite being provided with range-based data.

Debugging showed that the pertinent location was an
ad-hoc location with a range:

  (gdb) p /x loc
  $9 = 0x8000002a

  (gdb) p line_table->location_adhoc_data_map.data[0x2a]
  $10 = {locus = 6919936, src_range = {m_start = 6919936,
         m_finish = 6921216}, data = 0x7ffff19a8480}

  (gdb) call inform (loc, "foo")
  test.c: In function 'test':
  test.c:173:10: note: foo
  return result;
         ^~~~~~

but the result from linemap_resolve_location here:

  location = linemap_resolve_location (line_table, location,
                                       LRK_SPELLING_LOCATION,
                                       NULL);

was stripping away the ad-hoc location to just the locus:

  Value returned is $11 = 6919936

at the front of the token, thus losing the underline.

The fix is to rework linemap_resolve_location to avoid bypassing
ad-hoc locations, so that range data is available later.
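
The difference can be illustrated with a toy model using the values
from the gdb session above.  The two resolve_* functions are
hypothetical simplifications, not the libcpp code: the sketch assumes
that on the ordinary (non-macro) path the reworked function inspects
the locus for its checks but hands the ad-hoc value back unchanged, and
that a plain location has a degenerate range at its caret.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint32_t source_location;

struct source_range
{
  source_location m_start;
  source_location m_finish;
};

// Hypothetical, cut-down table entry (no data pointer).
struct adhoc_entry
{
  source_location locus;
  source_range src_range;
};

const source_location MAX_SOURCE_LOCATION = 0x7FFFFFFF;

inline bool
is_adhoc_loc (source_location loc)
{
  return (loc & MAX_SOURCE_LOCATION) != loc;
}

// Old behaviour: resolve straight to the locus, discarding the range.
inline source_location
resolve_stripping (const std::vector<adhoc_entry> &table,
                   source_location loc)
{
  if (is_adhoc_loc (loc))
    loc = table[loc & MAX_SOURCE_LOCATION].locus;
  return loc;
}

// Reworked behaviour as assumed here: use the locus for any checks, but
// return the ad-hoc value so the range is still reachable.
inline source_location
resolve_retaining (const std::vector<adhoc_entry> &table,
                   source_location loc)
{
  source_location locus = loc;
  if (is_adhoc_loc (loc))
    locus = table[loc & MAX_SOURCE_LOCATION].locus;
  (void) locus;  // the real function would test this, e.g. for reserved values
  return loc;
}

// Range lookup; a plain location only knows about its caret.
inline source_range
range_of (const std::vector<adhoc_entry> &table, source_location loc)
{
  if (is_adhoc_loc (loc))
    return table[loc & MAX_SOURCE_LOCATION].src_range;
  source_range r = { loc, loc };
  return r;
}
```

With entry 0x2a holding locus 6919936 and range 6919936-6921216,
stripping 0x8000002a gives 6919936, whose range collapses to a single
point, while retaining it keeps the finish at 6921216.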

gcc/testsuite/ChangeLog:
	* gcc.dg/diagnostic-tree-expr-ranges-2.c: New file.

libcpp/ChangeLog:
	* line-map.c (linemap_position_for_loc_and_offset): Handle
	ad-hoc locations.
	(linemap_macro_map_loc_unwind_toward_spelling): Add line_maps
	param.  Handle ad-hoc locations.
	(linemap_location_in_system_header_p): Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_macro_loc_to_spelling_point): Retain ad-hoc locations.
	Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_resolve_location): Retain ad-hoc locations.  Pass on
	"set" to call to linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_unwind_toward_expansion):  Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
---
 .../gcc.dg/diagnostic-tree-expr-ranges-2.c         | 23 ++++++++++++++++++
 libcpp/line-map.c                                  | 28 ++++++++++++----------
 2 files changed, 39 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c

diff --git a/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c b/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
new file mode 100644
index 0000000..302e233
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized -fdiagnostics-show-caret" } */
+
+int test_uninit_1 (void)
+{
+  int result;
+  return result;  /* { dg-warning "uninitialized" } */
+/* { dg-begin-multiline-output "" }
+   return result;
+          ^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_uninit_2 (void)
+{
+  int result;
+  result += 3; /* { dg-warning "uninitialized" } */
+/* { dg-begin-multiline-output "" }
+   result += 3;
+   ~~~~~~~^~~~
+   { dg-end-multiline-output "" } */
+  return result;
+}
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 2cbd56a..6385fdf 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -46,7 +46,7 @@ static const line_map_macro* linemap_macro_map_lookup (struct line_maps *,
 static source_location linemap_macro_map_loc_to_def_point
 (const line_map_macro *, source_location);
 static source_location linemap_macro_map_loc_unwind_toward_spelling
-(const line_map_macro *, source_location);
+(line_maps *set, const line_map_macro *, source_location);
 static source_location linemap_macro_map_loc_to_exp_point
 (const line_map_macro *, source_location);
 static source_location linemap_macro_loc_to_spelling_point
@@ -687,6 +687,9 @@ linemap_position_for_loc_and_offset (struct line_maps *set,
 {
   const line_map_ordinary * map = NULL;
 
+  if (IS_ADHOC_LOC (loc))
+    loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+
   /* This function does not support virtual locations yet.  */
   if (linemap_assert_fails
       (!linemap_location_from_macro_expansion_p (set, loc)))
@@ -907,14 +910,19 @@ linemap_macro_map_loc_to_def_point (const line_map_macro *map,
    In other words, this returns the xI location presented in the
    comments of line_map_macro above.  */
 source_location
-linemap_macro_map_loc_unwind_toward_spelling (const line_map_macro* map,
+linemap_macro_map_loc_unwind_toward_spelling (line_maps *set,
+					      const line_map_macro* map,
 					      source_location location)
 {
   unsigned token_no;
 
+  if (IS_ADHOC_LOC (location))
+    location = get_location_from_adhoc_loc (set, location);
+
   linemap_assert (linemap_macro_expansion_map_p (map)
 		  && location >= MAP_START_LOCATION (map));
   linemap_assert (location >= RESERVED_LOCATION_COUNT);
+  linemap_assert (!IS_ADHOC_LOC (location));
 
   token_no = location - MAP_START_LOCATION (map);
   linemap_assert (token_no < MACRO_MAP_NUM_MACRO_TOKENS (map));
@@ -1024,7 +1032,7 @@ linemap_location_in_system_header_p (struct line_maps *set,
 
 	      /* It's a token resulting from a macro expansion.  */
 	      source_location loc =
-		linemap_macro_map_loc_unwind_toward_spelling (macro_map, location);
+		linemap_macro_map_loc_unwind_toward_spelling (set, macro_map, location);
 	      if (loc < RESERVED_LOCATION_COUNT)
 		/* This token might come from a built-in macro.  Let's
 		   look at where that macro got expanded.  */
@@ -1197,11 +1205,6 @@ linemap_macro_loc_to_spelling_point (struct line_maps *set,
 				     const line_map_ordinary **original_map)
 {
   struct line_map *map;
-
-  if (IS_ADHOC_LOC (location))
-    location = set->location_adhoc_data_map.data[location
-						 & MAX_SOURCE_LOCATION].locus;
-
   linemap_assert (set && location >= RESERVED_LOCATION_COUNT);
 
   while (true)
@@ -1212,7 +1215,7 @@ linemap_macro_loc_to_spelling_point (struct line_maps *set,
 
       location
 	= linemap_macro_map_loc_unwind_toward_spelling
-	    (linemap_check_macro (map),
+	    (set, linemap_check_macro (map),
 	     location);
     }
 
@@ -1355,10 +1358,11 @@ linemap_resolve_location (struct line_maps *set,
 			  enum location_resolution_kind lrk,
 			  const line_map_ordinary **map)
 {
+  source_location locus = loc;
   if (IS_ADHOC_LOC (loc))
-    loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+    locus = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 
-  if (loc < RESERVED_LOCATION_COUNT)
+  if (locus < RESERVED_LOCATION_COUNT)
     {
       /* A reserved location wasn't encoded in a map.  Let's return a
 	 NULL map here, just like what linemap_ordinary_map_lookup
@@ -1410,7 +1414,7 @@ linemap_unwind_toward_expansion (struct line_maps *set,
     loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 
   resolved_location =
-    linemap_macro_map_loc_unwind_toward_spelling (macro_map, loc);
+    linemap_macro_map_loc_unwind_toward_spelling (set, macro_map, loc);
   resolved_map = linemap_lookup (set, resolved_location);
 
   if (!linemap_macro_expansion_map_p (resolved_map))
-- 
1.8.5.3

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 01/10] Improvements to description of source_location in line-map.h
  2015-10-23 20:24   ` [PATCH 01/10] Improvements to description of source_location in line-map.h David Malcolm
@ 2015-10-23 21:02     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-23 21:02 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> libcpp/ChangeLog:
> 	* include/line-map.h (source_location): In the table in the
> 	descriptive comment, show UNKNOWN_LOCATION, BUILTINS_LOCATION,
> 	LINE_MAP_MAX_LOCATION_WITH_COLS, LINE_MAP_MAX_SOURCE_LOCATION.
> 	Add notes about ad-hoc values.
If this is documenting current state, then it's OK to go in now.  If 
it's documenting future state, then it can go in with the rest of the 
patches.

jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics
  2015-10-23 20:26   ` [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics David Malcolm
@ 2015-10-23 21:07     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-23 21:07 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> The stats on line-table memory usage emitted via -fmem-report
> from input.c's dump_line_table_statistics don't include
> information on the ad-hoc data table.
>
> This patch adds lines like this:
>   Ad-hoc table size:                     192k
>   Ad-hoc table entries used:            4336
>
> OK for trunk?
>
> gcc/ChangeLog:
> 	* input.c (dump_line_table_statistics): Dump stats on adhoc table.
>
> libcpp/ChangeLog:
> 	* include/line-map.h (struct linemap_stats): Add fields
> 	"adhoc_table_size" and "adhoc_table_entries_used".
> 	* line-map.c (linemap_get_statistics): Populate above fields.
OK.
jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite
  2015-10-23 20:24   ` [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite David Malcolm
@ 2015-10-23 21:10     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-23 21:10 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> Later on in this patch kit, with token range underlining, the
> libstdc++v3 testsuite starts showing numerous failures of the form:
>
>    FAIL: 17_intro/using_namespace_std_tr1_neg.cc (test for excess errors)
>
> The excess errors turn out to be the source code and
> caret/underlines emitted after an "error":
>
>    using namespace std::tr1;  // { dg-error "is not a namespace-name" }
>                    ^~~
>
> However, looking at the results of a control build of r228618, I see
> the testsuite emit code and carets (albeit without underlines):
>
>    using namespace std::tr1;  // { dg-error "is not a namespace-name" }
>                    ^
>
> and for some reason this is treated by dg.exp as:
>
>    PASS: 17_intro/using_namespace_std_tr1_neg.cc (test for excess errors)
>
> It's not clear to me why the status quo isn't treating the lines of
> dumped source code and caret as "excess errors", but the attached
> patch explicitly disables carets and colorization.
>
> libstdc++-v3/ChangeLog:
> 	* testsuite/lib/libstdc++.exp (v3_target_compile): Add
> 	-fno-diagnostics-show-caret -fdiagnostics-color=never to
> 	option's additional_flags.
I'd feel better knowing what was going on with the testing framework. 
But I can live with this.
jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 00/10] Overhaul of diagnostics (v5)
  2015-10-23 21:25   ` [PATCH 00/10] Overhaul of diagnostics (v5) Jeff Law
@ 2015-10-23 21:25     ` David Malcolm
  0 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-23 21:25 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Fri, 2015-10-23 at 15:13 -0600, Jeff Law wrote:
> On 10/23/2015 02:41 PM, David Malcolm wrote:
> > This is a followup to:
> >    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01696.html
> > (one of the individual patches has seen iteration since that, so
> > I'm calling the whole thing "v5" for the sake of clarity).
> >
> >
> > Patches 1-3 are a preamble:
> >    "Improvements to description of source_location in line-map.h"
> >    "Add stats on adhoc table to dump_line_table_statistics"
> >    "libstdc++v3: Explicitly disable carets and colorization within
> >      testsuite"
> >
> > Patch 4:
> >   "Reimplement diagnostic_show_locus, introducing rich_location classes (v5)"
> > is an updated version of the rewrite of diagnostic_show_locus,
> > via the new rich_location class.  I believe this one is ready for trunk
> > and could be applied without needing the followup patches; I
> > have a followup patch that adds support for "fix it hints" on top
> > of this (PR/62314).
> >
> > Patch 5:
> >    "Add ranges to libcpp tokens (via ad-hoc data, unoptimized)"
> > implements token range tracking by adding range information to
> > the ad-hoc location table.  As noted in the patch, this generalizes
> > source_location (aka location_t) to be both a caret and a range,
> > letting us track them through our existing location-tracking
> > mechanisms, without having to add extra fields to core data structures.
> > The drawback is that it's inefficient. This is addressed by patch 10,
> > which implements a packing scheme to avoid the ad-hoc table for most
> > tokens.
> >
> > Patch 6:
> >    "Track expression ranges in C frontend"
> > is an updated version of the patch to add tracking of expression
> > ranges to the C frontend, using the above mechanism.
> >
> > Patch 7:
> >    "Add plugin to recursively dump the source-ranges in a tree (v2)"
> > is the test plugin to demo dumping the ranges for all
> > sub-expressions of a complicated expression.  It's unchanged since
> > previous versions.
> >
> > Patch 8:
> >    "Wire things up so that libcpp users get token underlines"
> > wires up the work from patches 4 and 5 so that most diagnostics
> > in frontends using libcpp will see some kind of underlining, for tokens
> > at least.
> >
> > Patch 9:
> >    "Delay some resolution of ad-hoc locations, preserving ranges"
> > tweaks things to provide underlines for some places that patch 8
> > missed.
> >
> > Patch 10:
> >    "Compress short ranges into source_location"
> > is the bit-packing optimization for patch 5.
> So was there a final resolution with PCH?  IIRC from our meeting PCH 
> blew things up by filling one of the tables.

I believe you're referring to a case I mentioned where I used 8 bits for
the bit-packed ranges, and this led to running out of bits for storing
columns in a C++ source file after loading the PCH for the stdlib
(reaching LINE_MAP_MAX_LOCATION_WITH_COLS before the user's source file
had started, iirc).

The patch kit now uses 5 bits for the bit-packed ranges, so there's 3
more bits for columns than in that case = 8 times more room.
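
The factor of 8 follows from the bit budget alone.  As a rough sketch
(not code from the patch kit; the helper names and the figure of 31
usable bits are assumptions for illustration), every bit taken away
from the packed range doubles the number of values left over for
encoding lines and columns:

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical helper: how many distinct values remain for encoding
   line/column positions when RANGE_BITS low bits of a location with
   USABLE_BITS usable bits are reserved for a packed range.  */
inline std::uint64_t
toy_positions_available (unsigned usable_bits, unsigned range_bits)
{
  assert (range_bits <= usable_bits && usable_bits < 64);
  return std::uint64_t (1) << (usable_bits - range_bits);
}

/* Hypothetical helper: how much more room there is when the range
   budget shrinks from OLD_RANGE_BITS to NEW_RANGE_BITS.  */
inline std::uint64_t
toy_room_ratio (unsigned usable_bits, unsigned old_range_bits,
		unsigned new_range_bits)
{
  return toy_positions_available (usable_bits, new_range_bits)
	 / toy_positions_available (usable_bits, old_range_bits);
}
```

Under those assumptions toy_room_ratio (31, 8, 5) evaluates to 8; the
ratio does not depend on the total width chosen.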


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 00/10] Overhaul of diagnostics (v5)
  2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
                     ` (9 preceding siblings ...)
  2015-10-23 20:29   ` [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges David Malcolm
@ 2015-10-23 21:25   ` Jeff Law
  2015-10-23 21:25     ` David Malcolm
  10 siblings, 1 reply; 83+ messages in thread
From: Jeff Law @ 2015-10-23 21:25 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> This is a followup to:
>    https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01696.html
> (one of the individual patches has seen iteration since that, so
> I'm calling the whole thing "v5" for the sake of clarity).
>
>
> Patches 1-3 are a preamble:
>    "Improvements to description of source_location in line-map.h"
>    "Add stats on adhoc table to dump_line_table_statistics"
>    "libstdc++v3: Explicitly disable carets and colorization within
>      testsuite"
>
> Patch 4:
>   "Reimplement diagnostic_show_locus, introducing rich_location classes (v5)"
> is an updated version of the rewrite of diagnostic_show_locus,
> via the new rich_location class.  I believe this one is ready for trunk
> and could be applied without needing the followup patches; I
> have a followup patch that adds support for "fix it hints" on top
> of this (PR/62314).
>
> Patch 5:
>    "Add ranges to libcpp tokens (via ad-hoc data, unoptimized)"
> implements token range tracking by adding range information to
> the ad-hoc location table.  As noted in the patch, this generalizes
> source_location (aka location_t) to be both a caret and a range,
> letting us track them through our existing location-tracking
> mechanisms, without having to add extra fields to core data structures.
> The drawback is that it's inefficient. This is addressed by patch 10,
> which implements a packing scheme to avoid the ad-hoc table for most
> tokens.
>
> Patch 6:
>    "Track expression ranges in C frontend"
> is an updated version of the patch to add tracking of expression
> ranges to the C frontend, using the above mechanism.
>
> Patch 7:
>    "Add plugin to recursively dump the source-ranges in a tree (v2)"
> is the test plugin to demo dumping the ranges for all
> sub-expressions of a complicated expression.  It's unchanged since
> previous versions.
>
> Patch 8:
>    "Wire things up so that libcpp users get token underlines"
> wires up the work from patches 4 and 5 so that most diagnostics
> in frontends using libcpp will see some kind of underlining, for tokens
> at least.
>
> Patch 9:
>    "Delay some resolution of ad-hoc locations, preserving ranges"
> tweaks things to provide underlines for some places that patch 8
> missed.
>
> Patch 10:
>    "Compress short ranges into source_location"
> is the bit-packing optimization for patch 5.
So was there a final resolution with PCH?  IIRC from our meeting PCH 
blew things up by filling one of the tables.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized)
  2015-10-23 20:26   ` [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized) David Malcolm
@ 2015-10-27 21:29     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-27 21:29 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> This patch:
>
>    - generalizes the meaning of the source_location (aka location_t) type
>      from being a caret within the source code to being a caret plus
>      a source_range.  The latter data is stored purely in the ad-hoc data
>      lookaside for now.
>
>    - captures ranges for libcpp tokens by generating source_location
>      values with caret == start and finish == the last character of the
>      token
>
> This is elegant, since we can store caret+range data in location_t as
> before, without having to track ranges everywhere: all location_t
> values (such as input_location) become the ranges of tokens; a followup
> patch fixes the diagnostic machinery to automatically extract the
> ranges when building rich_location instances, which means that all
> calls to warning, warning_at, error, error_at etc get underlines
> showing the range of the pertinent token "for free".
>
> However, it's inefficient, since it means generating an ad-hoc location
> for every token in libcpp.  A followup patch optimizes this by packing
> short ranges into location_t, without needing to use ad-hoc locations,
> which covers most tokens efficiently.
>
> gcc/ChangeLog:
> 	* gimple.h (gimple_set_block): Use set_block function.
> 	* tree-cfg.c (move_block_to_fn): Likewise.
> 	(move_block_to_fn): Likewise.
> 	* tree-inline.c (copy_phis_for_bb): Likewise.
> 	* tree.c (tree_set_block): Likewise.
> 	(set_block): New function.
> 	* tree.h (set_block): New declaration.
>
> libcpp/ChangeLog:
> 	* include/cpplib.h (struct cpp_token): Update comment for src_loc
> 	to indicate that the range of the token is "baked into" the
> 	source_location.
> 	* include/line-map.h (location_adhoc_data): Add source_range
> 	field.
> 	(get_combined_adhoc_loc): Add source_range param.
> 	(get_range_from_adhoc_loc): New declaration.
> 	(get_range_from_loc): New inline function.
> 	(COMBINE_LOCATION_DATA):  Add source_range param.
> 	* lex.c (_cpp_lex_direct): Capture the range of the token, baking
> 	it into token->src_loc via a call to COMBINE_LOCATION_DATA.
> 	* line-map.c (location_adhoc_data_hash): Add the src_range into
> 	the hash value.
> 	(location_adhoc_data_eq): Require equality of the src_range
> 	values.
> 	(get_combined_adhoc_loc): Add src_range param, and store it
> 	within the lookaside table.  Remove the requirement that data
> 	is non-NULL.
> 	(get_range_from_adhoc_loc): New function.
> 	(linemap_expand_location): Extract the data pointer before
> 	extracting the location.
> ---
>   gcc/gimple.h              |  6 +-----
>   gcc/tree-cfg.c            |  9 ++-------
>   gcc/tree-inline.c         |  5 +----
>   gcc/tree.c                | 11 +++++++----
>   gcc/tree.h                |  3 +++
>   libcpp/include/cpplib.h   |  3 ++-
>   libcpp/include/line-map.h | 23 ++++++++++++++++++++---
>   libcpp/lex.c              | 10 ++++++++++
>   libcpp/line-map.c         | 26 ++++++++++++++++++++------
>   9 files changed, 66 insertions(+), 30 deletions(-)
>
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index a456f54..ba66931 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -13656,5 +13653,11 @@ nonnull_arg_p (const_tree arg)
>     return false;
>   }
>
> +location_t
> +set_block (location_t loc, tree block)
> +{
> +  source_range src_range = get_range_from_loc (line_table, loc);
> +  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
> +}
Needs a function comment.


>
>   #include "gt-tree.h"
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 4c803f4..92cc929 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -5139,6 +5139,9 @@ type_with_alias_set_p (const_tree t)
>     return false;
>   }
>
> +extern location_t
> +set_block (location_t loc, tree block);
Single line please.

> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 84a5ab7..de1c55c 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -826,14 +832,25 @@ IS_ADHOC_LOC (source_location loc)
>     return (loc & MAX_SOURCE_LOCATION) != loc;
>   }
>
> +inline source_range
> +get_range_from_loc (struct line_maps *set,
> +		    source_location loc)
> +{
> +  if (IS_ADHOC_LOC (loc))
> +    return get_range_from_adhoc_loc (set, loc);
> +  else
> +    return source_range::from_location (loc);
> +}
Needs a function comment.

> diff --git a/libcpp/lex.c b/libcpp/lex.c
> index 0aa1090..f4c964f 100644
> --- a/libcpp/lex.c
> +++ b/libcpp/lex.c
> @@ -2723,6 +2723,16 @@ _cpp_lex_direct (cpp_reader *pfile)
>         break;
>       }
>
> +  source_range tok_range;
> +  tok_range.m_start = result->src_loc;
> +  tok_range.m_finish =
> +    linemap_position_for_column (pfile->line_table,
> +				 CPP_BUF_COLUMN (buffer, buffer->cur));
Line wrapping is wrong.  Break before the "=".

> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> index 3c19f93..3810c88 100644
> --- a/libcpp/line-map.c
> +++ b/libcpp/line-map.c
> @@ -177,6 +184,13 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
>     return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
>   }
>
> +source_range
> +get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
> +{
> +  linemap_assert (IS_ADHOC_LOC (loc));
> +  return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
> +}
Function comment.
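
For readers following the hunks above, here is a small self-contained
model of the lookaside scheme they rely on.  It is only a sketch: the
toy_* names are hypothetical, the mask value 0x7FFFFFFF is an
assumption, and the non-ad-hoc case is assumed to yield a single-point
range (which is what source_range::from_location appears to do).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint32_t toy_location;	/* Stand-in for source_location.  */

/* Assumed mask: every bit except the top one.  */
const toy_location TOY_MAX_SOURCE_LOCATION = 0x7FFFFFFF;
const toy_location TOY_ADHOC_BIT = 0x80000000;

struct toy_range { toy_location m_start; toy_location m_finish; };
struct toy_adhoc_entry { toy_location locus; toy_range src_range; };
struct toy_line_maps { std::vector<toy_adhoc_entry> adhoc; };

/* Same test as IS_ADHOC_LOC in the quoted hunk: a location is ad-hoc
   if masking changes it, i.e. if the top bit is set.  */
inline bool
toy_is_adhoc (toy_location loc)
{
  return (loc & TOY_MAX_SOURCE_LOCATION) != loc;
}

/* Record LOCUS and R in the table and return an ad-hoc location that
   is the table index with the top bit set.  */
inline toy_location
toy_make_adhoc (toy_line_maps &set, toy_location locus, toy_range r)
{
  set.adhoc.push_back ({locus, r});
  return toy_location (set.adhoc.size () - 1) | TOY_ADHOC_BIT;
}

/* Model of get_range_from_loc: look ad-hoc locations up in the table,
   otherwise treat the location as a one-point range.  */
inline toy_range
toy_get_range_from_loc (const toy_line_maps &set, toy_location loc)
{
  if (toy_is_adhoc (loc))
    {
      toy_location idx = loc & TOY_MAX_SOURCE_LOCATION;
      assert (idx < set.adhoc.size ());
      return set.adhoc[idx].src_range;
    }
  toy_range r = { loc, loc };
  return r;
}
```

The point of the model is that an ordinary location and an ad-hoc one
travel in the same integer type, so callers that only pass locations
around need no changes to carry a range.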


OK with those nits fixed.  However please don't install until the 
prerequisites are in *and* the later optimization bits are approved & 
installed too.

jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2)
  2015-10-23 20:26   ` [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
@ 2015-10-27 21:32     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-27 21:32 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> This patch adds a test plugin that recurses down an expression tree,
> printing diagnostics showing the ranges of each node in the tree.
>
> It corresponds to:
>    [PATCH 15/22] Add plugin to recursively dump the source-ranges in a tree
>      https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00741.html
> from v1 of the patch kit.
>
> Changes in v2:
>    * the output no longer contains the PARAM_DECL and INTEGER_CST
>      leaves since we no longer have range data for them; updated
>      the expected output accordingly.
>    * slightly updated to eliminate use of SOURCE_RANGE
>
> Updated screenshot:
>    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-show-trees-1.html
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
> 	* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
> 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> 	diagnostic_plugin_show_trees.c and diagnostic-test-show-trees-1.c.
OK when all the prereqs are approved.

I'll avoid the request that you check that the headers are in proper 
order & reduced form for the plugin test since it's the testsuite not 
the compiler proper.  :-)

jeff


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges
  2015-10-23 20:29   ` [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges David Malcolm
@ 2015-10-27 22:15     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-27 22:15 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> Some diagnostics e.g. -Wmaybe-uninitialized weren't showing
> underlines, despite being provided with range-based data.
>
> Debugging showed that the pertinent location was an
> ad-hoc location with a range:
>
>    (gdb) p /x loc
>    $9 = 0x8000002a
>
>    (gdb) p line_table->location_adhoc_data_map.data[0x2a]
>    $10 = {locus = 6919936, src_range = {m_start = 6919936,
>           m_finish = 6921216}, data = 0x7ffff19a8480}
>
>    (gdb) call inform (loc, "foo")
>    test.c: In function 'test':
>    test.c:173:10: note: foo
>    return result;
>           ^~~~~~
>
> but the result from linemap_resolve_location here:
>
>    location = linemap_resolve_location (line_table, location,
>                                         LRK_SPELLING_LOCATION,
>                                         NULL);
>
> was stripping away the ad-hoc location to just the locus:
>
>    Value returned is $11 = 6919936
>
> at the front of the token, thus losing the underline.
>
> The fix is to rework linemap_resolve_location to avoid bypassing
> ad-hoc locations, so that range data is available later.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/diagnostic-tree-expr-ranges-2.c: New file.
>
> libcpp/ChangeLog:
> 	* line-map.c (linemap_position_for_loc_and_offset): Handle
> 	ad-hoc locations.
> 	(linemap_macro_map_loc_unwind_toward_spelling): Add line_maps
> 	param.  Handle ad-hoc locations.
> 	(linemap_location_in_system_header_p): Pass on "set" to call to
> 	linemap_macro_map_loc_unwind_toward_spelling.
> 	(linemap_macro_loc_to_spelling_point): Retain ad-hoc locations.
> 	Pass on "set" to call to
> 	linemap_macro_map_loc_unwind_toward_spelling.
> 	(linemap_resolve_location): Retain ad-hoc locations.  Pass on
> 	"set" to call to linemap_macro_map_loc_unwind_toward_spelling.
> 	(linemap_unwind_toward_expansion):  Pass on "set" to call to
> 	linemap_macro_map_loc_unwind_toward_spelling.
This is fine.  Other than the test it seems like it'd be independent of 
the other patches.    If so, you can commit the linemap changes 
independently of the rest.
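
To illustrate why stripping the ad-hoc location early costs the
underline, here is a toy renderer (a sketch only; toy_underline is a
hypothetical helper, not part of diagnostic_show_locus).  It draws the
annotation line for a closed column range, as in the quoted session
where the caret is at column 10 and the underline is six characters
wide:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/* Hypothetical helper: build the annotation line for the closed,
   1-based column range [START_COL, FINISH_COL], putting '^' at
   CARET_COL and '~' at every other column within the range.  */
inline std::string
toy_underline (int start_col, int finish_col, int caret_col)
{
  assert (1 <= start_col && start_col <= finish_col);
  std::string line (static_cast<std::size_t> (finish_col), ' ');
  for (int col = start_col; col <= finish_col; col++)
    line[col - 1] = (col == caret_col) ? '^' : '~';
  return line;
}
```

With the range retained, toy_underline (10, 15, 10) gives nine spaces
followed by "^~~~~~".  Once the location has been reduced to its bare
locus, start and finish coincide, and toy_underline (10, 10, 10) gives
nine spaces followed by a lone "^".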

Jeff
> ---

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5)
  2015-10-23 20:25   ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) David Malcolm
@ 2015-10-27 23:12     ` Jeff Law
  2015-10-28 17:52       ` David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Jeff Law @ 2015-10-27 23:12 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> The change since v4 can be seen at:
>   https://dmalcolm.fedorapeople.org/gcc/2015-10-23/0001-Add-colorize_source_p-to-diagnostic_context.patch
> which is a tweak to colorization, to handle both frontends that provide
> ranges and those that only provide carets, and provide a smoother
> transition path for the latter.
>
> gcc/ChangeLog:
> 	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
> 	and "range2".
> 	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
> 	* diagnostic-core.h (warning_at_rich_loc): New declaration.
> 	(error_at_rich_loc): New declaration.
> 	(permerror_at_rich_loc): New declaration.
> 	(inform_at_rich_loc): New declaration.
> 	* diagnostic-show-locus.c (adjust_line): Delete.
> 	(struct point_state): New struct.
> 	(class colorizer): New class.
> 	(class layout_point): New class.
> 	(class layout_range): New class.
> 	(class layout): New class.
> 	(colorizer::colorizer): New ctor.
> 	(colorizer::~colorizer): New dtor.
> 	(layout::layout): New ctor.
> 	(layout::print_line): New method.
> 	(layout::get_state_at_point): New method.
> 	(layout::get_x_bound_for_row): New method.
> 	(show_ruler): New function.
> 	(diagnostic_show_locus): Reimplement in terms of class layout.
> 	* diagnostic.c (diagnostic_initialize): Replace
> 	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
> 	(diagnostic_set_info_translated): Convert param from location_t
> 	to rich_location *.  Eliminate calls to set_location on the
> 	message in favor of storing the rich_location ptr there.
> 	(diagnostic_set_info): Convert param from location_t to
> 	rich_location *.
> 	(diagnostic_build_prefix): Break out array into...
> 	(diagnostic_kind_color): New variable.
> 	(diagnostic_get_color_for_kind): New function.
> 	(diagnostic_report_diagnostic): Colorize the option_text
> 	using the color for the severity.
> 	(diagnostic_append_note): Update for change in signature of
> 	diagnostic_set_info.
> 	(diagnostic_append_note_at_rich_loc): New function.
> 	(emit_diagnostic): Update for change in signature of
> 	diagnostic_set_info.
> 	(inform): Likewise.
> 	(inform_at_rich_loc): New function.
> 	(inform_n): Update for change in signature of diagnostic_set_info.
> 	(warning): Likewise.
> 	(warning_at): Likewise.
> 	(warning_at_rich_loc): New function.
> 	(warning_n): Update for change in signature of diagnostic_set_info.
> 	(pedwarn): Likewise.
> 	(permerror): Likewise.
> 	(permerror_at_rich_loc): New function.
> 	(error): Update for change in signature of diagnostic_set_info.
> 	(error_n): Likewise.
> 	(error_at): Likewise.
> 	(error_at_rich_loc): New function.
> 	(sorry): Update for change in signature of diagnostic_set_info.
> 	(fatal_error): Likewise.
> 	(internal_error): Likewise.
> 	(internal_error_no_backtrace): Likewise.
> 	(source_range::debug): New function.
> 	* diagnostic.h (struct diagnostic_info): Eliminate field
> 	"override_column".  Add field "richloc".
> 	(struct diagnostic_context): Add field "colorize_source_p".
> 	(diagnostic_override_column): Delete.
> 	(diagnostic_set_info): Convert param from location_t to
> 	rich_location *.
> 	(diagnostic_set_info_translated): Likewise.
> 	(diagnostic_append_note_at_rich_loc): New function.
> 	(diagnostic_num_locations): New function.
> 	(diagnostic_expand_location): Get the location from the
> 	rich_location.
> 	(diagnostic_print_caret_line): Delete.
> 	(diagnostic_get_color_for_kind): New declaration.
> 	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
> 	(error_cb): Update for change in signature of "error" callback.
> 	(fatal_at): Likewise.
> 	(warning_at): Likewise.
> 	* input.c (linemap_client_expand_location_to_spelling_point): New.
> 	* pretty-print.c (text_info::set_range): New method.
> 	(text_info::get_location): New method.
> 	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
> 	(struct text_info): Eliminate "locations" array in favor of
> 	"m_richloc", a rich_location *.
> 	(textinfo::set_location): Add a "caret_p" param, and reimplement
> 	in terms of a call to set_range.
> 	(textinfo::get_location): Eliminate inline implementation in favor of
> 	an out-of-line reimplementation.
> 	(textinfo::set_range): New method.
> 	* rtl-error.c (diagnostic_for_asm): Update for change in signature
> 	of diagnostic_set_info.
> 	* tree-diagnostic.c (default_tree_printer): Update for new
> 	"caret_p" param for textinfo::set_location.
> 	* tree-pretty-print.c (percent_K_format): Likewise.
>
> gcc/c-family/ChangeLog:
> 	* c-common.c (c_cpp_error): Convert parameter from location_t to
> 	rich_location *.  Eliminate the "column_override" parameter and
> 	the call to diagnostic_override_column.
> 	Update the "done_lexing" clause to set range 0
> 	on the rich_location, rather than overwriting a location_t.
> 	* c-common.h (c_cpp_error): Convert parameter from location_t to
> 	rich_location *.  Eliminate the "column_override" parameter.
>
> gcc/c/ChangeLog:
> 	* c-decl.c (warn_defaults_to): Update for change in signature
> 	of diagnostic_set_info.
> 	* c-errors.c (pedwarn_c99): Likewise.
> 	(pedwarn_c90): Likewise.
> 	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
> 	for textinfo::set_location.
>
> gcc/cp/ChangeLog:
> 	* error.c (cp_printer): Update for new "caret_p" param for
> 	textinfo::set_location.
> 	(pedwarn_cxx98): Update for change in signature of
> 	diagnostic_set_info.
>
> gcc/fortran/ChangeLog:
> 	* cpp.c (cb_cpp_error): Convert parameter from location_t to
> 	rich_location *.  Eliminate the "column_override" parameter.
> 	* error.c (gfc_warning): Update for change in signature of
> 	diagnostic_set_info.
> 	(gfc_format_decoder): Update handling of %C/%L for changes
> 	to struct text_info.
> 	(gfc_diagnostic_starter): Use richloc when determining whether to
> 	print one locus or two.  When handling a location that will
> 	involve a call to diagnostic_show_locus, only attempt to print the
> 	locus for the primary location, and don't call into
> 	diagnostic_print_caret_line.
> 	(gfc_warning_now_at): Update for change in signature of
> 	diagnostic_set_info.
> 	(gfc_warning_now): Likewise.
> 	(gfc_error_now): Likewise.
> 	(gfc_fatal_error): Likewise.
> 	(gfc_error): Likewise.
> 	(gfc_internal_error): Likewise.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
> 	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
> 	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
> 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
> 	* lib/gcc-dg.exp: Load multiline.exp.
>
> libcpp/ChangeLog:
> 	* errors.c (cpp_diagnostic): Update for change in signature
> 	of "error" callback.
> 	(cpp_diagnostic_with_line): Likewise, calling override_column
> 	on the rich_location.
> 	* include/cpplib.h (struct cpp_callbacks): Within "error"
> 	callback, convert param from source_location to rich_location *,
> 	and drop column_override param.
> 	* include/line-map.h (struct source_range): New struct.
> 	(struct location_range): New struct.
> 	(class rich_location): New class.
> 	(linemap_client_expand_location_to_spelling_point): New declaration.
> 	* line-map.c (rich_location::rich_location): New ctors.
> 	(rich_location::lazily_expand_location): New method.
> 	(rich_location::override_column): New method.
> 	(rich_location::add_range): New methods.
> 	(rich_location::set_range): New method.
> ---

> diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
> index 147a2b8..6865209 100644
> --- a/gcc/diagnostic-show-locus.c
> +++ b/gcc/diagnostic-show-locus.c
I'm having trouble breaking this down into manageable hunks to look at. 
  Are there bits in here that we can pull out as separate patches?  It 
looks like git diff is just making a mess of this file when I think it's 
a huge chunk of new code and a few deletes.

If you have a blob of new code and a blob of deletes, even breaking it 
down that way may help in this case (ie, a patch with new classes & 
code, then a pass that deletes old crud we're not going to use anymore).






> +
> +void
> +source_range::debug (const char *msg) const
> +{
> +  rich_location richloc (m_start);
> +  richloc.add_range (m_start, m_finish);
> +  inform_at_rich_loc (&richloc, "%s", msg);
> +}
Function comment.  Do you need a DEBUG_FUNCTION annotation here?


> +extern const char *
> +diagnostic_get_color_for_kind (diagnostic_t kind);
If this will fit on one line, then combine the lines.


>
>   /* Pure text formatting support functions.  */
>   extern char *file_name_as_prefix (diagnostic_context *, const char *);

> diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
> index 3825751..4b3d31c 100644
> --- a/gcc/fortran/error.c
> +++ b/gcc/fortran/error.c
I'm having a hard time mapping the code you removed from 
fortran/error.c::gfc_diagnostic_starter to its functional equivalent in 
your new code.  I know we've discussed this issue a few times on the 
phone, so I don't doubt you're handling it.  I just want to know where 
so I can double-check things a bit.




> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> index 102a635..6bfde06 100644
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -53,14 +53,23 @@ unsigned verbose;
>
>   static struct line_maps *line_table;
>
> +expanded_location
> +linemap_client_expand_location_to_spelling_point (source_location loc)
> +{
> +  const struct line_map_ordinary *map;
> +  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
> +  return linemap_expand_location (line_table, map, loc);
> +}
Function comment.


> diff --git a/gcc/input.c b/gcc/input.c
> index ff80dd9..baf8e7e 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
>     return expand_location_1 (loc, /*expansion_point_p=*/false);
>   }
>
> +expanded_location
> +linemap_client_expand_location_to_spelling_point (source_location loc)
> +{
> +  return expand_location_to_spelling_point (loc);
> +}
Likewise.


> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 09378f9..84a5ab7 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -131,6 +131,35 @@ typedef unsigned int linenum_type;
>     libcpp/location-example.txt.  */
>   typedef unsigned int source_location;
>
> +/* A range of source locations.
> +
> +   Ranges are closed:
> +   m_start is the first location within the range,
> +   m_finish is the last location within the range.
> +
> +   We may need a more compact way to store these, but for now,
> +   let's do it the simple way, as a pair.  */
> +struct GTY(()) source_range
> +{
> +  source_location m_start;
> +  source_location m_finish;
> +
> +  void debug (const char *msg) const;
Do you need a DEBUG_FUNCTION annotation here?

> @@ -1028,6 +1057,175 @@ typedef struct
>     bool sysp;
>   } expanded_location;
>
> +class rich_location
[ ... ]

> +
> +  void
> +  add_range (source_location start, source_location finish,
> +	     bool show_caret_p = false);
> +
> +  void
> +  add_range (source_range src_range,
> +	     bool show_caret_p = false);
Do we really want to bother with default arguments?  Is it buying us 
some level of cleanliness that's hard to otherwise achieve?  Given this 
is new code I just don't see the value here.  Educate me.

Jeff


* [PATCH 4a] diagnostic-show-locus.c changes: Deletions
  2015-10-28 17:52       ` David Malcolm
  2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
@ 2015-10-28 17:51         ` David Malcolm
  2015-10-28 17:59         ` [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes David Malcolm
  2015-10-30  4:49         ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) Jeff Law
  3 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-28 17:51 UTC (permalink / raw)
  To: law; +Cc: gcc-patches, David Malcolm

gcc/ChangeLog:
	* diagnostic-show-locus.c (adjust_line): Delete.
	(diagnostic_print_caret_line): Delete.
---
 gcc/diagnostic-show-locus.c | 102 --------------------------------------------
 1 file changed, 102 deletions(-)

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 147a2b8..fdf73de 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,30 +36,6 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
-/* If LINE is longer than MAX_WIDTH, and COLUMN is not smaller than
-   MAX_WIDTH by some margin, then adjust the start of the line such
-   that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
-   margin is either CARET_LINE_MARGIN characters or the difference
-   between the column and the length of the line, whatever is smaller.
-   The length of LINE is given by LINE_WIDTH.  */
-static const char *
-adjust_line (const char *line, int line_width,
-	     int max_width, int *column_p)
-{
-  int right_margin = CARET_LINE_MARGIN;
-  int column = *column_p;
-
-  gcc_checking_assert (line_width >= column);
-  right_margin = MIN (line_width - column, right_margin);
-  right_margin = max_width - right_margin;
-  if (line_width >= max_width && column > right_margin)
-    {
-      line += column - right_margin;
-      *column_p = right_margin;
-    }
-  return line;
-}
-
 /* Print the physical source line corresponding to the location of
    this diagnostic, and a caret indicating the precise column.  This
    function only prints two caret characters if the two locations
@@ -86,81 +62,3 @@ diagnostic_show_locus (diagnostic_context * context,
 			       context->caret_chars[0],
 			       context->caret_chars[1]);
 }
-
-/* Print (part) of the source line given by xloc1 with caret1 pointing
-   at the column.  If xloc2.column != 0 and it fits within the same
-   line as xloc1 according to diagnostic_same_line (), then caret2 is
-   printed at xloc2.colum.  Otherwise, the caller has to set up things
-   to print a second caret line for xloc2.  */
-void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2)
-{
-  if (!diagnostic_same_line (context, xloc1, xloc2))
-    /* This will mean ignore xloc2.  */
-    xloc2.column = 0;
-  else if (xloc1.column == xloc2.column)
-    xloc2.column++;
-
-  int cmax = MAX (xloc1.column, xloc2.column);
-  int line_width;
-  const char *line = location_get_source_line (xloc1.file, xloc1.line,
-					       &line_width);
-  if (line == NULL || cmax > line_width)
-    return;
-
-  /* Center the interesting part of the source line to fit in
-     max_width, and adjust all columns accordingly.  */
-  int max_width = context->caret_max_width;
-  int offset = (int) cmax;
-  line = adjust_line (line, line_width, max_width, &offset);
-  offset -= cmax;
-  cmax += offset;
-  xloc1.column += offset;
-  if (xloc2.column)
-    xloc2.column += offset;
-
-  /* Print the source line.  */
-  pp_newline (context->printer);
-  const char *saved_prefix = pp_get_prefix (context->printer);
-  pp_set_prefix (context->printer, NULL);
-  pp_space (context->printer);
-  while (max_width > 0 && line_width > 0)
-    {
-      char c = *line == '\t' ? ' ' : *line;
-      if (c == '\0')
-	c = ' ';
-      pp_character (context->printer, c);
-      max_width--;
-      line_width--;
-      line++;
-    }
-  pp_newline (context->printer);
-
-  /* Print the caret under the line.  */
-  const char *caret_cs, *caret_ce;
-  caret_cs = colorize_start (pp_show_color (context->printer), "caret");
-  caret_ce = colorize_stop (pp_show_color (context->printer));
-  int cmin = xloc2.column
-    ? MIN (xloc1.column, xloc2.column) : xloc1.column;
-  int caret_min = cmin == xloc1.column ? caret1 : caret2;
-  int caret_max = cmin == xloc1.column ? caret2 : caret1;
-
-  /* cmin is >= 1, but we indent with an extra space at the start like
-     we did above.  */
-  int i;
-  for (i = 0; i < cmin; i++)
-    pp_space (context->printer);
-  pp_printf (context->printer, "%s%c%s", caret_cs, caret_min, caret_ce);
-
-  if (xloc2.column)
-    {
-      for (i++; i < cmax; i++)
-	pp_space (context->printer);
-      pp_printf (context->printer, "%s%c%s", caret_cs, caret_max, caret_ce);
-    }
-  pp_set_prefix (context->printer, saved_prefix);
-  pp_needs_newline (context->printer) = true;
-}
-- 
1.8.5.3
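
For readers who want to check the clipping arithmetic of the deleted
adjust_line in isolation (patch 4b keeps the same arithmetic in the
layout constructor, storing the result in m_x_offset), here is a minimal
standalone sketch.  The function name compute_x_offset and the constant
kCaretLineMargin are hypothetical, and the value 10 for CARET_LINE_MARGIN
is an assumption about diagnostic.h rather than something shown in this
thread.

```cpp
#include <algorithm>
#include <cassert>

// Assumption: stands in for GCC's CARET_LINE_MARGIN; the value 10 is not
// taken from this thread and is hard-coded only so the sketch stands alone.
static const int kCaretLineMargin = 10;

// Return how many leading columns to skip so that COLUMN (1-based) stays
// visible when a line of LINE_WIDTH characters is clipped to MAX_WIDTH.
// Follows the arithmetic of the deleted adjust_line; returns 0 when no
// horizontal scrolling is needed.
int compute_x_offset (int line_width, int max_width, int column)
{
  assert (line_width >= column);
  int right_margin = std::min (line_width - column, kCaretLineMargin);
  right_margin = max_width - right_margin;
  if (line_width >= max_width && column > right_margin)
    return column - right_margin;
  return 0;
}
```

With a 200-column line, an 80-column window and the caret at column 150,
the sketch skips 80 columns, which puts the caret at visible column 70
with 10 columns of context to its right.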


* [PATCH 4b] diagnostic-show-locus.c changes: Insertions
  2015-10-28 17:52       ` David Malcolm
@ 2015-10-28 17:51         ` David Malcolm
  2015-10-30  4:53           ` Jeff Law
  2015-10-28 17:51         ` [PATCH 4a] diagnostic-show-locus.c changes: Deletions David Malcolm
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 83+ messages in thread
From: David Malcolm @ 2015-10-28 17:51 UTC (permalink / raw)
  To: law; +Cc: gcc-patches, David Malcolm

gcc/ChangeLog:
	* diagnostic-show-locus.c (struct point_state): New struct.
	(class colorizer): New class.
	(class layout_point): New class.
	(class layout_range): New class.
	(class layout): New class.
	(colorizer::colorizer): New ctor.
	(colorizer::~colorizer): New dtor.
	(layout::layout): New ctor.
	(layout::print_line): New method.
	(layout::get_state_at_point): New method.
	(layout::get_x_bound_for_row): New method.
	(show_ruler): New function.
	(diagnostic_show_locus): Reimplement in terms of class layout.
---
 gcc/diagnostic-show-locus.c | 708 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 695 insertions(+), 13 deletions(-)
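
The colorizer class in this patch filters out redundant state changes so
that layout::print_line can request a color for every character it
prints.  A miniature of that idea, as a sketch only: mini_colorizer and
render are hypothetical names, and the "<0>" and "</>" markers written
to a std::string stand in for the real escape codes sent to the
pretty_printer.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the patch's colorizer: it appends markers to a
// std::string instead of sending escape codes to a pretty_printer.
class mini_colorizer
{
 public:
  explicit mini_colorizer (std::string *out)
  : m_out (out), m_current_state (STATE_NORMAL_TEXT) {}
  ~mini_colorizer () { finish_state (m_current_state); }

  void set_range (int range_idx) { set_state (range_idx); }
  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }

 private:
  static const int STATE_NORMAL_TEXT = -1;

  // Emit markers only when the requested state differs from the current one.
  void set_state (int new_state)
  {
    if (m_current_state == new_state)
      return;
    finish_state (m_current_state);
    m_current_state = new_state;
    begin_state (new_state);
  }

  void begin_state (int state)
  {
    if (state != STATE_NORMAL_TEXT)
      {
	*m_out += '<';
	*m_out += static_cast<char> ('0' + state);
	*m_out += '>';
      }
  }

  void finish_state (int state)
  {
    if (state != STATE_NORMAL_TEXT)
      *m_out += "</>";
  }

  std::string *m_out;
  int m_current_state;
};

// Emit TEXT one character at a time, requesting range 0 for every index in
// [FIRST, LAST] (0-based, inclusive) and normal text elsewhere.
std::string render (const std::string &text, int first, int last)
{
  std::string out;
  {
    mini_colorizer c (&out);
    for (int i = 0; i < static_cast<int> (text.size ()); i++)
      {
	if (i >= first && i <= last)
	  c.set_range (0);
	else
	  c.set_normal_text ();
	out += text[i];
      }
  } // The destructor closes any range that is still open.
  return out;
}
```

Calling render ("abcdef", 2, 3) requests a state for all six characters
but yields only one opening and one closing marker.  The inner scope
plays the same role as the block around "layout" in
diagnostic_show_locus: it fixes the point at which the destructor turns
colorization off.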

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index fdf73de..6865209 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,11 +36,682 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
-/* Print the physical source line corresponding to the location of
-   this diagnostic, and a caret indicating the precise column.  This
-   function only prints two caret characters if the two locations
-   given by DIAGNOSTIC are on the same line according to
-   diagnostic_same_line().  */
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset);
+
+/* Classes for rendering source code and diagnostics, within an
+   anonymous namespace.
+   The work is done by "class layout", which embeds and uses
+   "class colorizer" and "class layout_range" to get things done.  */
+
+namespace {
+
+/* The state at a given point of the source code, assuming that we're
+   in a range: which range are we in, and whether we should draw a caret at
+   this point.  */
+
+struct point_state
+{
+  int range_idx;
+  bool draw_caret_p;
+};
+
+/* A class to inject colorization codes when printing the diagnostic locus.
+
+   It has one kind of colorization for each of:
+     - normal text
+     - range 0 (the "primary location")
+     - range 1
+     - range 2
+
+   The class caches the lookup of the color codes for the above.
+
+   The class also has responsibility for tracking which of the above is
+   active, filtering out unnecessary changes.  This allows layout::print_line
+   to simply request a colorization code for *every* character it prints
+   through this class, and have the filtering be done for it here.  */
+
+class colorizer
+{
+ public:
+  colorizer (diagnostic_context *context,
+	     const diagnostic_info *diagnostic);
+  ~colorizer ();
+
+  void set_range (int range_idx) { set_state (range_idx); }
+  void set_normal_text () { set_state (STATE_NORMAL_TEXT); }
+
+ private:
+  void set_state (int state);
+  void begin_state (int state);
+  void finish_state (int state);
+
+ private:
+  static const int STATE_NORMAL_TEXT = -1;
+
+  diagnostic_context *m_context;
+  const diagnostic_info *m_diagnostic;
+  int m_current_state;
+  const char *m_caret_cs;
+  const char *m_caret_ce;
+  const char *m_range1_cs;
+  const char *m_range2_cs;
+  const char *m_range_ce;
+};
+
+/* A point within a layout_range; similar to an expanded_location,
+   but after filtering on file.  */
+
+class layout_point
+{
+ public:
+  layout_point (const expanded_location &exploc)
+  : m_line (exploc.line),
+    m_column (exploc.column) {}
+
+  int m_line;
+  int m_column;
+};
+
+/* A class for use by "class layout" below: a filtered location_range.  */
+
+class layout_range
+{
+ public:
+  layout_range (const location_range *loc_range);
+
+  bool contains_point (int row, int column) const;
+
+  layout_point m_start;
+  layout_point m_finish;
+  bool m_show_caret_p;
+  layout_point m_caret;
+};
+
+/* A class to control the overall layout when printing a diagnostic.
+
+   The layout is determined within the constructor.
+   It is then printed by repeatedly calling the "print_line" method.
+   Each such call can print two lines: one for the source line itself,
+   and potentially an "annotation" line, containing carets/underlines.
+
+   We assume we have disjoint ranges.  */
+
+class layout
+{
+ public:
+  layout (diagnostic_context *context,
+	  const diagnostic_info *diagnostic);
+
+  int get_first_line () const { return m_first_line; }
+  int get_last_line () const { return m_last_line; }
+
+  void print_line (int row);
+
+ private:
+  bool
+  get_state_at_point (/* Inputs.  */
+		      int row, int column,
+		      int first_non_ws, int last_non_ws,
+		      /* Outputs.  */
+		      point_state *out_state);
+
+  int
+  get_x_bound_for_row (int row, int caret_column,
+		       int last_non_ws);
+
+ private:
+  diagnostic_context *m_context;
+  pretty_printer *m_pp;
+  diagnostic_t m_diagnostic_kind;
+  expanded_location m_exploc;
+  colorizer m_colorizer;
+  bool m_colorize_source_p;
+  auto_vec <layout_range> m_layout_ranges;
+  int m_first_line;
+  int m_last_line;
+  int m_x_offset;
+};
+
+/* Implementation of "class colorizer".  */
+
+/* The constructor for "colorizer".  Lookup and store color codes for the
+   different kinds of things we might need to print.  */
+
+colorizer::colorizer (diagnostic_context *context,
+		      const diagnostic_info *diagnostic) :
+  m_context (context),
+  m_diagnostic (diagnostic),
+  m_current_state (STATE_NORMAL_TEXT)
+{
+  m_caret_ce = colorize_stop (pp_show_color (context->printer));
+  m_range1_cs = colorize_start (pp_show_color (context->printer), "range1");
+  m_range2_cs = colorize_start (pp_show_color (context->printer), "range2");
+  m_range_ce = colorize_stop (pp_show_color (context->printer));
+}
+
+/* The destructor for "colorizer".  If colorization is on, print a code to
+   turn it off.  */
+
+colorizer::~colorizer ()
+{
+  finish_state (m_current_state);
+}
+
+/* Update state, printing color codes if there's a state
+   change.  */
+
+void
+colorizer::set_state (int new_state)
+{
+  if (m_current_state != new_state)
+    {
+      finish_state (m_current_state);
+      m_current_state = new_state;
+      begin_state (new_state);
+    }
+}
+
+/* Turn on any colorization for STATE.  */
+
+void
+colorizer::begin_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      /* Make range 0 be the same color as the "kind" text
+	 (error vs warning vs note).  */
+      pp_string
+	(m_context->printer,
+	 colorize_start (pp_show_color (m_context->printer),
+			 diagnostic_get_color_for_kind (m_diagnostic->kind)));
+      break;
+
+    case 1:
+      pp_string (m_context->printer, m_range1_cs);
+      break;
+
+    case 2:
+      pp_string (m_context->printer, m_range2_cs);
+      break;
+
+    default:
+      /* We don't expect more than 3 ranges per diagnostic.  */
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Turn off any colorization for STATE.  */
+
+void
+colorizer::finish_state (int state)
+{
+  switch (state)
+    {
+    case STATE_NORMAL_TEXT:
+      break;
+
+    case 0:
+      pp_string (m_context->printer, m_caret_ce);
+      break;
+
+    default:
+      /* Within a range.  */
+      gcc_assert (state > 0);
+      pp_string (m_context->printer, m_range_ce);
+      break;
+    }
+}
+
+/* Implementation of class layout_range.  */
+
+/* The constructor for class layout_range.
+   Initialize various layout_point fields from expanded_location
+   equivalents; we've already filtered on file.  */
+
+layout_range::layout_range (const location_range *loc_range)
+: m_start (loc_range->m_start),
+  m_finish (loc_range->m_finish),
+  m_show_caret_p (loc_range->m_show_caret_p),
+  m_caret (loc_range->m_caret)
+{
+}
+
+/* Is (column, row) within the given range?
+   We've already filtered on the file.
+
+   Ranges are closed (both limits are within the range).
+
+   Example A: a single-line range:
+     start:  (col=22, line=2)
+     finish: (col=38, line=2)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbSwwwwwwwwwwwwwwwFaaaaaaaaaaa
+03|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+
+   Example B: a multiline range with
+     start:  (col=14, line=3)
+     finish: (col=08, line=5)
+
+  |00000011111111112222222222333333333344444444444
+  |34567890123456789012345678901234567890123456789
+--+-----------------------------------------------
+01|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+02|bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+03|bbbbbbbbbbbSwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+04|wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
+05|wwwwwFaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+06|aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+--+-----------------------------------------------
+
+   Legend:
+   - 'b' indicates a point *before* the range
+   - 'S' indicates the start of the range
+   - 'w' indicates a point within the range
+   - 'F' indicates the finish of the range (which is
+	 within it).
+   - 'a' indicates a subsequent point *after* the range.  */
+
+bool
+layout_range::contains_point (int row, int column) const
+{
+  gcc_assert (m_start.m_line <= m_finish.m_line);
+  /* ...but the equivalent isn't true for the columns;
+     consider example B in the comment above.  */
+
+  if (row < m_start.m_line)
+    /* Points before the first line of the range are
+       outside it (corresponding to line 01 in example A
+       and lines 01 and 02 in example B above).  */
+    return false;
+
+  if (row == m_start.m_line)
+    /* On same line as start of range (corresponding
+       to line 02 in example A and line 03 in example B).  */
+    {
+      if (column < m_start.m_column)
+	/* Points on the starting line of the range, but
+	   before the column in which it begins.  */
+	return false;
+
+      if (row < m_finish.m_line)
+	/* This is a multiline range; the point
+	   is within it (corresponds to line 03 in example B
+	   from column 14 onwards).  */
+	return true;
+      else
+	{
+	  /* This is a single-line range.  */
+	  gcc_assert (row == m_finish.m_line);
+	  return column <= m_finish.m_column;
+	}
+    }
+
+  /* The point is in a line beyond that containing the
+     start of the range: lines 03 onwards in example A,
+     and lines 04 onwards in example B.  */
+  gcc_assert (row > m_start.m_line);
+
+  if (row > m_finish.m_line)
+    /* The point is beyond the final line of the range
+       (lines 03 onwards in example A, and lines 06 onwards
+       in example B).  */
+    return false;
+
+  if (row < m_finish.m_line)
+    {
+      /* The point is in a line that's fully within a multiline
+	 range (e.g. line 04 in example B).  */
+      gcc_assert (m_start.m_line < m_finish.m_line);
+      return true;
+    }
+
+  gcc_assert (row == m_finish.m_line);
+
+  return column <= m_finish.m_column;
+}
+
+/* Given a source line LINE of length LINE_WIDTH, determine the width
+   without any trailing whitespace.  */
+
+static int
+get_line_width_without_trailing_whitespace (const char *line, int line_width)
+{
+  int result = line_width;
+  while (result > 0)
+    {
+      char ch = line[result - 1];
+      if (ch == ' ' || ch == '\t')
+	result--;
+      else
+	break;
+    }
+  gcc_assert (result >= 0);
+  gcc_assert (result <= line_width);
+  gcc_assert (result == 0 ||
+	      (line[result - 1] != ' '
+	       && line[result - 1] != '\t'));
+  return result;
+}
+
+/* Implementation of class layout.  */
+
+/* Constructor for class layout.
+
+   Filter the ranges from the rich_location to those that we can
+   sanely print, populating m_layout_ranges.
+   Determine the range of lines that we will print.
+   Determine m_x_offset, to ensure that the primary caret
+   will fit within the max_width provided by the diagnostic_context.  */
+
+layout::layout (diagnostic_context * context,
+		const diagnostic_info *diagnostic)
+: m_context (context),
+  m_pp (context->printer),
+  m_diagnostic_kind (diagnostic->kind),
+  m_exploc (diagnostic->richloc->lazily_expand_location ()),
+  m_colorizer (context, diagnostic),
+  m_colorize_source_p (context->colorize_source_p),
+  m_layout_ranges (rich_location::MAX_RANGES),
+  m_first_line (m_exploc.line),
+  m_last_line (m_exploc.line),
+  m_x_offset (0)
+{
+  rich_location *richloc = diagnostic->richloc;
+  for (unsigned int idx = 0; idx < richloc->get_num_locations (); idx++)
+    {
+      /* This diagnostic printer can only cope with "sufficiently sane" ranges.
+	 Ignore any ranges that are awkward to handle.  */
+      location_range *loc_range = richloc->get_range (idx);
+
+      /* If any part of the range isn't in the same file as the primary
+	 location of this diagnostic, ignore the range.  */
+      if (loc_range->m_start.file != m_exploc.file)
+	continue;
+      if (loc_range->m_finish.file != m_exploc.file)
+	continue;
+      if (loc_range->m_show_caret_p)
+	if (loc_range->m_caret.file != m_exploc.file)
+	  continue;
+
+      /* Passed all the tests; add the range to m_layout_ranges so that
+	 it will be printed.  */
+      layout_range ri (loc_range);
+      m_layout_ranges.safe_push (ri);
+
+      /* Update m_first_line/m_last_line if necessary.  */
+      if (loc_range->m_start.line < m_first_line)
+	m_first_line = loc_range->m_start.line;
+      if (loc_range->m_finish.line > m_last_line)
+	m_last_line = loc_range->m_finish.line;
+    }
+
+  /* Adjust m_x_offset.
+     Center the primary caret to fit in max_width; all columns
+     will be adjusted accordingly.  */
+  int max_width = m_context->caret_max_width;
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, m_exploc.line,
+					       &line_width);
+  if (line && m_exploc.column <= line_width)
+    {
+      int right_margin = CARET_LINE_MARGIN;
+      int column = m_exploc.column;
+      right_margin = MIN (line_width - column, right_margin);
+      right_margin = max_width - right_margin;
+      if (line_width >= max_width && column > right_margin)
+	m_x_offset = column - right_margin;
+      gcc_assert (m_x_offset >= 0);
+    }
+
+  if (0)
+    show_ruler (context, line_width, m_x_offset);
+}
+
+/* Print text describing a line of source code.
+   This typically prints two lines:
+
+   (1) the source code itself, potentially colorized at any ranges, and
+   (2) an annotation line containing any carets/underlines
+   describing the ranges.  */
+
+void
+layout::print_line (int row)
+{
+  int line_width;
+  const char *line = location_get_source_line (m_exploc.file, row,
+					       &line_width);
+  if (!line)
+    return;
+
+  line += m_x_offset;
+
+  m_colorizer.set_normal_text ();
+
+  /* Step 1: print the source code line.  */
+
+  /* We will stop printing at any trailing whitespace.  */
+  line_width
+    = get_line_width_without_trailing_whitespace (line,
+						  line_width);
+  pp_space (m_pp);
+  int first_non_ws = INT_MAX;
+  int last_non_ws = 0;
+  int column;
+  for (column = 1 + m_x_offset; column <= line_width; column++)
+    {
+      /* Assuming colorization is enabled for the caret and underline
+	 characters, we may also colorize the associated characters
+	 within the source line.
+
+	 For frontends that generate range information, we color the
+	 associated characters in the source line the same as the
+	 carets and underlines in the annotation line, to make it easier
+	 for the reader to see the pertinent code.
+
+	 For frontends that only generate carets, we don't colorize the
+	 characters above them, since this would look strange (e.g.
+	 colorizing just the first character in a token).  */
+      if (m_colorize_source_p)
+	{
+	  bool in_range_p;
+	  point_state state;
+	  in_range_p = get_state_at_point (row, column,
+					   0, INT_MAX,
+					   &state);
+	  if (in_range_p)
+	    m_colorizer.set_range (state.range_idx);
+	  else
+	    m_colorizer.set_normal_text ();
+	}
+      char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
+      if (c != ' ')
+	{
+	  last_non_ws = column;
+	  if (first_non_ws == INT_MAX)
+	    first_non_ws = column;
+	}
+      pp_character (m_pp, c);
+      line++;
+    }
+  pp_newline (m_pp);
+
+  /* Step 2: print a line consisting of the caret/underlines for the
+     given source line.  */
+  int x_bound = get_x_bound_for_row (row, m_exploc.column,
+				     last_non_ws);
+
+  pp_space (m_pp);
+  for (int column = 1 + m_x_offset; column < x_bound; column++)
+    {
+      bool in_range_p;
+      point_state state;
+      in_range_p = get_state_at_point (row, column,
+				       first_non_ws, last_non_ws,
+				       &state);
+      if (in_range_p)
+	{
+	  /* Within a range.  Draw either the caret or an underline.  */
+	  m_colorizer.set_range (state.range_idx);
+	  if (state.draw_caret_p)
+	    /* Draw the caret.  */
+	    pp_character (m_pp, m_context->caret_chars[state.range_idx]);
+	  else
+	    pp_character (m_pp, '~');
+	}
+      else
+	{
+	  /* Not in a range.  */
+	  m_colorizer.set_normal_text ();
+	  pp_character (m_pp, ' ');
+	}
+    }
+  pp_newline (m_pp);
+}
+
+/* Return true if (ROW/COLUMN) is within a range of the layout.
+   If it returns true, OUT_STATE is written to, with the
+   range index, and whether we should draw the caret at
+   (ROW/COLUMN) (as opposed to an underline).  */
+
+bool
+layout::get_state_at_point (/* Inputs.  */
+			    int row, int column,
+			    int first_non_ws, int last_non_ws,
+			    /* Outputs.  */
+			    point_state *out_state)
+{
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (0)
+	fprintf (stderr,
+		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
+		 range->m_start.m_line,
+		 range->m_start.m_column,
+		 range->m_finish.m_line,
+		 range->m_finish.m_column,
+		 row,
+		 column,
+		 range->contains_point (row, column) ? "true" : "false");
+
+      if (range->contains_point (row, column))
+	{
+	  out_state->range_idx = i;
+
+	  /* Are we at the range's caret?  Is it visible?  */
+	  out_state->draw_caret_p = false;
+	  if (row == range->m_caret.m_line
+	      && column == range->m_caret.m_column)
+	    out_state->draw_caret_p = range->m_show_caret_p;
+
+	  /* Within a multiline range, don't display any underline
+	     in any leading or trailing whitespace on a line.
+	     We do display carets, however.  */
+	  if (!out_state->draw_caret_p)
+	    if (column < first_non_ws || column > last_non_ws)
+	      return false;
+
+	  /* We are within a range.  */
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Helper function for use by layout::print_line when printing the
+   annotation line under the source line.
+   Get the column beyond the rightmost one that could contain a caret or
+   range marker, given that we stop rendering at trailing whitespace.
+   ROW is the source line within the given file.
+   CARET_COLUMN is the column of range 0's caret.
+   LAST_NON_WS_COLUMN is the last column containing a non-whitespace
+   character of source (as determined when printing the source line).  */
+
+int
+layout::get_x_bound_for_row (int row, int caret_column,
+			     int last_non_ws_column)
+{
+  int result = caret_column + 1;
+
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+    {
+      if (row >= range->m_start.m_line)
+	{
+	  if (range->m_finish.m_line == row)
+	    {
+	      /* On the final line within a range; ensure that
+		 we render up to the end of the range.  */
+	      if (result <= range->m_finish.m_column)
+		result = range->m_finish.m_column + 1;
+	    }
+	  else if (row < range->m_finish.m_line)
+	    {
+	      /* Within a multiline range; ensure that we render up to the
+		 last non-whitespace column.  */
+	      if (result <= last_non_ws_column)
+		result = last_non_ws_column + 1;
+	    }
+	}
+    }
+
+  return result;
+}
+
+} /* End of anonymous namespace.  */
+
+/* For debugging layout issues in diagnostic_show_locus and friends,
+   render a ruler giving column numbers (after the 1-column indent).  */
+
+static void
+show_ruler (diagnostic_context *context, int max_width, int x_offset)
+{
+  /* Hundreds.  */
+  if (max_width > 99)
+    {
+      pp_space (context->printer);
+      for (int column = 1 + x_offset; column < max_width; column++)
+	if (0 == column % 10)
+	  pp_character (context->printer, '0' + (column / 100) % 10);
+	else
+	  pp_space (context->printer);
+      pp_newline (context->printer);
+    }
+
+  /* Tens.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    if (0 == column % 10)
+      pp_character (context->printer, '0' + (column / 10) % 10);
+    else
+      pp_space (context->printer);
+  pp_newline (context->printer);
+
+  /* Units.  */
+  pp_space (context->printer);
+  for (int column = 1 + x_offset; column < max_width; column++)
+    pp_character (context->printer, '0' + (column % 10));
+  pp_newline (context->printer);
+}
+
+/* Print the physical source code corresponding to the location of
+   this diagnostic, with additional annotations.  */
+
 void
 diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
@@ -51,14 +722,25 @@ diagnostic_show_locus (diagnostic_context * context,
     return;
 
   context->last_location = diagnostic_location (diagnostic, 0);
-  expanded_location s0 = diagnostic_expand_location (diagnostic, 0);
-  expanded_location s1 = { };
-  /* Zero-initialized. This is checked later by diagnostic_print_caret_line.  */
 
-  if (diagnostic_location (diagnostic, 1) > BUILTINS_LOCATION)
-    s1 = diagnostic_expand_location (diagnostic, 1);
+  pp_newline (context->printer);
+
+  const char *saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer, NULL);
+
+  {
+    layout layout (context, diagnostic);
+    int last_line = layout.get_last_line ();
+    for (int row = layout.get_first_line ();
+	 row <= last_line;
+	 row++)
+      layout.print_line (row);
+
+    /* The closing scope here leads to the dtor for layout and thus
+       colorizer being called here, which affects the precise
+       place where colorization is turned off in the unittest
+       for colorized output.  */
+  }
 
-  diagnostic_print_caret_line (context, s0, s1,
-			       context->caret_chars[0],
-			       context->caret_chars[1]);
+  pp_set_prefix (context->printer, saved_prefix);
 }
-- 
1.8.5.3
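
The closed-range test in layout_range::contains_point can be checked
against examples A and B from its comment with a condensed standalone
version.  This is a sketch under assumptions: "point" and "closed_range"
are hypothetical stand-ins for layout_point and layout_range, and the
three early returns are intended to be equivalent to the case analysis
in the patch, not a copy of it.

```cpp
#include <cassert>

// Hypothetical stand-ins for layout_point and layout_range from the patch;
// only the fields that contains_point reads are kept.
struct point
{
  int line;
  int column;
};

struct closed_range
{
  point start;
  point finish;

  // True if (ROW, COLUMN) lies within [start, finish], both ends inclusive.
  bool contains_point (int row, int column) const
  {
    assert (start.line <= finish.line);
    if (row < start.line || row > finish.line)
      return false;	/* Above the first or below the last line.  */
    if (row == start.line && column < start.column)
      return false;	/* On the first line, left of the start.  */
    if (row == finish.line && column > finish.column)
      return false;	/* On the last line, right of the finish.  */
    return true;
  }
};
```

For example B, a range from (line 3, column 14) to (line 5, column 8),
the sketch reports every column of line 4 as inside, column 13 of line 3
as outside, and column 9 of line 5 as outside, matching the 'w', 'b' and
'a' markers in the comment.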


* Re: [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5)
  2015-10-27 23:12     ` Jeff Law
@ 2015-10-28 17:52       ` David Malcolm
  2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
                           ` (3 more replies)
  0 siblings, 4 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-28 17:52 UTC (permalink / raw)
  To: law; +Cc: gcc-patches, David Malcolm

On Tue, 2015-10-27 at 17:02 -0600, Jeff Law wrote:
> On 10/23/2015 02:41 PM, David Malcolm wrote:
> > The change since v4 can be seen at:
> >   https://dmalcolm.fedorapeople.org/gcc/2015-10-23/0001-Add-colorize_source_p-to-diagnostic_context.patch
> > which is a tweak to colorization, to handle both frontends that provide
> > ranges and those that only provide carets, and provide a smoother
> > transition path for the latter.
> >
> > gcc/ChangeLog:
> > 	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
> > 	and "range2".
> > 	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
> > 	* diagnostic-core.h (warning_at_rich_loc): New declaration.
> > 	(error_at_rich_loc): New declaration.
> > 	(permerror_at_rich_loc): New declaration.
> > 	(inform_at_rich_loc): New declaration.
> > 	* diagnostic-show-locus.c (adjust_line): Delete.
> > 	(struct point_state): New struct.
> > 	(class colorizer): New class.
> > 	(class layout_point): New class.
> > 	(class layout_range): New class.
> > 	(class layout): New class.
> > 	(colorizer::colorizer): New ctor.
> > 	(colorizer::~colorizer): New dtor.
> > 	(layout::layout): New ctor.
> > 	(layout::print_line): New method.
> > 	(layout::get_state_at_point): New method.
> > 	(layout::get_x_bound_for_row): New method.
> > 	(show_ruler): New function.
> > 	(diagnostic_show_locus): Reimplement in terms of class layout.
> > 	* diagnostic.c (diagnostic_initialize): Replace
> > 	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
> > 	(diagnostic_set_info_translated): Convert param from location_t
> > 	to rich_location *.  Eliminate calls to set_location on the
> > 	message in favor of storing the rich_location ptr there.
> > 	(diagnostic_set_info): Convert param from location_t to
> > 	rich_location *.
> > 	(diagnostic_build_prefix): Break out array into...
> > 	(diagnostic_kind_color): New variable.
> > 	(diagnostic_get_color_for_kind): New function.
> > 	(diagnostic_report_diagnostic): Colorize the option_text
> > 	using the color for the severity.
> > 	(diagnostic_append_note): Update for change in signature of
> > 	diagnostic_set_info.
> > 	(diagnostic_append_note_at_rich_loc): New function.
> > 	(emit_diagnostic): Update for change in signature of
> > 	diagnostic_set_info.
> > 	(inform): Likewise.
> > 	(inform_at_rich_loc): New function.
> > 	(inform_n): Update for change in signature of diagnostic_set_info.
> > 	(warning): Likewise.
> > 	(warning_at): Likewise.
> > 	(warning_at_rich_loc): New function.
> > 	(warning_n): Update for change in signature of diagnostic_set_info.
> > 	(pedwarn): Likewise.
> > 	(permerror): Likewise.
> > 	(permerror_at_rich_loc): New function.
> > 	(error): Update for change in signature of diagnostic_set_info.
> > 	(error_n): Likewise.
> > 	(error_at): Likewise.
> > 	(error_at_rich_loc): New function.
> > 	(sorry): Update for change in signature of diagnostic_set_info.
> > 	(fatal_error): Likewise.
> > 	(internal_error): Likewise.
> > 	(internal_error_no_backtrace): Likewise.
> > 	(source_range::debug): New function.
> > 	* diagnostic.h (struct diagnostic_info): Eliminate field
> > 	"override_column".  Add field "richloc".
> > 	(struct diagnostic_context): Add field "colorize_source_p".
> > 	(diagnostic_override_column): Delete.
> > 	(diagnostic_set_info): Convert param from location_t to
> > 	rich_location *.
> > 	(diagnostic_set_info_translated): Likewise.
> > 	(diagnostic_append_note_at_rich_loc): New function.
> > 	(diagnostic_num_locations): New function.
> > 	(diagnostic_expand_location): Get the location from the
> > 	rich_location.
> > 	(diagnostic_print_caret_line): Delete.
> > 	(diagnostic_get_color_for_kind): New declaration.
> > 	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
> > 	(error_cb): Update for change in signature of "error" callback.
> > 	(fatal_at): Likewise.
> > 	(warning_at): Likewise.
> > 	* input.c (linemap_client_expand_location_to_spelling_point): New.
> > 	* pretty-print.c (text_info::set_range): New method.
> > 	(text_info::get_location): New method.
> > 	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
> > 	(struct text_info): Eliminate "locations" array in favor of
> > 	"m_richloc", a rich_location *.
> > 	(textinfo::set_location): Add a "caret_p" param, and reimplement
> > 	in terms of a call to set_range.
> > 	(textinfo::get_location): Eliminate inline implementation in favor of
> > 	an out-of-line reimplementation.
> > 	(textinfo::set_range): New method.
> > 	* rtl-error.c (diagnostic_for_asm): Update for change in signature
> > 	of diagnostic_set_info.
> > 	* tree-diagnostic.c (default_tree_printer): Update for new
> > 	"caret_p" param for textinfo::set_location.
> > 	* tree-pretty-print.c (percent_K_format): Likewise.
> >
> > gcc/c-family/ChangeLog:
> > 	* c-common.c (c_cpp_error): Convert parameter from location_t to
> > 	rich_location *.  Eliminate the "column_override" parameter and
> > 	the call to diagnostic_override_column.
> > 	Update the "done_lexing" clause to set range 0
> > 	on the rich_location, rather than overwriting a location_t.
> > 	* c-common.h (c_cpp_error): Convert parameter from location_t to
> > 	rich_location *.  Eliminate the "column_override" parameter.
> >
> > gcc/c/ChangeLog:
> > 	* c-decl.c (warn_defaults_to): Update for change in signature
> > 	of diagnostic_set_info.
> > 	* c-errors.c (pedwarn_c99): Likewise.
> > 	(pedwarn_c90): Likewise.
> > 	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
> > 	for textinfo::set_location.
> >
> > gcc/cp/ChangeLog:
> > 	* error.c (cp_printer): Update for new "caret_p" param for
> > 	textinfo::set_location.
> > 	(pedwarn_cxx98): Update for change in signature of
> > 	diagnostic_set_info.
> >
> > gcc/fortran/ChangeLog:
> > 	* cpp.c (cb_cpp_error): Convert parameter from location_t to
> > 	rich_location *.  Eliminate the "column_override" parameter.
> > 	* error.c (gfc_warning): Update for change in signature of
> > 	diagnostic_set_info.
> > 	(gfc_format_decoder): Update handling of %C/%L for changes
> > 	to struct text_info.
> > 	(gfc_diagnostic_starter): Use richloc when determining whether to
> > 	print one locus or two.  When handling a location that will
> > 	involve a call to diagnostic_show_locus, only attempt to print the
> > 	locus for the primary location, and don't call into
> > 	diagnostic_print_caret_line.
> > 	(gfc_warning_now_at): Update for change in signature of
> > 	diagnostic_set_info.
> > 	(gfc_warning_now): Likewise.
> > 	(gfc_error_now): Likewise.
> > 	(gfc_fatal_error): Likewise.
> > 	(gfc_error): Likewise.
> > 	(gfc_internal_error): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> > 	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
> > 	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
> > 	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
> > 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
> > 	* lib/gcc-dg.exp: Load multiline.exp.
> >
> > libcpp/ChangeLog:
> > 	* errors.c (cpp_diagnostic): Update for change in signature
> > 	of "error" callback.
> > 	(cpp_diagnostic_with_line): Likewise, calling override_column
> > 	on the rich_location.
> > 	* include/cpplib.h (struct cpp_callbacks): Within "error"
> > 	callback, convert param from source_location to rich_location *,
> > 	and drop column_override param.
> > 	* include/line-map.h (struct source_range): New struct.
> > 	(struct location_range): New struct.
> > 	(class rich_location): New class.
> > 	(linemap_client_expand_location_to_spelling_point): New declaration.
> > 	* line-map.c (rich_location::rich_location): New ctors.
> > 	(rich_location::lazily_expand_location): New method.
> > 	(rich_location::override_column): New method.
> > 	(rich_location::add_range): New methods.
> > 	(rich_location::set_range): New method.
> > ---
> 
> > diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
> > index 147a2b8..6865209 100644
> > --- a/gcc/diagnostic-show-locus.c
> > +++ b/gcc/diagnostic-show-locus.c
> I'm having trouble breaking this down into manageable hunks to look at. 
>   Are there bits in here that we can pull out as separate patches?  It 
> looks like git diff is just making a mess of this file when I think it's 
> a huge chunk of new code and a few deletes.
> 
> If you have a blob of new code and a blob of deletes, even breaking it 
> down that way may help in this case (ie, a patch with new classes & 
> code, then a pass that deletes old crud we're not going to use anymore).

I've split patch 4 of the kit into 3 sub-patches:

  [PATCH 4a] diagnostic-show-locus.c changes: Deletions
  [PATCH 4b] diagnostic-show-locus.c changes: Insertions
  [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes

4a, 4b, and 4c should appear as followups to this mail (assuming my
"git send-email" command works OK).  They're only split up for
ease-of-review purposes; they're intended to be committed as one commit.

The 4a/4b split seems to have allowed "git diff" to do a
much better job on diagnostic-show-locus.c.

Patch 4c contains updates based on your review comments below; a diff
relative to the prior version of the patch can be seen at:
 https://dmalcolm.fedorapeople.org/gcc/2015-10-28/rich-locations/0001-Fix-issues-found-in-review.patch

> > +void
> > +source_range::debug (const char *msg) const
> > +{
> > +  rich_location richloc (m_start);
> > +  richloc.add_range (m_start, m_finish);
> > +  inform_at_rich_loc (&richloc, "%s", msg);
> > +}
> Function comment.  Do you need a DEBUG_FUNCTION annotation here?

Added function comment and DEBUG_FUNCTION annotation.

Note that this is slightly messy: the method is declared in libcpp
(in libcpp/include/line-map.h), but defined in gcc: it uses the
gcc diagnostics-printing machinery to print a note highlighting the
range.  The method is sufficiently useful for debugging to warrant
this layering violation, in my opinion (I've been using it all the
time when working on this code).  I've called out this wart in the
new comments.

> > +extern const char *
> > +diagnostic_get_color_for_kind (diagnostic_t kind);
> If this will fit on one line, then combine the lines.

Fixed.

> >   /* Pure text formatting support functions.  */
> >   extern char *file_name_as_prefix (diagnostic_context *, const char *);
> 
> > diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
> > index 3825751..4b3d31c 100644
> > --- a/gcc/fortran/error.c
> > +++ b/gcc/fortran/error.c
> I'm having a hard time mapping the code you removed from 
> fortran/error.c::gfc_diagnostic_starter to its functional equivalent in 
> your new code.  I know we've discussed this issue a few times on the 
> phone, so I don't doubt you're handling it.  I just want to know where 
> so I can double-check things a bit.

Both old and new implementations of gfc_diagnostic_starter
determine if we have just one locus, or more than one
(bool one_locus).

Both old and new implementations determine if
diagnostic_show_locus is going to print anything.
There's some logic for the no-caret-printed case to
print multiple prefix lines, which is the same for
both old and new implementations; this is what the existing
test cases look for (the existing test cases all implicitly use
-fno-diagnostics-show-caret and so use this branch).

For the caret-printing case, both old and new implementations
call diagnostic_show_locus.  The old implementation
would then call some follow-up code, which is what the
removed code is: it would determine if the 2nd caret was
on a different line, and if so, print that line via
diagnostic_print_caret_line.
This followup code was needed by the old implementation since
diagnostic_show_locus would only print a single line of source
code (the one with the caret).
It's not needed by the new implementation since
diagnostic_show_locus will print the lines containing both
carets (and any lines in between).
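
To illustrate the behavioral difference described above, here is a
minimal standalone sketch; it is not the code from the patch.  The
names "loc" and "render_span" are hypothetical, and the annotation
layout is an assumption made for the example: it prints every line
from the one holding the first caret to the one holding the second,
with a '^' row under each line that has a caret.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

/* Hypothetical stand-in for an expanded location: 1-based line and
   column.  */
struct loc
{
  int line;
  int column;
};

/* Return the text of every line of SOURCE from the line holding the
   first caret to the line holding the second, inclusive, adding a '^'
   row beneath each line that holds a caret.  */
std::string
render_span (const std::vector<std::string> &source, loc a, loc b)
{
  if (b.line < a.line)
    std::swap (a, b);
  assert (a.line >= 1 && (std::size_t) b.line <= source.size ());
  assert (a.column >= 1 && b.column >= 1);

  std::ostringstream out;
  for (int line = a.line; line <= b.line; ++line)
    {
      out << source[line - 1] << '\n';

      /* Build the annotation row for this line, if a caret lands here.  */
      std::string annot;
      for (loc c : {a, b})
        if (c.line == line)
          {
            if (annot.size () < (std::size_t) c.column)
              annot.resize (c.column, ' ');
            annot[c.column - 1] = '^';
          }
      if (!annot.empty ())
        out << annot << '\n';
    }
  return out.str ();
}
```

With a three-line input and carets at 1:1 and 3:5, all three lines are
emitted in one pass, so no follow-up call is needed to print the line
of the second caret.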

> > diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> > index 102a635..6bfde06 100644
> > --- a/gcc/genmatch.c
> > +++ b/gcc/genmatch.c
> > @@ -53,14 +53,23 @@ unsigned verbose;
> >
> >   static struct line_maps *line_table;
> >
> > +expanded_location
> > +linemap_client_expand_location_to_spelling_point (source_location loc)
> > +{
> > +  const struct line_map_ordinary *map;
> > +  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
> > +  return linemap_expand_location (line_table, map, loc);
> > +}
> Function comment.

Fixed.

> > diff --git a/gcc/input.c b/gcc/input.c
> > index ff80dd9..baf8e7e 100644
> > --- a/gcc/input.c
> > +++ b/gcc/input.c
> > @@ -751,6 +751,13 @@ expand_location_to_spelling_point (source_location loc)
> >     return expand_location_1 (loc, /*expansion_point_p=*/false);
> >   }
> >
> > +expanded_location
> > +linemap_client_expand_location_to_spelling_point (source_location loc)
> > +{
> > +  return expand_location_to_spelling_point (loc);
> > +}
> Likewise.

Fixed.

> > diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> > index 09378f9..84a5ab7 100644
> > --- a/libcpp/include/line-map.h
> > +++ b/libcpp/include/line-map.h
> > @@ -131,6 +131,35 @@ typedef unsigned int linenum_type;
> >     libcpp/location-example.txt.  */
> >   typedef unsigned int source_location;
> >
> > +/* A range of source locations.
> > +
> > +   Ranges are closed:
> > +   m_start is the first location within the range,
> > +   m_finish is the last location within the range.
> > +
> > +   We may need a more compact way to store these, but for now,
> > +   let's do it the simple way, as a pair.  */
> > +struct GTY(()) source_range
> > +{
> > +  source_location m_start;
> > +  source_location m_finish;
> > +
> > +  void debug (const char *msg) const;
> Do you need a DEBUG_FUNCTION annotation here?

As noted above, this method is declared in libcpp but defined in gcc.
DEBUG_FUNCTION is in gcc, so is unavailable in libcpp.  I've added
a function comment here, which notes this.

> > @@ -1028,6 +1057,175 @@ typedef struct
> >     bool sysp;
> >   } expanded_location;
> >
> > +class rich_location
> [ ... ]
> 
> > +
> > +  void
> > +  add_range (source_location start, source_location finish,
> > +	     bool show_caret_p = false);
> > +
> > +  void
> > +  add_range (source_range src_range,
> > +	     bool show_caret_p = false);
> Do we really want to bother with default arguments?  Is it buying us 
> some level of cleanliness that's hard to otherwise achieve?  Given this 
> is new code I just don't see the value here.  Educate me.

I think I was just being lazy.  Almost all calls were with
show_caret_p == false, but those were in the test plugin.
I've eliminated the default values for the arguments,
supplying explicit values.
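
As a rough sketch of what the change looks like at call sites (a
simplified, hypothetical stand-in; "range_list", "num_ranges" and
"num_carets" are invented for the example, and only the shape of
add_range follows the declaration quoted above):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned int source_location;

/* Hypothetical, much-simplified stand-in for the range-tracking part
   of rich_location.  */
class range_list
{
public:
  /* No default for SHOW_CARET_P: every caller must say what it wants.  */
  void
  add_range (source_location start, source_location finish,
             bool show_caret_p)
  {
    /* Sketch-only sanity check; the real code may not require this.  */
    assert (start <= finish);
    range r = { start, finish, show_caret_p };
    m_ranges.push_back (r);
  }

  std::size_t num_ranges () const { return m_ranges.size (); }

  /* Count the ranges that asked for a caret.  */
  std::size_t
  num_carets () const
  {
    std::size_t n = 0;
    for (std::size_t i = 0; i < m_ranges.size (); i++)
      if (m_ranges[i].show_caret_p)
        n++;
    return n;
  }

private:
  struct range
  {
    source_location start;
    source_location finish;
    bool show_caret_p;
  };
  std::vector<range> m_ranges;
};
```

A call such as add_range (10, 20, true) followed by
add_range (30, 40, false) then states the caret choice explicitly at
each site, rather than relying on a default.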


Hopefully this addresses your concerns, and the -show-locus.c code
is now reviewable.

Is the combined 4a+4b+4c patch OK for trunk, assuming it passes
bootstrap&regtest? (running it now)

Thanks
Dave

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes
  2015-10-28 17:52       ` David Malcolm
  2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
  2015-10-28 17:51         ` [PATCH 4a] diagnostic-show-locus.c changes: Deletions David Malcolm
@ 2015-10-28 17:59         ` David Malcolm
  2015-10-30  4:49         ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) Jeff Law
  3 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-28 17:59 UTC (permalink / raw)
  To: law; +Cc: gcc-patches, David Malcolm

Reimplement diagnostic_show_locus, introducing rich_location classes.

gcc/ChangeLog:
	* diagnostic-color.c (color_dict): Eliminate "caret"; add "range1"
	and "range2".
	(parse_gcc_colors): Update comment to describe default GCC_COLORS.
	* diagnostic-core.h (warning_at_rich_loc): New declaration.
	(error_at_rich_loc): New declaration.
	(permerror_at_rich_loc): New declaration.
	(inform_at_rich_loc): New declaration.
	* diagnostic.c (diagnostic_initialize): Replace
	MAX_LOCATIONS_PER_MESSAGE with rich_location::MAX_RANGES.
	(diagnostic_set_info_translated): Convert param from location_t
	to rich_location *.  Eliminate calls to set_location on the
	message in favor of storing the rich_location ptr there.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_build_prefix): Break out array into...
	(diagnostic_kind_color): New variable.
	(diagnostic_get_color_for_kind): New function.
	(diagnostic_report_diagnostic): Colorize the option_text
	using the color for the severity.
	(diagnostic_append_note): Update for change in signature of
	diagnostic_set_info.
	(diagnostic_append_note_at_rich_loc): New function.
	(emit_diagnostic): Update for change in signature of
	diagnostic_set_info.
	(inform): Likewise.
	(inform_at_rich_loc): New function.
	(inform_n): Update for change in signature of diagnostic_set_info.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_at_rich_loc): New function.
	(warning_n): Update for change in signature of diagnostic_set_info.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(permerror_at_rich_loc): New function.
	(error): Update for change in signature of diagnostic_set_info.
	(error_n): Likewise.
	(error_at): Likewise.
	(error_at_rich_loc): New function.
	(sorry): Update for change in signature of diagnostic_set_info.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(source_range::debug): New function.
	* diagnostic.h (struct diagnostic_info): Eliminate field
	"override_column".  Add field "richloc".
	(struct diagnostic_context): Add field "colorize_source_p".
	(diagnostic_override_column): Delete.
	(diagnostic_set_info): Convert param from location_t to
	rich_location *.
	(diagnostic_set_info_translated): Likewise.
	(diagnostic_append_note_at_rich_loc): New function.
	(diagnostic_num_locations): New function.
	(diagnostic_expand_location): Get the location from the
	rich_location.
	(diagnostic_print_caret_line): Delete.
	(diagnostic_get_color_for_kind): New declaration.
	* genmatch.c (linemap_client_expand_location_to_spelling_point): New.
	(error_cb): Update for change in signature of "error" callback.
	(fatal_at): Likewise.
	(warning_at): Likewise.
	* input.c (linemap_client_expand_location_to_spelling_point): New.
	* pretty-print.c (text_info::set_range): New method.
	(text_info::get_location): New method.
	* pretty-print.h (MAX_LOCATIONS_PER_MESSAGE): Eliminate this macro.
	(struct text_info): Eliminate "locations" array in favor of
	"m_richloc", a rich_location *.
	(textinfo::set_location): Add a "caret_p" param, and reimplement
	in terms of a call to set_range.
	(textinfo::get_location): Eliminate inline implementation in favor of
	an out-of-line reimplementation.
	(textinfo::set_range): New method.
	* rtl-error.c (diagnostic_for_asm): Update for change in signature
	of diagnostic_set_info.
	* tree-diagnostic.c (default_tree_printer): Update for new
	"caret_p" param for textinfo::set_location.
	* tree-pretty-print.c (percent_K_format): Likewise.

gcc/c-family/ChangeLog:
	* c-common.c (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter and
	the call to diagnostic_override_column.
	Update the "done_lexing" clause to set range 0
	on the rich_location, rather than overwriting a location_t.
	* c-common.h (c_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Update for change in signature
	of diagnostic_set_info.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.
	* c-objc-common.c (c_tree_printer): Update for new "caret_p" param
	for textinfo::set_location.

gcc/cp/ChangeLog:
	* error.c (cp_printer): Update for new "caret_p" param for
	textinfo::set_location.
	(pedwarn_cxx98): Update for change in signature of
	diagnostic_set_info.

gcc/fortran/ChangeLog:
	* cpp.c (cb_cpp_error): Convert parameter from location_t to
	rich_location *.  Eliminate the "column_override" parameter.
	* error.c (gfc_warning): Update for change in signature of
	diagnostic_set_info.
	(gfc_format_decoder): Update handling of %C/%L for changes
	to struct text_info.
	(gfc_diagnostic_starter): Use richloc when determining whether to
	print one locus or two.  When handling a location that will
	involve a call to diagnostic_show_locus, only attempt to print the
	locus for the primary location, and don't call into
	diagnostic_print_caret_line.
	(gfc_warning_now_at): Update for change in signature of
	diagnostic_set_info.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/diagnostic-test-show-locus-bw.c: New file.
	* gcc.dg/plugin/diagnostic-test-show-locus-color.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
	* lib/gcc-dg.exp: Load multiline.exp.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Update for change in signature
	of "error" callback.
	(cpp_diagnostic_with_line): Likewise, calling override_column
	on the rich_location.
	* include/cpplib.h (struct cpp_callbacks): Within "error"
	callback, convert param from source_location to rich_location *,
	and drop column_override param.
	* include/line-map.h (struct source_range): New struct.
	(struct location_range): New struct.
	(class rich_location): New class.
	(linemap_client_expand_location_to_spelling_point): New declaration.
	* line-map.c (rich_location::rich_location): New ctors.
	(rich_location::lazily_expand_location): New method.
	(rich_location::override_column): New method.
	(rich_location::add_range): New methods.
	(rich_location::set_range): New method.
---
 gcc/c-family/c-common.c                            |  15 +-
 gcc/c-family/c-common.h                            |   4 +-
 gcc/c/c-decl.c                                     |   3 +-
 gcc/c/c-errors.c                                   |  12 +-
 gcc/c/c-objc-common.c                              |   2 +-
 gcc/cp/error.c                                     |   5 +-
 gcc/diagnostic-color.c                             |   5 +-
 gcc/diagnostic-core.h                              |   8 +
 gcc/diagnostic.c                                   | 202 +++++++++++--
 gcc/diagnostic.h                                   |  54 ++--
 gcc/fortran/cpp.c                                  |  13 +-
 gcc/fortran/error.c                                | 103 ++-----
 gcc/genmatch.c                                     |  35 ++-
 gcc/input.c                                        |  16 +
 gcc/pretty-print.c                                 |  21 ++
 gcc/pretty-print.h                                 |  25 +-
 gcc/rtl-error.c                                    |   3 +-
 .../gcc.dg/plugin/diagnostic-test-show-locus-bw.c  | 149 ++++++++++
 .../plugin/diagnostic-test-show-locus-color.c      | 158 ++++++++++
 .../plugin/diagnostic_plugin_test_show_locus.c     | 326 +++++++++++++++++++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   3 +
 gcc/testsuite/lib/gcc-dg.exp                       |   1 +
 gcc/tree-diagnostic.c                              |   2 +-
 gcc/tree-pretty-print.c                            |   2 +-
 libcpp/errors.c                                    |   7 +-
 libcpp/include/cpplib.h                            |   4 +-
 libcpp/include/line-map.h                          | 218 ++++++++++++++
 libcpp/line-map.c                                  | 130 ++++++++
 28 files changed, 1337 insertions(+), 189 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b64a44..4a5ccb7 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10477,15 +10477,14 @@ c_option_controlling_cpp_error (int reason)
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION unless this is after lexing and the compiler's location
-   should be used instead, with column number possibly overridden by
-   COLUMN_OVERRIDE if not zero; MSG is the translated message and AP
+   RICHLOC unless this is after lexing and the compiler's location
+   should be used instead; MSG is the translated message and AP
    the arguments.  Returns true if a diagnostic was emitted, false
    otherwise.  */
 
 bool
 c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	     location_t location, unsigned int column_override,
+	     rich_location *richloc,
 	     const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -10526,11 +10525,11 @@ c_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   if (done_lexing)
-    location = input_location;
+    richloc->set_range (0,
+			source_range::from_location (input_location),
+			true, true);
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   diagnostic_override_option_index (&diagnostic,
                                     c_option_controlling_cpp_error (reason));
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index d5fb499..b0a7661 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -995,9 +995,9 @@ extern void init_c_lex (void);
 
 extern void c_cpp_builtins (cpp_reader *);
 extern void c_cpp_builtins_optimize_pragma (cpp_reader *, tree, tree);
-extern bool c_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+extern bool c_cpp_error (cpp_reader *, int, int, rich_location *,
 			 const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 extern int c_common_has_attribute (cpp_reader *);
 
 extern bool parse_optimize_options (tree, bool);
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index ce8406a..732080a 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5297,9 +5297,10 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
                        flag_isoc99 ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   report_diagnostic (&diagnostic);
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index e5fbf05..0f8b933 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -42,13 +42,14 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
      than -pedantic.  */
   if (warn_c99_c11_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc11)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc99_c11_compat;
@@ -60,7 +61,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C11, issue a pedwarn.  */
   else if (pedantic && !flag_isoc11)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       warned = report_diagnostic (&diagnostic);
     }
@@ -80,6 +81,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
@@ -90,7 +92,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
         goto out;
       else if (opt_var > 0)
 	{
-	  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+	  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			       (pedantic && !flag_isoc99)
 			       ? DK_PEDWARN : DK_WARNING);
 	  diagnostic.option_index = opt;
@@ -102,7 +104,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
      specific than -pedantic.  */
   if (warn_c90_c99_compat > 0)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   (pedantic && !flag_isoc99)
 			   ? DK_PEDWARN : DK_WARNING);
       diagnostic.option_index = OPT_Wc90_c99_compat;
@@ -114,7 +116,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
   /* For -pedantic outside C99, issue a pedwarn.  */
   else if (pedantic && !flag_isoc99)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_PEDWARN);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_PEDWARN);
       diagnostic.option_index = opt;
       report_diagnostic (&diagnostic);
     }
diff --git a/gcc/c/c-objc-common.c b/gcc/c/c-objc-common.c
index 47fd7de..1e601f9 100644
--- a/gcc/c/c-objc-common.c
+++ b/gcc/c/c-objc-common.c
@@ -101,7 +101,7 @@ c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     {
       t = va_arg (*text->args_ptr, tree);
       if (set_locus)
-	text->set_location (0, DECL_SOURCE_LOCATION (t));
+	text->set_location (0, DECL_SOURCE_LOCATION (t), true);
     }
 
   switch (*spec)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 17870b5..2e2ff10 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3562,7 +3562,7 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 
   pp_string (pp, result);
   if (set_locus && t != NULL)
-    text->set_location (0, location_of (t));
+    text->set_location (0, location_of (t), true);
   return true;
 #undef next_tree
 #undef next_tcode
@@ -3676,9 +3676,10 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 		       (cxx_dialect == cxx98) ? DK_PEDWARN : DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/diagnostic-color.c b/gcc/diagnostic-color.c
index 3fe49b2..d848dfc 100644
--- a/gcc/diagnostic-color.c
+++ b/gcc/diagnostic-color.c
@@ -164,7 +164,8 @@ static struct color_cap color_dict[] =
   { "warning", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_MAGENTA),
 	       7, false },
   { "note", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_CYAN), 4, false },
-  { "caret", SGR_SEQ (COLOR_BOLD COLOR_SEPARATOR COLOR_FG_GREEN), 5, false },
+  { "range1", SGR_SEQ (COLOR_FG_GREEN), 6, false },
+  { "range2", SGR_SEQ (COLOR_FG_BLUE), 6, false },
   { "locus", SGR_SEQ (COLOR_BOLD), 5, false },
   { "quote", SGR_SEQ (COLOR_BOLD), 5, false },
   { NULL, NULL, 0, false }
@@ -195,7 +196,7 @@ colorize_stop (bool show_color)
 }
 
 /* Parse GCC_COLORS.  The default would look like:
-   GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'
+   GCC_COLORS='error=01;31:warning=01;35:note=01;36:range1=32:range2=34:locus=01:quote=01'
    No character escaping is needed or supported.  */
 static bool
 parse_gcc_colors (void)
diff --git a/gcc/diagnostic-core.h b/gcc/diagnostic-core.h
index 66d2e42..a8a7c37 100644
--- a/gcc/diagnostic-core.h
+++ b/gcc/diagnostic-core.h
@@ -63,18 +63,26 @@ extern bool warning_n (location_t, int, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(4,6) ATTRIBUTE_GCC_DIAG(5,6);
 extern bool warning_at (location_t, int, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
+extern bool warning_at_rich_loc (rich_location *, int, const char *, ...)
+    ATTRIBUTE_GCC_DIAG(3,4);
 extern void error (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void error_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void error_at (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void error_at_rich_loc (rich_location *, const char *, ...)
+  ATTRIBUTE_GCC_DIAG(2,3);
 extern void fatal_error (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3)
      ATTRIBUTE_NORETURN;
 /* Pass one of the OPT_W* from options.h as the second parameter.  */
 extern bool pedwarn (location_t, int, const char *, ...)
      ATTRIBUTE_GCC_DIAG(3,4);
 extern bool permerror (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern bool permerror_at_rich_loc (rich_location *, const char *,
+				   ...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void sorry (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
 extern void inform (location_t, const char *, ...) ATTRIBUTE_GCC_DIAG(2,3);
+extern void inform_at_rich_loc (rich_location *, const char *,
+				...) ATTRIBUTE_GCC_DIAG(2,3);
 extern void inform_n (location_t, int, const char *, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,5) ATTRIBUTE_GCC_DIAG(4,5);
 extern void verbatim (const char *, ...) ATTRIBUTE_GCC_DIAG(1,2);
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 831859a..05f1d31 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -144,7 +144,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
     context->classify_diagnostic[i] = DK_UNSPECIFIED;
   context->show_caret = false;
   diagnostic_set_caret_max_width (context, pp_line_cutoff (context->printer));
-  for (i = 0; i < MAX_LOCATIONS_PER_MESSAGE; i++)
+  for (i = 0; i < rich_location::MAX_RANGES; i++)
     context->caret_chars[i] = '^';
   context->show_option_requested = false;
   context->abort_on_error = false;
@@ -234,16 +234,15 @@ diagnostic_finish (diagnostic_context *context)
    translated.  */
 void
 diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
-				va_list *args, location_t location,
+				va_list *args, rich_location *richloc,
 				diagnostic_t kind)
 {
+  gcc_assert (richloc);
   diagnostic->message.err_no = errno;
   diagnostic->message.args_ptr = args;
   diagnostic->message.format_spec = msg;
-  diagnostic->message.set_location (0, location);
-  for (int i = 1; i < MAX_LOCATIONS_PER_MESSAGE; i++)
-    diagnostic->message.set_location (i, UNKNOWN_LOCATION);
-  diagnostic->override_column = 0;
+  diagnostic->message.m_richloc = richloc;
+  diagnostic->richloc = richloc;
   diagnostic->kind = kind;
   diagnostic->option_index = 0;
 }
@@ -252,10 +251,27 @@ diagnostic_set_info_translated (diagnostic_info *diagnostic, const char *msg,
    translated.  */
 void
 diagnostic_set_info (diagnostic_info *diagnostic, const char *gmsgid,
-		     va_list *args, location_t location,
+		     va_list *args, rich_location *richloc,
 		     diagnostic_t kind)
 {
-  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, location, kind);
+  gcc_assert (richloc);
+  diagnostic_set_info_translated (diagnostic, _(gmsgid), args, richloc, kind);
+}
+
+static const char *const diagnostic_kind_color[] = {
+#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
+#include "diagnostic.def"
+#undef DEFINE_DIAGNOSTIC_KIND
+  NULL
+};
+
+/* Get a color name for diagnostics of type KIND.
+   The result could be NULL.  */
+
+const char *
+diagnostic_get_color_for_kind (diagnostic_t kind)
+{
+  return diagnostic_kind_color[kind];
 }
 
 /* Return a malloc'd string describing a location.  The caller is
@@ -270,12 +286,6 @@ diagnostic_build_prefix (diagnostic_context *context,
 #undef DEFINE_DIAGNOSTIC_KIND
     "must-not-happen"
   };
-  static const char *const diagnostic_kind_color[] = {
-#define DEFINE_DIAGNOSTIC_KIND(K, T, C) (C),
-#include "diagnostic.def"
-#undef DEFINE_DIAGNOSTIC_KIND
-    NULL
-  };
   gcc_assert (diagnostic->kind < DK_LAST_DIAGNOSTIC_KIND);
 
   const char *text = _(diagnostic_kind_text[diagnostic->kind]);
@@ -771,10 +781,14 @@ diagnostic_report_diagnostic (diagnostic_context *context,
 
       if (option_text)
 	{
+	  const char *cs
+	    = colorize_start (pp_show_color (context->printer),
+			      diagnostic_kind_color[diagnostic->kind]);
+	  const char *ce = colorize_stop (pp_show_color (context->printer));
 	  diagnostic->message.format_spec
 	    = ACONCAT ((diagnostic->message.format_spec,
 			" ", 
-			"[", option_text, "]",
+			"[", cs, option_text, ce, "]",
 			NULL));
 	  free (option_text);
 	}
@@ -854,9 +868,40 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  if (context->inhibit_notes_p)
+    {
+      va_end (ap);
+      return;
+    }
+  saved_prefix = pp_get_prefix (context->printer);
+  pp_set_prefix (context->printer,
+                 diagnostic_build_prefix (context, &diagnostic));
+  pp_newline (context->printer);
+  pp_format (context->printer, &diagnostic.message);
+  pp_output_formatted_text (context->printer);
+  pp_destroy_prefix (context->printer);
+  pp_set_prefix (context->printer, saved_prefix);
+  diagnostic_show_locus (context, &diagnostic);
+  va_end (ap);
+}
+
+/* Same as diagnostic_append_note, but at RICHLOC.  */
+
+void
+diagnostic_append_note_at_rich_loc (diagnostic_context *context,
+				    rich_location *richloc,
+				    const char * gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  const char *saved_prefix;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   if (context->inhibit_notes_p)
     {
       va_end (ap);
@@ -881,16 +926,17 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
     {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
 			   permissive_error_kind (global_dc));
       diagnostic.option_index = permissive_error_option (global_dc);
     }
   else {
-      diagnostic_set_info (&diagnostic, gmsgid, &ap, location, kind);
+      diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, kind);
       if (kind == DK_WARNING || kind == DK_PEDWARN)
 	diagnostic.option_index = opt;
   }
@@ -907,9 +953,23 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as "inform", but at RICHLOC.  */
+void
+inform_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_NOTE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -922,11 +982,12 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_NOTE);
+                                  &ap, &richloc, DK_NOTE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -940,9 +1001,10 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
   diagnostic.option_index = opt;
 
   ret = report_diagnostic (&diagnostic);
@@ -960,9 +1022,27 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
+  diagnostic.option_index = opt;
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as warning_at, but using RICHLOC.  */
+
+bool
+warning_at_rich_loc (rich_location *richloc, int opt, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -980,11 +1060,13 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_WARNING);
+                                  &ap, &richloc, DK_WARNING
+);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1010,9 +1092,10 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,  DK_PEDWARN);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (ap);
@@ -1032,9 +1115,28 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
+  rich_location richloc (location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, location,
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
+                       permissive_error_kind (global_dc));
+  diagnostic.option_index = permissive_error_option (global_dc);
+  ret = report_diagnostic (&diagnostic);
+  va_end (ap);
+  return ret;
+}
+
+/* Same as "permerror", but at RICHLOC.  */
+
+bool
+permerror_at_rich_loc (rich_location *richloc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+  bool ret;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, richloc,
                        permissive_error_kind (global_dc));
   diagnostic.option_index = permissive_error_option (global_dc);
   ret = report_diagnostic (&diagnostic);
@@ -1049,9 +1151,10 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1064,11 +1167,12 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
                                   ngettext (singular_gmsgid, plural_gmsgid, n),
-                                  &ap, location, DK_ERROR);
+                                  &ap, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1079,9 +1183,25 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
+  report_diagnostic (&diagnostic);
+  va_end (ap);
+}
+
+/* Same as above, but use RICH_LOC.  */
+
+void
+error_at_rich_loc (rich_location *rich_loc, const char *gmsgid, ...)
+{
+  diagnostic_info diagnostic;
+  va_list ap;
+
+  va_start (ap, gmsgid);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, rich_loc,
+		       DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1094,9 +1214,10 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_SORRY);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
   report_diagnostic (&diagnostic);
   va_end (ap);
 }
@@ -1117,9 +1238,10 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (loc);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, loc, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1135,9 +1257,10 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1152,9 +1275,10 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
+  rich_location richloc (input_location);
 
   va_start (ap, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &ap, input_location, DK_ICE_NOBT);
+  diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
   report_diagnostic (&diagnostic);
   va_end (ap);
 
@@ -1218,3 +1342,17 @@ real_abort (void)
 {
   abort ();
 }
+
+/* Display the given source_range instance, with MSG as a descriptive
+   comment.  This issues a "note" diagnostic at the range.
+
+   This is declared within libcpp, but implemented here, since it
+   makes use of the diagnostic-printing machinery.  */
+
+DEBUG_FUNCTION void
+source_range::debug (const char *msg) const
+{
+  rich_location richloc (m_start);
+  richloc.add_range (m_start, m_finish, false);
+  inform_at_rich_loc (&richloc, "%s", msg);
+}
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 7fcb6a8..9096e16 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -29,10 +29,12 @@ along with GCC; see the file COPYING3.  If not see
    list in diagnostic.def.  */
 struct diagnostic_info
 {
-  /* Text to be formatted. It also contains the location(s) for this
-     diagnostic.  */
+  /* Text to be formatted.  */
   text_info message;
-  unsigned int override_column;
+
+  /* The location at which the diagnostic is to be reported.  */
+  rich_location *richloc;
+
   /* Auxiliary data for client.  */
   void *x_data;
   /* The kind of diagnostic it is about.  */
@@ -102,8 +104,8 @@ struct diagnostic_context
   /* Maximum width of the source line printed.  */
   int caret_max_width;
 
-  /* Characters used for caret diagnostics.  */
-  char caret_chars[MAX_LOCATIONS_PER_MESSAGE];
+  /* Character used for caret diagnostics.  */
+  char caret_chars[rich_location::MAX_RANGES];
 
   /* True if we should print the command line option which controls
      each diagnostic, if known.  */
@@ -181,6 +183,15 @@ struct diagnostic_context
   int lock;
 
   bool inhibit_notes_p;
+
+  /* When printing source code, should the characters at carets and ranges
+     be colorized? (assuming colorization is on at all).
+     This should be true for frontends that generate range information
+     (so that the ranges of code are colorized),
+     and false for frontends that merely specify points within the
+     source code (to avoid e.g. colorizing just the first character in
+     a token, which would look strange).  */
+  bool colorize_source_p;
 };
 
 static inline void
@@ -252,10 +263,6 @@ extern diagnostic_context *global_dc;
 
 #define report_diagnostic(D) diagnostic_report_diagnostic (global_dc, D)
 
-/* Override the column number to be used for reporting a
-   diagnostic.  */
-#define diagnostic_override_column(DI, COL) (DI)->override_column = (COL)
-
 /* Override the option index to be used for reporting a
    diagnostic.  */
 #define diagnostic_override_option_index(DI, OPTIDX) \
@@ -279,13 +286,17 @@ extern bool diagnostic_report_diagnostic (diagnostic_context *,
 					  diagnostic_info *);
 #ifdef ATTRIBUTE_GCC_DIAG
 extern void diagnostic_set_info (diagnostic_info *, const char *, va_list *,
-				 location_t, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
+				 rich_location *, diagnostic_t) ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_set_info_translated (diagnostic_info *, const char *,
-					    va_list *, location_t,
+					    va_list *, rich_location *,
 					    diagnostic_t)
      ATTRIBUTE_GCC_DIAG(2,0);
 extern void diagnostic_append_note (diagnostic_context *, location_t,
                                     const char *, ...) ATTRIBUTE_GCC_DIAG(3,4);
+extern void diagnostic_append_note_at_rich_loc (diagnostic_context *,
+						rich_location *,
+						const char *, ...)
+  ATTRIBUTE_GCC_DIAG(3,4);
 #endif
 extern char *diagnostic_build_prefix (diagnostic_context *, const diagnostic_info *);
 void default_diagnostic_starter (diagnostic_context *, diagnostic_info *);
@@ -306,6 +317,14 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
   return diagnostic->message.get_location (which);
 }
 
+/* Return the number of locations to be printed in DIAGNOSTIC.  */
+
+static inline unsigned int
+diagnostic_num_locations (const diagnostic_info * diagnostic)
+{
+  return diagnostic->message.m_richloc->get_num_locations ();
+}
+
 /* Expand the location of this diagnostic. Use this function for
    consistency.  Parameter WHICH specifies which location. By default,
    expand the first one.  */
@@ -313,12 +332,7 @@ diagnostic_location (const diagnostic_info * diagnostic, int which = 0)
 static inline expanded_location
 diagnostic_expand_location (const diagnostic_info * diagnostic, int which = 0)
 {
-  expanded_location s
-    = expand_location_to_spelling_point (diagnostic_location (diagnostic,
-							      which));
-  if (which == 0 && diagnostic->override_column)
-    s.column = diagnostic->override_column;
-  return s;
+  return diagnostic->richloc->get_range (which)->m_caret;
 }
 
 /* This is somehow the right-side margin of a caret line, that is, we
@@ -338,11 +352,7 @@ diagnostic_same_line (const diagnostic_context *context,
     && context->caret_max_width - CARET_LINE_MARGIN > abs (s1.column - s2.column);
 }
 
-void
-diagnostic_print_caret_line (diagnostic_context * context,
-			     expanded_location xloc1,
-			     expanded_location xloc2,
-			     char caret1, char caret2);
+extern const char *diagnostic_get_color_for_kind (diagnostic_t kind);
 
 /* Pure text formatting support functions.  */
 extern char *file_name_as_prefix (diagnostic_context *, const char *);
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index daffc20..92dc584 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -149,9 +149,9 @@ static void cb_include (cpp_reader *, source_location, const unsigned char *,
 static void cb_ident (cpp_reader *, source_location, const cpp_string *);
 static void cb_used_define (cpp_reader *, source_location, cpp_hashnode *);
 static void cb_used_undef (cpp_reader *, source_location, cpp_hashnode *);
-static bool cb_cpp_error (cpp_reader *, int, int, location_t, unsigned int,
+static bool cb_cpp_error (cpp_reader *, int, int, rich_location *,
 			  const char *, va_list *)
-     ATTRIBUTE_GCC_DIAG(6,0);
+     ATTRIBUTE_GCC_DIAG(5,0);
 void pp_dir_change (cpp_reader *, const char *);
 
 static int dump_macro (cpp_reader *, cpp_hashnode *, void *);
@@ -1026,13 +1026,12 @@ cb_used_define (cpp_reader *pfile, source_location line ATTRIBUTE_UNUSED,
 /* Callback from cpp_error for PFILE to print diagnostics from the
    preprocessor.  The diagnostic is of type LEVEL, with REASON set
    to the reason code if LEVEL is represents a warning, at location
-   LOCATION, with column number possibly overridden by COLUMN_OVERRIDE
-   if not zero; MSG is the translated message and AP the arguments.
+   RICHLOC; MSG is the translated message and AP the arguments.
    Returns true if a diagnostic was emitted, false otherwise.  */
 
 static bool
 cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
-	      location_t location, unsigned int column_override,
+	      rich_location *richloc,
 	      const char *msg, va_list *ap)
 {
   diagnostic_info diagnostic;
@@ -1067,9 +1066,7 @@ cb_cpp_error (cpp_reader *pfile ATTRIBUTE_UNUSED, int level, int reason,
       gcc_unreachable ();
     }
   diagnostic_set_info_translated (&diagnostic, msg, ap,
-				  location, dlevel);
-  if (column_override)
-    diagnostic_override_column (&diagnostic, column_override);
+				  richloc, dlevel);
   if (reason == CPP_W_WARNING_DIRECTIVE)
     diagnostic_override_option_index (&diagnostic, OPT_Wcpp);
   ret = report_diagnostic (&diagnostic);
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 3825751..4b3d31c 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,6 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -787,7 +788,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
       --werrorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   bool ret = report_diagnostic (&diagnostic);
@@ -938,10 +939,12 @@ gfc_format_decoder (pretty_printer *pp,
 	/* If location[0] != UNKNOWN_LOCATION means that we already
 	   processed one of %C/%L.  */
 	int loc_num = text->get_location (0) == UNKNOWN_LOCATION ? 0 : 1;
-	text->set_location (loc_num,
-			    linemap_position_for_loc_and_offset (line_table,
-								 loc->lb->location,
-								 offset));
+	source_range range
+	  = source_range::from_location (
+	      linemap_position_for_loc_and_offset (line_table,
+						   loc->lb->location,
+						   offset));
+	text->set_range (loc_num, range, true);
 	pp_string (pp, result[loc_num]);
 	return true;
       }
@@ -1024,48 +1027,21 @@ gfc_diagnostic_build_locus_prefix (diagnostic_context *context,
 }
 
 /* This function prints the locus (file:line:column), the diagnostic kind
-   (Error, Warning) and (optionally) the caret line (a source line
-   with '1' and/or '2' below it).
+   (Error, Warning) and (optionally) the relevant lines of code with
+   annotation lines with '1' and/or '2' below them.
 
-   With -fdiagnostic-show-caret (the default) and for valid locations,
-   it prints for one location:
+   With -fdiagnostics-show-caret (the default) it prints:
 
-       [locus]:
+       [locus of primary range]:
        
           some code
                  1
        Error: Some error at (1)
         
-   for two locations that fit in the same locus line:
+   With -fno-diagnostics-show-caret or if the primary range is not
+   valid, it prints:
 
-       [locus]:
-       
-         some code and some more code
-                1       2
-       Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [locus]:
-       
-         some code
-                1
-       [locus2]:
-       
-         some other code
-           2
-       Error: Some error at (1) and (2)
-       
-  With -fno-diagnostic-show-caret or if one of the locations is not
-  valid, it prints for one location (or for two locations that fit in
-  the same locus line):
-
-       [locus]: Error: Some error at (1) and (2)
-
-   and for two locations that do not fit in the same locus line:
-
-       [name]:[locus]: Error: (1)
-       [name]:[locus2]: Error: Some error at (1) and (2)
+       [locus of primary range]: Error: Some error at (1) and (2)
 */
 static void 
 gfc_diagnostic_starter (diagnostic_context *context,
@@ -1075,7 +1051,7 @@ gfc_diagnostic_starter (diagnostic_context *context,
 
   expanded_location s1 = diagnostic_expand_location (diagnostic);
   expanded_location s2;
-  bool one_locus = diagnostic_location (diagnostic, 1) == UNKNOWN_LOCATION;
+  bool one_locus = diagnostic->richloc->get_num_locations () < 2;
   bool same_locus = false;
 
   if (!one_locus) 
@@ -1125,35 +1101,6 @@ gfc_diagnostic_starter (diagnostic_context *context,
       /* If the caret line was shown, the prefix does not contain the
 	 locus.  */
       pp_set_prefix (context->printer, kind_prefix);
-
-      if (one_locus || same_locus)
-	  return;
-
-      locus_prefix = gfc_diagnostic_build_locus_prefix (context, s2);
-      if (diagnostic_location (diagnostic, 1) <= BUILTINS_LOCATION)
-	{
-	  /* No caret line for the second location. Override the previous
-	     prefix with [locus2]:[prefix].  */
-	  pp_set_prefix (context->printer,
-			 concat (locus_prefix, " ", kind_prefix, NULL));
-	  free (kind_prefix);
-	  free (locus_prefix);
-	}
-      else
-	{
-	  /* We print the caret for the second location.  */
-	  pp_verbatim (context->printer, locus_prefix);
-	  free (locus_prefix);
-	  /* Fortran uses an empty line between locus and caret line.  */
-	  pp_newline (context->printer);
-	  s1.column = 0; /* Print only a caret line for s2.  */
-	  diagnostic_print_caret_line (context, s2, s1,
-				       context->caret_chars[1], '\0');
-	  pp_newline (context->printer);
-	  /* If the caret line was shown, the prefix does not contain the
-	     locus.  */
-	  pp_set_prefix (context->printer, kind_prefix);
-	}
     }
 }
 
@@ -1173,10 +1120,11 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (loc);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, loc, DK_WARNING);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
   va_end (argp);
@@ -1190,10 +1138,11 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION,
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc,
 		       DK_WARNING);
   diagnostic.option_index = opt;
   ret = report_diagnostic (&diagnostic);
@@ -1209,11 +1158,12 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ERROR);
   report_diagnostic (&diagnostic);
   va_end (argp);
 }
@@ -1226,9 +1176,10 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_FATAL);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
@@ -1291,6 +1242,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
+  rich_location richloc (UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1306,7 +1258,7 @@ gfc_error (const char *gmsgid, va_list ap)
       --errorcount;
     }
 
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ERROR);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &richloc, DK_ERROR);
   report_diagnostic (&diagnostic);
 
   if (buffered_p)
@@ -1336,9 +1288,10 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
+  rich_location rich_loc (UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
-  diagnostic_set_info (&diagnostic, gmsgid, &argp, UNKNOWN_LOCATION, DK_ICE);
+  diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
   report_diagnostic (&diagnostic);
   va_end (argp);
 
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 102a635..8a03f35 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -53,14 +53,31 @@ unsigned verbose;
 
 static struct line_maps *line_table;
 
+/* The rich_location class within libcpp requires a way to expand
+   source_location instances, and relies on the client code
+   providing a symbol named
+     linemap_client_expand_location_to_spelling_point
+   to do this.
+
+   This is the implementation for genmatch.  */
+
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  const struct line_map_ordinary *map;
+  loc = linemap_resolve_location (line_table, loc, LRK_SPELLING_LOCATION, &map);
+  return linemap_expand_location (line_table, map, loc);
+}
+
 static bool
 #if GCC_VERSION >= 4001
-__attribute__((format (printf, 6, 0)))
+__attribute__((format (printf, 5, 0)))
 #endif
-error_cb (cpp_reader *, int errtype, int, source_location location,
-	  unsigned int, const char *msg, va_list *ap)
+error_cb (cpp_reader *, int errtype, int, rich_location *richloc,
+	  const char *msg, va_list *ap)
 {
   const line_map_ordinary *map;
+  source_location location = richloc->get_loc ();
   linemap_resolve_location (line_table, location, LRK_SPELLING_LOCATION, &map);
   expanded_location loc = linemap_expand_location (line_table, map, location);
   fprintf (stderr, "%s:%d:%d %s: ", loc.file, loc.line, loc.column,
@@ -102,9 +119,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -114,9 +132,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_FATAL, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -126,9 +145,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
+  rich_location richloc (tk->src_loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, tk->src_loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
@@ -138,9 +158,10 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
+  rich_location richloc (loc);
   va_list ap;
   va_start (ap, msg);
-  error_cb (NULL, CPP_DL_WARNING, 0, loc, 0, msg, &ap);
+  error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
   va_end (ap);
 }
 
diff --git a/gcc/input.c b/gcc/input.c
index ff80dd9..0f6d448 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -751,6 +751,22 @@ expand_location_to_spelling_point (source_location loc)
   return expand_location_1 (loc, /*expansion_point_p=*/false);
 }
 
+/* The rich_location class within libcpp requires a way to expand
+   source_location instances, and relies on the client code
+   providing a symbol named
+     linemap_client_expand_location_to_spelling_point
+   to do this.
+
+   This is the implementation for libcommon.a (all host binaries),
+   which simply calls into expand_location_to_spelling_point.  */
+
+expanded_location
+linemap_client_expand_location_to_spelling_point (source_location loc)
+{
+  return expand_location_to_spelling_point (loc);
+}
+
+
 /* If LOCATION is in a system header and if it is a virtual location for
    a token coming from the expansion of a macro, unwind it to the
    location of the expansion point of the macro.  Otherwise, just return
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 5889015..aee4172 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -31,6 +31,27 @@ along with GCC; see the file COPYING3.  If not see
 #include <iconv.h>
 #endif
 
+/* Overwrite the range within this text_info's rich_location.
+   For use e.g. when implementing "+" in client format decoders.  */
+
+void
+text_info::set_range (unsigned int idx, source_range range, bool caret_p)
+{
+  gcc_checking_assert (m_richloc);
+  m_richloc->set_range (idx, range, caret_p, true);
+}
+
+location_t
+text_info::get_location (unsigned int index_of_location) const
+{
+  gcc_checking_assert (m_richloc);
+
+  if (index_of_location == 0)
+    return m_richloc->get_loc ();
+  else
+    return UNKNOWN_LOCATION;
+}
+
 // Default construct an output buffer.
 
 output_buffer::output_buffer ()
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 2654b0f..cdee253 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -27,11 +27,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Maximum number of format string arguments.  */
 #define PP_NL_ARGMAX   30
 
-/* Maximum number of locations associated to each message.  If
-   location 'i' is UNKNOWN_LOCATION, then location 'i+1' is not
-   valid.  */
-#define MAX_LOCATIONS_PER_MESSAGE 2
-
 /* The type of a text to be formatted according a format specification
    along with a list of things.  */
 struct text_info
@@ -40,21 +35,17 @@ struct text_info
   va_list *args_ptr;
   int err_no;  /* for %m */
   void **x_data;
+  rich_location *m_richloc;
 
-  inline void set_location (unsigned int index_of_location, location_t loc)
+  inline void set_location (unsigned int idx, location_t loc, bool caret_p)
   {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    this->locations[index_of_location] = loc;
+    source_range src_range;
+    src_range.m_start = loc;
+    src_range.m_finish = loc;
+    set_range (idx, src_range, caret_p);
   }
-
-  inline location_t get_location (unsigned int index_of_location) const
-  {
-    gcc_checking_assert (index_of_location < MAX_LOCATIONS_PER_MESSAGE);
-    return this->locations[index_of_location];
-  }
-
-private:
-  location_t locations[MAX_LOCATIONS_PER_MESSAGE];
+  void set_range (unsigned int idx, source_range range, bool caret_p);
+  location_t get_location (unsigned int index_of_location) const;
 };
 
 /* How often diagnostics are prefixed by their locations:
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 8b9b391..d28be1d 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -69,9 +69,10 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
+  rich_location richloc (location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
-		       location_for_asm (insn), kind);
+		       &richloc, kind);
   report_diagnostic (&diagnostic);
 }
 
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
new file mode 100644
index 0000000..a4b16da
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-bw.c
@@ -0,0 +1,149 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = myvar.x;
+           ~~~~~^~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = first_function () + second_function ();
+       ~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function ()
+        ~~~~~~~~~~~~~~~~~
+        + second_function ());
+        ^ ~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             consectetur, adipiscing, elit,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             sed, eiusmod, tempor,
+                                             ~~~~~~~~~~~~~~~~~~~~~
+                                             incididunt, ut, labore, et,
+                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                             dolore, magna, aliqua)
+                                             ~~~~~~~~~~~~~~~~~~~~~~
+        + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+        ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 amet, consectetur,
+                                                 ~~~~~~~~~~~~~~~~~~
+                                                 adipiscing, elit, sed,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~
+                                                 eiusmod, tempor, incididunt,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+                                                 ut, labore, et, dolore,
+                                                 ~~~~~~~~~~~~~~~~~~~~~~~
+                                                 magna, aliqua));
+                                                 ~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = 98.6f;
+             ^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = foo * bar;
+             ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = foo * bar;
+                                                        ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = x + y
+        A   B
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y => x)
+                    2
+       y = 5
+      1
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
new file mode 100644
index 0000000..47639b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
@@ -0,0 +1,158 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret -fplugin-arg-diagnostic_plugin_test_show_locus-color" } */
+
+/* This is a collection of unittests for diagnostic_show_locus;
+   see the overview in diagnostic_plugin_test_show_locus.c.
+
+   In particular, note the discussion of why we need a very long line here:
+01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
+   and that we can't use macros in this file.  */
+
+void test_simple (void)
+{
+#if 0
+  myvar = myvar.x; /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   myvar = ^[[32m^[[Kmyvar^[[m^[[K^[[01;35m^[[K.^[[m^[[K^[[34m^[[Kx^[[m^[[K;
+           ^[[32m^[[K~~~~~^[[m^[[K^[[01;35m^[[K^^[[m^[[K^[[34m^[[K~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_simple_2 (void)
+{
+#if 0
+  x = first_function () + second_function ();  /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = ^[[32m^[[Kfirst_function ()^[[m^[[K ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K;
+       ^[[32m^[[K~~~~~~~~~~~~~~~~~^[[m^[[K ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+
+void test_multiline (void)
+{
+#if 0
+  x = (first_function ()
+       + second_function ()); /* { dg-warning "test" } */
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function ()
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function ()^[[m^[[K);
+        ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_many_lines (void)
+{
+#if 0
+  x = (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+                                            consectetur, adipiscing, elit,
+                                            sed, eiusmod, tempor,
+                                            incididunt, ut, labore, et,
+                                            dolore, magna, aliqua)
+       + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, /* { dg-warning "test" } */
+                                                amet, consectetur,
+                                                adipiscing, elit, sed,
+                                                eiusmod, tempor, incididunt,
+                                                ut, labore, et, dolore,
+                                                magna, aliqua));
+
+/* { dg-begin-multiline-output "" }
+   x = (^[[32m^[[Kfirst_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,
+ ^[[m^[[K       ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            consectetur, adipiscing, elit,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            sed, eiusmod, tempor,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            incididunt, ut, labore, et,
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[32m^[[K                                            dolore, magna, aliqua)
+ ^[[m^[[K                                            ^[[32m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K        ^[[01;35m^[[K+^[[m^[[K ^[[34m^[[Ksecond_function_with_a_very_long_name (lorem, ipsum, dolor, sit,
+ ^[[m^[[K       ^[[01;35m^[[K^^[[m^[[K ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                amet, consectetur,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                adipiscing, elit, sed,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                eiusmod, tempor, incididunt,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                ut, labore, et, dolore,
+ ^[[m^[[K                                                ^[[34m^[[K~~~~~~~~~~~~~~~~~~~~~~~
+^[[m^[[K ^[[34m^[[K                                                magna, aliqua)^[[m^[[K);
+                                                 ^[[34m^[[K~~~~~~~~~~~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_richloc_from_proper_range (void)
+{
+#if 0
+  float f = 98.6f; /* { dg-warning "test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[K98.6f^[[m^[[K;
+             ^[[01;35m^[[K^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_within_proper_range (void)
+{
+#if 0
+  float f = foo * bar; /* { dg-warning "17: test" } */
+/* { dg-begin-multiline-output "" }
+   float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+             ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_very_wide_line (void)
+{
+#if 0
+                                                                                float f = foo * bar; /* { dg-warning "95: test" } */
+/* { dg-begin-multiline-output "" }
+                                              float f = ^[[01;35m^[[Kfoo * bar^[[m^[[K;
+                                                        ^[[01;35m^[[K~~~~^~~~~
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_multiple_carets (void)
+{
+#if 0
+   x = x + y /* { dg-warning "8: test" } */
+/* { dg-begin-multiline-output "" }
+    x = ^[[01;35m^[[Kx^[[m^[[K + ^[[32m^[[Ky^[[m^[[K
+        ^[[01;35m^[[KA^[[m^[[K   ^[[32m^[[KB
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
+
+void test_caret_on_leading_whitespace (void)
+{
+#if 0
+    ASSOCIATE (y => x)
+      y = 5 /* { dg-warning "6: test" } */
+/* { dg-begin-multiline-output "" }
+     ASSOCIATE (y =>^[[32m^[[K ^[[m^[[Kx)
+                    ^[[32m^[[K2
+^[[m^[[K      ^[[01;35m^[[K ^[[m^[[Ky = 5
+      ^[[01;35m^[[K1
+^[[m^[[K
+   { dg-end-multiline-output "" } */
+#endif
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
new file mode 100644
index 0000000..8f5724e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -0,0 +1,326 @@
+/* { dg-options "-O" } */
+
+/* This plugin exercises the diagnostics-printing code.
+
+   The goal is to unit-test the range-printing code without needing any
+   correct range data within the compiler's IR.  We can't use any real
+   diagnostics for this, so we have to fake it, hence this plugin.
+
+   There are two test files used with this code:
+
+     diagnostic-test-show-locus-bw.c
+     ..........................-color.c
+
+   to exercise uncolored vs colored output by supplying plugin arguments
+   to hack in the desired behavior:
+
+     -fplugin-arg-diagnostic_plugin_test_show_locus-color
+
+   The test files contain functions, but the body of each
+   function is disabled using the preprocessor.  The plugin detects
+   the functions by name, and injects diagnostics within them, using
+   hard-coded locations relative to the top of each function.
+
+   The plugin uses a function "get_loc" below to map from line/column
+   numbers to source_location, and this relies on input_location being in
+   the same ordinary line_map as the locations in question.  The plugin
+   runs after parsing, so input_location will be at the end of the file.
+
+   This need for all of the test code to be in a single ordinary line map
+   means that each test file needs to have a very long line near the top
+   (potentially to cover the extra byte-count of colorized data),
+   to ensure that further very long lines don't start a new linemap.
+   This also means that we can't use macros in the test files.  */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+const pass_data pass_data_test_show_locus =
+{
+  GIMPLE_PASS, /* type */
+  "test_show_locus", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_test_show_locus : public gimple_opt_pass
+{
+public:
+  pass_test_show_locus(gcc::context *ctxt)
+    : gimple_opt_pass(pass_data_test_show_locus, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_test_show_locus
+
+/* Given LINE_NUM and COL_NUM, generate a source_location in the
+   current file, relative to input_location.  This relies on the
+   location being expressible in the same ordinary line_map as
+   input_location (which is typically at the end of the source file
+   when this is called).  Hence the test files we compile with this
+   plugin must have an initial very long line (to avoid long lines
+   starting a new line map), and must not use macros.
+
+   COL_NUM uses the Emacs convention of 0-based column numbers.  */
+
+static source_location
+get_loc (unsigned int line_num, unsigned int col_num)
+{
+  /* Use input_location to get the relevant line_map */
+  const struct line_map_ordinary *line_map
+    = (const line_map_ordinary *)(linemap_lookup (line_table,
+						  input_location));
+
+  /* Convert from 0-based column numbers to 1-based column numbers.  */
+  source_location loc
+    = linemap_position_for_line_and_column (line_map,
+					    line_num, col_num + 1);
+
+  return loc;
+}
+
+/* Was "color" passed in as a plugin argument?  */
+static bool force_show_locus_color = false;
+
+/* We want to verify the colorized output of diagnostic_show_locus,
+   but turning on colorization for everything confuses "dg-warning" etc.
+   Hence we special-case it within this plugin by using this modified
+   version of default_diagnostic_finalizer, which, if "color" is
+   passed in as a plugin argument, turns on colorization, but just
+   for diagnostic_show_locus.  */
+
+static void
+custom_diagnostic_finalizer (diagnostic_context *context,
+			     diagnostic_info *diagnostic)
+{
+  bool old_show_color = pp_show_color (context->printer);
+  if (force_show_locus_color)
+    pp_show_color (context->printer) = true;
+  diagnostic_show_locus (context, diagnostic);
+  pp_show_color (context->printer) = old_show_color;
+
+  pp_destroy_prefix (context->printer);
+  pp_newline_and_flush (context->printer);
+}
+
+/* Exercise the diagnostic machinery to emit various warnings,
+   for use by diagnostic-test-show-locus-*.c.
+
+   We inject each warning relative to the start of a function,
+   which avoids lots of hardcoded absolute locations.  */
+
+static void
+test_show_locus (function *fun)
+{
+  tree fndecl = fun->decl;
+  tree identifier = DECL_NAME (fndecl);
+  const char *fnname = IDENTIFIER_POINTER (identifier);
+  location_t fnstart = fun->function_start_locus;
+  int fnstart_line = LOCATION_LINE (fnstart);
+
+  diagnostic_finalizer (global_dc) = custom_diagnostic_finalizer;
+
+  /* Hardcode the "terminal width", to verify the behavior of
+     very wide lines.  */
+  global_dc->caret_max_width = 70;
+
+  if (0 == strcmp (fnname, "test_simple"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 15));
+      richloc.add_range (get_loc (line, 10), get_loc (line, 14), false);
+      richloc.add_range (get_loc (line, 16), get_loc (line, 16), false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_simple_2"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line, 24));
+      richloc.add_range (get_loc (line, 6),
+			 get_loc (line, 22), false);
+      richloc.add_range (get_loc (line, 26),
+			 get_loc (line, 43), false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_multiline"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 1, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line, 23), false);
+      richloc.add_range (get_loc (line + 1, 9),
+			 get_loc (line + 1, 26), false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  if (0 == strcmp (fnname, "test_many_lines"))
+    {
+      const int line = fnstart_line + 2;
+      rich_location richloc (get_loc (line + 5, 7));
+      richloc.add_range (get_loc (line, 7),
+			 get_loc (line + 4, 65), false);
+      richloc.add_range (get_loc (line + 5, 9),
+			 get_loc (line + 10, 61), false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a rich_location constructed directly from a
+     source_range where the range is larger than one character.  */
+  if (0 == strcmp (fnname, "test_richloc_from_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 16);
+      rich_location richloc (src_range);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a single-range location where the range starts
+     before the caret.  */
+  if (0 == strcmp (fnname, "test_caret_within_proper_range"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 16);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 12);
+      src_range.m_finish = get_loc (line, 20);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of a very wide line, where the information of interest
+     is beyond the width of the terminal (hardcoded above).  */
+  if (0 == strcmp (fnname, "test_very_wide_line"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret = get_loc (line, 94);
+      source_range src_range;
+      src_range.m_start = get_loc (line, 90);
+      src_range.m_finish = get_loc (line, 98);
+      rich_location richloc (caret);
+      richloc.set_range (0, src_range, true, false);
+      warning_at_rich_loc (&richloc, 0, "test");
+    }
+
+  /* Example of multiple carets.  */
+  if (0 == strcmp (fnname, "test_multiple_carets"))
+    {
+      const int line = fnstart_line + 2;
+      location_t caret_a = get_loc (line, 7);
+      location_t caret_b = get_loc (line, 11);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = 'A';
+      global_dc->caret_chars[1] = 'B';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+
+  /* Example of two carets where both carets appear to have an off-by-one
+     error, each appearing one column early.
+     Seen with gfortran.dg/associate_5.f03.
+     In an earlier version of the printer, the printing of caret 0 aka
+     "1" was suppressed due to it appearing within the leading whitespace
+     before the text in its line.  Ensure that we at least faithfully
+     print both carets, at the given (erroneous) locations.  */
+  if (0 == strcmp (fnname, "test_caret_on_leading_whitespace"))
+    {
+      const int line = fnstart_line + 3;
+      location_t caret_a = get_loc (line, 5);
+      location_t caret_b = get_loc (line - 1, 19);
+      rich_location richloc (caret_a);
+      richloc.add_range (caret_b, caret_b, true);
+      global_dc->caret_chars[0] = '1';
+      global_dc->caret_chars[1] = '2';
+      warning_at_rich_loc (&richloc, 0, "test");
+      global_dc->caret_chars[0] = '^';
+      global_dc->caret_chars[1] = '^';
+    }
+}
+
+unsigned int
+pass_test_show_locus::execute (function *fun)
+{
+  test_show_locus (fun);
+  return 0;
+}
+
+static gimple_opt_pass *
+make_pass_test_show_locus (gcc::context *ctxt)
+{
+  return new pass_test_show_locus (ctxt);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  /* For now, tell the dc to expect ranges and thus to colorize the source
+     lines, not just the carets/underlines.  This will be redundant
+     once the C frontend generates ranges.  */
+  global_dc->colorize_source_p = true;
+
+  for (int i = 0; i < argc; i++)
+    {
+      if (0 == strcmp (argv[i].key, "color"))
+	force_show_locus_color = true;
+    }
+
+  pass_info.pass = make_pass_test_show_locus (g);
+  pass_info.reference_pass_name = "ssa";
+  pass_info.ref_pass_instance_number = 1;
+  pass_info.pos_op = PASS_POS_INSERT_AFTER;
+  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+		     &pass_info);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 39fab6e..941bccc 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -63,6 +63,9 @@ set plugin_test_list [list \
     { start_unit_plugin.c start_unit-test-1.c } \
     { finish_unit_plugin.c finish_unit-test-1.c } \
     { wide-int_plugin.c wide-int-test-1.c } \
+    { diagnostic_plugin_test_show_locus.c \
+	  diagnostic-test-show-locus-bw.c \
+	  diagnostic-test-show-locus-color.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 7c1ab85..8cc1d87 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -29,6 +29,7 @@ load_lib libgloss.exp
 load_lib target-libpath.exp
 load_lib torture-options.exp
 load_lib fortran-modules.exp
+load_lib multiline.exp
 
 # We set LC_ALL and LANG to C so that we get the same error messages as expected.
 setenv LC_ALL C
diff --git a/gcc/tree-diagnostic.c b/gcc/tree-diagnostic.c
index 135f142..02009d8 100644
--- a/gcc/tree-diagnostic.c
+++ b/gcc/tree-diagnostic.c
@@ -289,7 +289,7 @@ default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
     }
 
   if (set_locus)
-    text->set_location (0, DECL_SOURCE_LOCATION (t));
+    text->set_location (0, DECL_SOURCE_LOCATION (t), true);
 
   if (DECL_P (t))
     {
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index ce3f6a8..29bc48a 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -3592,7 +3592,7 @@ void
 percent_K_format (text_info *text)
 {
   tree t = va_arg (*text->args_ptr, tree), block;
-  text->set_location (0, EXPR_LOCATION (t));
+  text->set_location (0, EXPR_LOCATION (t), true);
   gcc_assert (pp_ti_abstract_origin (text) != NULL);
   block = TREE_BLOCK (t);
   *pp_ti_abstract_origin (text) = NULL;
diff --git a/libcpp/errors.c b/libcpp/errors.c
index a33196e..c351c11 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,8 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, 0, _(msgid), ap);
+  rich_location richloc (src_loc);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
@@ -139,7 +140,9 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  ret = pfile->cb.error (pfile, level, reason, src_loc, column, _(msgid), ap);
+  rich_location richloc (src_loc);
+  richloc.override_column (column);
+  ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
 }
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5eaea6b..a2bdfa0 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -573,9 +573,9 @@ struct cpp_callbacks
 
   /* Called to emit a diagnostic.  This callback receives the
      translated message.  */
-  bool (*error) (cpp_reader *, int, int, source_location, unsigned int,
+  bool (*error) (cpp_reader *, int, int, rich_location *,
 		 const char *, va_list *)
-       ATTRIBUTE_FPTR_PRINTF(6,0);
+       ATTRIBUTE_FPTR_PRINTF(5,0);
 
   /* Callbacks for when a macro is expanded, or tested (whether
      defined or not at the time) in #ifdef, #ifndef or "defined".  */
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 09378f9..c8f636d 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -131,6 +131,47 @@ typedef unsigned int linenum_type;
   libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* A range of source locations.
+
+   Ranges are closed:
+   m_start is the first location within the range,
+   m_finish is the last location within the range.
+
+   We may need a more compact way to store these, but for now,
+   let's do it the simple way, as a pair.  */
+struct GTY(()) source_range
+{
+  source_location m_start;
+  source_location m_finish;
+
+  /* Display this source_range instance, with MSG as a descriptive
+     comment.  This issues a "note" diagnostic at the range, using
+     gcc's diagnostic machinery.
+
+     This is declared here, but is implemented within gcc/diagnostic.c,
+     since it makes use of gcc's diagnostic-printing machinery.  This
+     is a slight layering violation, but this is sufficiently useful
+     for debugging that it's worth it.
+
+     This declaration would have a DEBUG_FUNCTION annotation, but that
+     is implemented in gcc/system.h and thus is not available here in
+     libcpp.  */
+  void debug (const char *msg) const;
+
+  /* We avoid using constructors, since various structs that
+     don't yet have constructors will embed instances of
+     source_range.  */
+
+  /* Make a source_range from a source_location.  */
+  static source_range from_location (source_location loc)
+  {
+    source_range result;
+    result.m_start = loc;
+    result.m_finish = loc;
+    return result;
+  }
+};
+
 /* Memory allocation function typedef.  Works like xrealloc.  */
 typedef void *(*line_map_realloc) (void *, size_t);
 
@@ -1028,6 +1069,174 @@ typedef struct
   bool sysp;
 } expanded_location;
 
+/* Both GCC and Emacs number source *lines* starting at 1, but
+   they have differing conventions for *columns*.
+
+   GCC uses a 1-based convention for source columns,
+   whereas Emacs's M-x column-number-mode uses a 0-based convention.
+
+   For example, an error in the initial, left-hand
+   column of source line 3 is reported by GCC as:
+
+      some-file.c:3:1: error: ...etc...
+
+   On navigating to the location of that error in Emacs
+   (e.g. via "next-error"),
+   the locus is reported in the Mode Line
+   (assuming M-x column-number-mode) as:
+
+     some-file.c   10%   (3, 0)
+
+   i.e. "3:1:" in GCC corresponds to "(3, 0)" in Emacs.  */
+
+/* Ranges are closed:
+   m_start is the first location within the range, and
+   m_finish is the last location within the range.  */
+struct location_range
+{
+  expanded_location m_start;
+  expanded_location m_finish;
+
+  /* Should a caret be drawn for this range?  Typically this is
+     true for the 0th range, and false for subsequent ranges,
+     but the Fortran frontend overrides this for rendering things like:
+
+       x = x + y
+           1   2
+       Error: Shapes for operands at (1) and (2) are not conformable
+
+     where "1" and "2" are notionally carets.  */
+  bool m_show_caret_p;
+  expanded_location m_caret;
+};
+
+/* A "rich" source code location, for use when printing diagnostics.
+   A rich_location has one or more ranges, each optionally with
+   a caret.   Typically the zeroth range has a caret; other ranges
+   sometimes have carets.
+
+   The "primary" location of a rich_location is the caret of range 0,
+   used for determining the line/column when printing diagnostic
+   text, such as:
+
+      some-file.c:3:1: error: ...etc...
+
+   Additional ranges may be added to help the user identify other
+   pertinent clauses in a diagnostic.
+
+   rich_location instances are intended to be allocated on the stack
+   when generating diagnostics, and to be short-lived.
+
+   Examples of rich locations
+   --------------------------
+
+   Example A
+   *********
+      int i = "foo";
+              ^
+   This "rich" location is simply a single range (range 0), with
+   caret = start = finish at the given point.
+
+   Example B
+   *********
+      a = (foo && bar)
+          ~~~~~^~~~~~~
+   This rich location has a single range (range 0), with the caret
+   at the first "&", and the start/finish at the parentheses.
+   Compare with example C below.
+
+   Example C
+   *********
+      a = (foo && bar)
+           ~~~ ^~ ~~~
+   This rich location has three ranges:
+   - Range 0 has its caret and start location at the first "&" and
+     end at the second "&".
+   - Range 1 has its start and finish at the "f" and "o" of "foo";
+     the caret is not flagged for display, but is perhaps at the "f"
+     of "foo".
+   - Similarly, range 2 has its start and finish at the "b" and "r" of
+     "bar"; the caret is not flagged for display, but is perhaps at the
+     "b" of "bar".
+   Compare with example B above.
+
+   Example D (Fortran frontend)
+   ****************************
+       x = x + y
+           1   2
+   This rich location has range 0 at "1", and range 1 at "2".
+   Both are flagged for caret display.  Both ranges have start/finish
+   equal to their caret point.  The frontend overrides the diagnostic
+   context's default caret character for these ranges.
+
+   Example E
+   *********
+      printf ("arg0: %i  arg1: %s arg2: %i",
+                               ^~
+              100, 101, 102);
+                   ~~~
+   This rich location has two ranges:
+   - range 0 is at the "%s" with start = caret = "%" and finish at
+     the "s".
+   - range 1 has start/finish covering the "101" and is not flagged for
+     caret printing; it is perhaps at the start of "101".  */
+
+class rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  rich_location (source_location loc);
+
+  /* Constructing from a source_range.  */
+  rich_location (source_range src_range);
+
+  /* Accessors.  */
+  source_location get_loc () const { return m_loc; }
+
+  source_location *get_loc_addr () { return &m_loc; }
+
+  void
+  add_range (source_location start, source_location finish,
+	     bool show_caret_p);
+
+  void
+  add_range (source_range src_range, bool show_caret_p);
+
+  void
+  add_range (location_range *src_range);
+
+  void
+  set_range (unsigned int idx, source_range src_range,
+	     bool show_caret_p, bool overwrite_loc_p);
+
+  unsigned int get_num_locations () const { return m_num_ranges; }
+
+  location_range *get_range (unsigned int idx)
+  {
+    linemap_assert (idx < m_num_ranges);
+    return &m_ranges[idx];
+  }
+
+  expanded_location lazily_expand_location ();
+
+  void
+  override_column (int column);
+
+public:
+  static const int MAX_RANGES = 3;
+
+protected:
+  source_location m_loc;
+
+  unsigned int m_num_ranges;
+  location_range m_ranges[MAX_RANGES];
+
+  bool m_have_expanded_location;
+  expanded_location m_expanded_location;
+};
+
 /* This is enum is used by the function linemap_resolve_location
    below.  The meaning of the values is explained in the comment of
    that function.  */
@@ -1173,4 +1382,13 @@ void linemap_dump (FILE *, struct line_maps *, unsigned, bool);
    specifies how many macro maps to dump.  */
 void line_table_dump (FILE *, struct line_maps *, unsigned int, unsigned int);
 
+/* The rich_location class requires a way to expand source_location instances.
+   We would directly use expand_location_to_spelling_point, which is
+   implemented in gcc/input.c, but we also need to use it for rich_location
+   within genmatch.c.
+   Hence we require client code of libcpp to implement the following
+   symbol.  */
+extern expanded_location
+linemap_client_expand_location_to_spelling_point (source_location );
+
 #endif /* !LIBCPP_LINE_MAP_H  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 84403de..3c19f93 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -1755,3 +1755,133 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
       fprintf (stream, "\n");
     }
 }
+
+/* class rich_location.  */
+
+/* Construct a rich_location with location LOC as its initial range.  */
+
+rich_location::rich_location (source_location loc) :
+  m_loc (loc),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (loc, loc, true);
+  m_ranges[0].m_caret = lazily_expand_location ();
+}
+
+/* Construct a rich_location with source_range SRC_RANGE as its
+   initial range.  */
+
+rich_location::rich_location (source_range src_range)
+: m_loc (src_range.m_start),
+  m_num_ranges (0),
+  m_have_expanded_location (false)
+{
+  /* Set up the 0th range: */
+  add_range (src_range, true);
+}
+
+/* Get an expanded_location for this rich_location's primary
+   location.  */
+
+expanded_location
+rich_location::lazily_expand_location ()
+{
+  if (!m_have_expanded_location)
+    {
+      m_expanded_location
+	= linemap_client_expand_location_to_spelling_point (m_loc);
+      m_have_expanded_location = true;
+    }
+
+  return m_expanded_location;
+}
+
+/* Set the column of the primary location.  */
+
+void
+rich_location::override_column (int column)
+{
+  lazily_expand_location ();
+  m_expanded_location.column = column;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_location start, source_location finish,
+			  bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  location_range *range = &m_ranges[m_num_ranges++];
+  range->m_start = linemap_client_expand_location_to_spelling_point (start);
+  range->m_finish = linemap_client_expand_location_to_spelling_point (finish);
+  range->m_caret = range->m_start;
+  range->m_show_caret_p = show_caret_p;
+}
+
+/* Add the given range.  */
+
+void
+rich_location::add_range (source_range src_range, bool show_caret_p)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  add_range (src_range.m_start, src_range.m_finish, show_caret_p);
+}
+
+void
+rich_location::add_range (location_range *src_range)
+{
+  linemap_assert (m_num_ranges < MAX_RANGES);
+
+  m_ranges[m_num_ranges++] = *src_range;
+}
+
+/* Add or overwrite the range given by IDX.  It must either
+   overwrite an existing range, or add one *exactly* on the end of
+   the array.
+
+   This is primarily for use by gcc when implementing diagnostic
+   format decoders e.g. the "+" in the C/C++ frontends, for handling
+   format codes like "%q+D" (which writes the source location of a
+   tree back into range 0 of the rich_location).
+
+   If SHOW_CARET_P is true, then the range should be rendered with
+   a caret at its starting location.  This
+   is for use by the Fortran frontend, for implementing the
+   "%C" and "%L" format codes.  */
+
+void
+rich_location::set_range (unsigned int idx, source_range src_range,
+			  bool show_caret_p, bool overwrite_loc_p)
+{
+  linemap_assert (idx < MAX_RANGES);
+
+  /* We can either overwrite an existing range, or add one exactly
+     on the end of the array.  */
+  linemap_assert (idx <= m_num_ranges);
+
+  location_range *locrange = &m_ranges[idx];
+  locrange->m_start
+    = linemap_client_expand_location_to_spelling_point (src_range.m_start);
+  locrange->m_finish
+    = linemap_client_expand_location_to_spelling_point (src_range.m_finish);
+
+  locrange->m_show_caret_p = show_caret_p;
+  if (overwrite_loc_p)
+    locrange->m_caret = locrange->m_start;
+
+  /* Are we adding a range onto the end?  */
+  if (idx == m_num_ranges)
+    m_num_ranges = idx + 1;
+
+  if (idx == 0 && overwrite_loc_p)
+    {
+      m_loc = src_range.m_start;
+      /* Mark any cached value here as dirty.  */
+      m_have_expanded_location = false;
+    }
+}
-- 
1.8.5.3
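
The range bookkeeping in rich_location::set_range can be modelled in
isolation.  The following is a minimal sketch, assuming plain integers
stand in for locations and leaving out the expansion to
expanded_location; mini_rich_location and simple_range are hypothetical
names used for illustration only and are not part of the patch.

```cpp
#include <cassert>

typedef unsigned int source_location;

/* A closed range: m_start and m_finish are both inside the range.  */
struct simple_range
{
  source_location m_start;
  source_location m_finish;
  bool m_show_caret_p;
};

/* Hypothetical, simplified model of rich_location's range array.  */
class mini_rich_location
{
 public:
  static const unsigned int MAX_RANGES = 3;

  explicit mini_rich_location (source_location loc)
  : m_loc (loc), m_num_ranges (0)
  {
    /* Range 0 is a single point: start = finish = LOC.  */
    set_range (0, loc, loc, true, false);
  }

  /* Overwrite range IDX, or append it exactly at the end.  */
  void set_range (unsigned int idx, source_location start,
                  source_location finish, bool show_caret_p,
                  bool overwrite_loc_p)
  {
    assert (idx < MAX_RANGES);
    assert (idx <= m_num_ranges);

    m_ranges[idx].m_start = start;
    m_ranges[idx].m_finish = finish;
    m_ranges[idx].m_show_caret_p = show_caret_p;

    /* Are we adding a range onto the end?  */
    if (idx == m_num_ranges)
      m_num_ranges = idx + 1;

    /* Only range 0 can move the primary location.  */
    if (idx == 0 && overwrite_loc_p)
      m_loc = start;
  }

  source_location get_loc () const { return m_loc; }
  unsigned int get_num_ranges () const { return m_num_ranges; }

  const simple_range &get_range (unsigned int idx) const
  {
    assert (idx < m_num_ranges);
    return m_ranges[idx];
  }

 private:
  source_location m_loc;
  unsigned int m_num_ranges;
  simple_range m_ranges[MAX_RANGES];
};
```

In this model, overwriting range 0 with overwrite_loc_p set moves the
primary location, whereas appending range 1 or 2 leaves it alone, which
mirrors the two assertions at the top of set_range above.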

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5)
  2015-10-28 17:52       ` David Malcolm
                           ` (2 preceding siblings ...)
  2015-10-28 17:59         ` [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes David Malcolm
@ 2015-10-30  4:49         ` Jeff Law
  3 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-30  4:49 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

On 10/28/2015 12:09 PM, David Malcolm wrote:
>>
>> If you have a blob of new code and a blob of deletes, even breaking it
>> down that way may help in this case (ie, a patch with new classes &
>> code, then a pass that deletes old crud we're not going to use anymore).
>
> I've split patch 4 of the kit into 3 sub-patches:
>
>    [PATCH 4a] diagnostic-show-locus.c changes: Deletions
>    [PATCH 4b] diagnostic-show-locus.c changes: Insertions
>    [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes
>
> 4a, 4b, and 4c should appear as followups to this mail (assuming my
> "git send-email" command works OK).  They're only split up for
> ease-of-review purposes; they're intended to be committed as one commit.
Right, the split is just to make the diffs readable.  They'll be an 
atomic commit.  We're on the same page here.

>
> The 4a/4b split seems to have allowed "git diff" to do a
> much better job on diagnostic-show-locus.c.
Most definitely :-)

>
> Patch 4c contains updates based on your review comments below; a diff
> relative to the prior version of the patch can be seen at:
>   https://dmalcolm.fedorapeople.org/gcc/2015-10-28/rich-locations/0001-Fix-issues-found-in-review.patch
4a, 4c are  OK.  I'm working through 4b now.

Thanks.
jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4b] diagnostic-show-locus.c changes: Insertions
  2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
@ 2015-10-30  4:53           ` Jeff Law
  2015-10-30 19:42             ` David Malcolm
  2015-11-06 19:59             ` David Malcolm
  0 siblings, 2 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-30  4:53 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

On 10/28/2015 12:09 PM, David Malcolm wrote:
> gcc/ChangeLog:
> 	* diagnostic-show-locus.c (struct point_state): New struct.
> 	(class colorizer): New class.
> 	(class layout_point): New class.
> 	(class layout_range): New class.
> 	(class layout): New class.
> 	(colorizer::colorizer): New ctor.
> 	(colorizer::~colorizer): New dtor.
> 	(layout::layout): New ctor.
> 	(layout::print_line): New method.
> 	(layout::get_state_at_point): New method.
> 	(layout::get_x_bound_for_row): New method.
> 	(show_ruler): New function.
> 	(diagnostic_show_locus): Reimplement in terms of class layout.
> ---
> +};
> +
> +/* A class to inject colorization codes when printing the diagnostic locus.
> +
> +   It has one kind of colorization for each of:
> +     - normal text
> +     - range 0 (the "primary location")
> +     - range 1
> +     - range 2
> +
> +   The class caches the lookup of the color codes for the above.
> +
> +   The class also has responsibility for tracking which of the above is
> +   active, filtering out unnecessary changes.  This allows layout::print_line
> +   to simply request a colorization code for *every* character it prints
> +   through this class, and have the filtering be done for it here.  */
Not asking you to do anything here -- hopefully this isn't a huge burden 
on the diagnostic performance.  Normally I wouldn't even notice except 
that we're inserting colorization on every character.  That kind of 
model can get expensive.  Something to watch out for -- though I doubt 
we do the massive diagnostic spews we used to, which is probably the only
place it'd be noticeable.



> +
> +/* A point within a layout_range; similar to an expanded_location,
> +   but after filtering on file.  */
> +
> +class layout_point
> +{
> + public:
> +  layout_point (const expanded_location &exploc)
> +  : m_line (exploc.line),
> +    m_column (exploc.column) {}
> +
> +  int m_line;
> +  int m_column;
> +};
Is this even deserving of its own class?  If you pulled up 
m_line/m_column you don't need the class, though I guess you need three
of each, one for the start, one for the finish & one for the caret, 
which in turn bloats the layout_range's constructor.  So I guess this is OK.






> +/* Given a source line LINE of length LINE_WIDTH, determine the width
> +   without any trailing whitespace.  */
> +
> +static int
> +get_line_width_without_trailing_whitespace (const char *line, int line_width)
> +{
> +  int result = line_width;
> +  while (result > 0)
> +    {
> +      char ch = line[result - 1];
> +      if (ch == ' ' || ch == '\t')
> +	result--;
> +      else
> +	break;
> +    }
> +  gcc_assert (result >= 0);
> +  gcc_assert (result <= line_width);
> +  gcc_assert (result == 0 ||
> +	      (line[result - 1] != ' '
> +	       && line[result -1] != '\t'));
> +  return result;
> +}
If you use an unsigned for the line width, don't all the asserts become 
redundant & unnecessary?  I love the sanity checking and I could see how 
it might be useful if someone were to reimplement this function at a
later date.  So maybe keep.
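
To illustrate the unsigned variant: a minimal sketch, assuming a
hypothetical helper named width_without_trailing_ws (not part of the
patch).  With an unsigned type only the "result >= 0" check becomes
vacuous; the other two postconditions still carry information.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical unsigned variant of
   get_line_width_without_trailing_whitespace; illustration only.  */
static size_t
width_without_trailing_ws (const char *line, size_t line_width)
{
  size_t result = line_width;

  /* Walk back from the end while the last character is a space or tab.  */
  while (result > 0
         && (line[result - 1] == ' ' || line[result - 1] == '\t'))
    result--;

  /* "result >= 0" holds by construction for an unsigned type, so it
     is dropped; the remaining postconditions are still checked.  */
  assert (result <= line_width);
  assert (result == 0
          || (line[result - 1] != ' ' && line[result - 1] != '\t'));
  return result;
}
```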

> +
> +/* Implementation of class layout.  */
> +
> +/* Constructor for class layout.
> +
> +   Filter the ranges from the rich_location to those that we can
> +   sanely print, populating m_layout_ranges.
> +   Determine the range of lines that we will print.
> +   Determine m_x_offset, to ensure that the primary caret
> +   will fit within the max_width provided by the diagnostic_context.  */
> +
> +layout::layout (diagnostic_context * context,
> +		const diagnostic_info *diagnostic)
[ ... ]
> +  if (0)
> +    show_ruler (context, line_width, m_x_offset);
Debugging code?  If it's if (0) you should probably delete it at this point.


> +}
> +
> +/* Print text describing a line of source code.
> +   This typically prints two lines:
> +
> +   (1) the source code itself, potentially colorized at any ranges, and
> +   (2) an annotation line containing any carets/underlines
> +   describing the ranges.  */
> +
> +void
> +layout::print_line (int row)
Consider breaking this into two functions.  One to print the source line 
and another to print caret/underlines.


> +
> +/* Return true if (ROW/COLUMN) is within a range of the layout.
> +   If it returns true, OUT_STATE is written to, with the
> +   range index, and whether we should draw the caret at
> +   (ROW/COLUMN) (as opposed to an underline).  */
> +
> +bool
> +layout::get_state_at_point (/* Inputs.  */
> +			    int row, int column,
> +			    int first_non_ws, int last_non_ws,
> +			    /* Outputs.  */
> +			    point_state *out_state)
> +{
> +  layout_range *range;
> +  int i;
> +  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
> +    {
> +      if (0)
> +	fprintf (stderr,
> +		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
> +		 range->m_start.m_line,
> +		 range->m_start.m_column,
> +		 range->m_finish.m_line,
> +		 range->m_finish.m_column,
> +		 row,
> +		 column,
> +		 range->contains_point (row, column) ? "true" : "false");
More old debugging code that needs to be removed?



> +
> +/* For debugging layout issues in diagnostic_show_locus and friends,
> +   render a ruler giving column numbers (after the 1-column indent).  */
> +
> +static void
> +show_ruler (diagnostic_context *context, int max_width, int x_offset)
Seems like it ought to be DEBUG_FUNCTION or removed.  I believe its
only caller is in if (0) code in layout's ctor.


Overall this looks good.  Take the actions you deem appropriate WRT the 
debugging bits, breaking print_line into two functions and the signed vs 
unsigned stuff in get_line_width_without_trailing_whitespace and it's 
good for the trunk.

Jeff



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 10/10] Compress short ranges into source_location
  2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
@ 2015-10-30  6:07     ` Jeff Law
  2015-11-04 20:42     ` Dodji Seketeli
  1 sibling, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-30  6:07 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
>
> gcc/ada/ChangeLog:
> 	* gcc-interface/trans.c (Sloc_to_locus): Add line_table param when
> 	calling linemap_position_for_line_and_column.
>
> gcc/ChangeLog:
> 	* input.c (dump_line_table_statistics): Dump stats on how many
> 	ranges were optimized vs how many needed ad-hoc table.
> 	(write_digit_row): Add "map" param; use its range_bits
> 	to calculate the per-character offset.
> 	(dump_location_info): Print the range and column bits for each
> 	ordinary map.  Use the range bits to calculate the per-character
> 	offset.  Pass the map as a new param to the various calls to
> 	write_digit_row.  Eliminate uses of
> 	ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
> 	* toplev.c (general_init): Initialize line_table's
> 	default_range_bits.
> 	* tree.c (get_pure_location): New function.
> 	(set_block): Use the pure form of the location for the
> 	caret in the combined location.
> 	(set_source_range): Likewise.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (get_loc): Add
> 	line_table param when calling
> 	linemap_position_for_line_and_column.
> 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
> 	(emit_warning): Remove restriction that "loc" must be ad-hoc.
>
> libcpp/ChangeLog:
> 	* include/line-map.h (source_location): Update the descriptive
> 	comment to reflect the packing scheme for short ranges.
> 	(struct line_map_ordinary): Drop field "column_bits" in favor
> 	of field "m_column_and_range_bits"; add field "m_range_bits".
> 	(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
> 	(struct line_maps): Add fields "default_range_bits",
> 	"num_optimized_ranges" and "num_unoptimized_ranges".
> 	(get_range_from_adhoc_loc): Delete prototype.
> 	(get_range_from_loc): Convert from an inline function to a
> 	prototype.
> 	(pure_location_p): New prototype.
> 	(SOURCE_LINE): Update for renaming of column_bits.
> 	(SOURCE_COLUMN): Likewise.  Shift the column right by the map's
> 	range_bits.
> 	(LAST_SOURCE_LINE_LOCATION): Update for renaming of column_bits.
> 	(linemap_position_for_line_and_column): Add line_maps * params.
> 	* lex.c (_cpp_lex_direct): Don't attempt to record token ranges
> 	for UNKNOWN_LOCATION and BUILTINS_LOCATION.
> 	* line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): Reduce from 1U << 17 to
> 	1U << 9.
> 	(can_be_stored_compactly_p): New function.
> 	(get_combined_adhoc_loc): Implement bit-packing scheme for short
> 	ranges.
> 	(get_range_from_adhoc_loc): Make static.
> 	(get_range_from_loc): New function.
> 	(pure_location_p): New function.
> 	(linemap_add): Ensure that start_location has zero for the
> 	range_bits, unless we're past LINE_MAP_MAX_LOCATION_WITH_COLS.
> 	Initialize range_bits to zero.  Assert that the start_location
> 	is "pure".
> 	(linemap_line_start): Assert that the
> 	column_and_range_bits >= range_bits.
> 	Update determinination of whether we need to start a new map
> 	using the effective column bits, without the range bits.
> 	Use the set's default_range_bits in new maps, apart from
> 	those with column_bits == 0, which should also have 0 range_bits.
> 	Increase the column bits for new maps by the range bits.
> 	When adding lines to an existing map, use set->highest_line
> 	directly rather than offsetting highest by SOURCE_COLUMN.
> 	Add assertions to sanity-check the return value.
> 	(linemap_position_for_column): Offset to_column by range_bits.
> 	Update set->highest_location if necessary.
> 	(linemap_position_for_line_and_column): Add line_maps * param.
> 	Update the calculation to offset the column by range_bits, and
> 	conditionalize it on being <= LINE_MAP_MAX_LOCATION_WITH_COLS.
> 	Bound it by LINEMAPS_MACRO_LOWEST_LOCATION.  Update
> 	set->highest_location if necessary.
> 	(linemap_position_for_loc_and_offset): Pass "set" to
> 	linemap_position_for_line_and_column.
> 	* location-example.txt: Regenerate, showing new representation.
"determinination"? :-)

> ---
>   gcc/ada/gcc-interface/trans.c                      |   3 +-
>   gcc/input.c                                        |  28 ++-
>   .../plugin/diagnostic_plugin_test_show_locus.c     |   3 +-
>   .../diagnostic_plugin_test_tree_expression_range.c |   8 +-
>   gcc/toplev.c                                       |   1 +
>   gcc/tree.c                                         |  25 ++-
>   libcpp/include/line-map.h                          | 121 +++++++----
>   libcpp/lex.c                                       |   9 +-
>   libcpp/line-map.c                                  | 229 +++++++++++++++++++--
>   libcpp/location-example.txt                        | 188 +++++++++--------
>   10 files changed, 450 insertions(+), 165 deletions(-)
>

> diff --git a/gcc/tree.c b/gcc/tree.c
> index a676352..4ec4a38 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -13653,11 +13653,31 @@ nonnull_arg_p (const_tree arg)
>     return false;
>   }
>
> +static location_t
> +get_pure_location (location_t loc)
Function comment.



> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
> index 0ef29d9..1a2dab8 100644
> --- a/libcpp/include/line-map.h
> +++ b/libcpp/include/line-map.h
> @@ -821,8 +871,10 @@ extern source_location get_combined_adhoc_loc (struct line_maps *,
>   extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
>   extern source_location get_location_from_adhoc_loc (struct line_maps *,
>   						    source_location);
> -extern source_range get_range_from_adhoc_loc (struct line_maps *,
> -					      source_location);
> +
> +extern source_range
> +get_range_from_loc (line_maps *set,
> +		    source_location loc);
Nit.  Should probably all be on one line.

> diff --git a/libcpp/line-map.c b/libcpp/line-map.c
> index 6385fdf..fe8d784 100644
> --- a/libcpp/line-map.c
> +++ b/libcpp/line-map.c
> @@ -29,7 +29,7 @@ along with this program; see the file COPYING3.  If not see
>   /* Do not track column numbers higher than this one.  As a result, the
>      range of column_bits is [7, 18] (or 0 if column numbers are
>      disabled).  */
> -const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17);
> +const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 9);
Comment needs updating, right?


> +/* Helper function for get_combined_adhoc_loc.
> +   Can the given LOCUS + SRC_RANGE and DATA pointer be stored compactly
> +   within a source_location, without needing to use an ad-hoc location.  */
> +
> +static bool
> +can_be_stored_compactly_p (struct line_maps *set,
> +			   source_location locus,
> +			   source_range src_range,
> +			   void *data)
> +{
> +  /* If there's an ad-hoc pointer, we can't store it directly in the
> +     source_location, we need the lookaside.  */
> +  if (data)
> +    return false;
> +
> +  /* We only store ranges that begin at the locus and that are sufficientl
sufficiently


> +
> +#define DEBUG_PACKING 0
> +
> +#if DEBUG_PACKING
> +  fprintf (stderr, "get_combined_adhoc_loc: %x %x %x\n",
> +	   locus, src_range.m_start, src_range.m_finish);
> +#endif
Shouldn't this stuff (DEBUG_PACKING) just go away?


With the nits above fixed, this is OK.  Obviously there's prereqs that 
need to be approved and committed together.

jeff
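
The general idea of packing a short range into the low bits of a
location value can be sketched as below.  This is a simplified
illustration under stated assumptions, not the patch's actual encoding:
RANGE_BITS, can_pack_p, pack_location and the accessor names are all
hypothetical, and plain column numbers stand in for full
source_location values.  It only shows why the column has to be
shifted right by the range bits when it is read back.

```cpp
#include <cassert>
#include <stdint.h>

typedef uint32_t packed_loc;

/* Hypothetical number of low bits reserved for the range length.  */
static const unsigned int RANGE_BITS = 5;

/* Can the closed range [START_COL, FINISH_COL] be stored alongside a
   caret at CARET_COL?  Only ranges that begin at the caret and are
   short enough to fit in RANGE_BITS qualify.  */
static bool
can_pack_p (uint32_t caret_col, uint32_t start_col, uint32_t finish_col)
{
  return (start_col == caret_col
          && finish_col >= start_col
          && finish_col - start_col < (1u << RANGE_BITS));
}

/* Pack the caret column into the high bits and the offset of the
   finish column into the low RANGE_BITS bits.  */
static packed_loc
pack_location (uint32_t caret_col, uint32_t finish_col)
{
  return (caret_col << RANGE_BITS) | (finish_col - caret_col);
}

/* Recover the caret column: shift the range bits away.  */
static uint32_t
packed_column (packed_loc loc)
{
  return loc >> RANGE_BITS;
}

/* Recover the finish column from the low bits.  */
static uint32_t
packed_finish_column (packed_loc loc)
{
  return packed_column (loc) + (loc & ((1u << RANGE_BITS) - 1));
}
```

Ranges that fail can_pack_p in this model would need a lookaside
table, which corresponds to the ad-hoc path discussed above.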

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 08/10] Wire things up so that libcpp users get token underlines
  2015-10-23 20:26   ` [PATCH 08/10] Wire things up so that libcpp users get token underlines David Malcolm
@ 2015-10-30  6:15     ` Jeff Law
  0 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-10-30  6:15 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> A previous patch introduced the ability to print one or more ranges
> for a diagnostic via a rich_location class.
>
> Another patch generalized source_location (aka location_t) to be both
> a caret and a range, and generated range information for all tokens
> coming out of libcpp's lexer.
>
> The attached patch combines these efforts by updating the
> rich_location constructor for a single source_location so that it
> makes use of the range within the source_location.  Doing so requires
> passing the line_table to the ctor, so that it can extract the range
> from there.
>
> The effect of this is that all of the various "warning", "warning_at"
> "error", "error_at" diagnostics now emit underlines showing the range
> of the token associated with the location_t (or input_location), for
> those frontends using libcpp.  Similar things should happen for
> expressions in the C FE for diagnostics using EXPR_LOCATION.
>
> A test case is added showing various token-based warnings that now
> have underlines (without having to go through and add range information
> to them).  For example:
>
> diagnostic-token-ranges.c: In function ‘wide_string_literal_in_asm’:
> diagnostic-token-ranges.c:68:8: error: wide string literal in ‘asm’
>     asm (L"nop");
>          ^~~~~~
>
> gcc/c-family/ChangeLog:
> 	* c-opts.c (c_common_init_options): Set
> 	global_dc->colorize_source_p.
>
> gcc/c/ChangeLog:
> 	* c-decl.c (warn_defaults_to): Pass line_table to
> 	rich_location ctor.
> 	* c-errors.c (pedwarn_c99): Likewise.
> 	(pedwarn_c90): Likewise.
>
> gcc/cp/ChangeLog:
> 	* error.c (pedwarn_cxx98): Pass line_table to
> 	rich_location ctor.
>
> gcc/ChangeLog:
> 	* diagnostic.c (diagnostic_append_note): Pass line_table to
> 	rich_location ctor.
> 	(emit_diagnostic): Likewise.
> 	(inform): Likewise.
> 	(inform_n): Likewise.
> 	(warning): Likewise.
> 	(warning_at): Likewise.
> 	(warning_n): Likewise.
> 	(pedwarn): Likewise.
> 	(permerror): Likewise.
> 	(error): Likewise.
> 	(error_n): Likewise.
> 	(error_at): Likewise.
> 	(sorry): Likewise.
> 	(fatal_error): Likewise.
> 	(internal_error): Likewise.
> 	(internal_error_no_backtrace): Likewise.
> 	(real_abort): Likewise.
> 	* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
> 	Likewise.
> 	* genmatch.c (fatal_at): Likewise.
> 	(warning_at): Likewise.
> 	* rtl-error.c (diagnostic_for_asm): Likewise.
>
> gcc/fortran/ChangeLog:
> 	* error.c (gfc_warning): Pass line_table to rich_location ctor.
> 	(gfc_warning_now_at): Likewise.
> 	(gfc_warning_now): Likewise.
> 	(gfc_error_now): Likewise.
> 	(gfc_fatal_error): Likewise.
> 	(gfc_error): Likewise.
> 	(gfc_internal_error): Likewise.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/diagnostic-token-ranges.c: New file.
> 	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
> 	(test_show_locus): Pass line_table to rich_location ctors.
> 	(plugin_init): Remove setting of global_dc->colorize_source_p.
> 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> 	Remove include of gcc-rich-location.h.
> 	(get_range_for_expr): Delete.
> 	(gcc_rich_location::add_expr): Delete.
> 	(emit_warning): Change param from rich_location * to location_t.
> 	Require an ad-hoc location, and extract range from it.
> 	Use warning_at directly, without using a rich_location.
> 	(cb_walk_tree_fn): Pass EXPR_LOCATION (arg) directly to
> 	emit_warning, without creating a rich_location.
>
> libcpp/ChangeLog:
> 	* errors.c (cpp_diagnostic): Pass pfile->line_table to
> 	rich_location ctor.
> 	(cpp_diagnostic_with_line): Likewise.
> 	* include/line-map.h (rich_location::rich_location): Add
> 	line_maps * param.
> 	* line-map.c (rich_location::rich_location): Likewise; use
> 	it to extract the range from the source_location.
OK.  Commit with prereqs.

jeff


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 06/10] Track expression ranges in C frontend
  2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
@ 2015-10-30  8:01     ` Jeff Law
  2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
  0 siblings, 1 reply; 83+ messages in thread
From: Jeff Law @ 2015-10-30  8:01 UTC (permalink / raw)
  To: David Malcolm, gcc-patches

On 10/23/2015 02:41 PM, David Malcolm wrote:
> As in the previous version of this patch
>   "Implement tree expression tracking in C FE (v2)"
> the patch now captures ranges for all C expressions during parsing within
> a new field of c_expr, and for all tree nodes with a location_t, it stores
> them in ad-hoc locations for later use.
>
> Hence compound expressions get ranges; see:
>    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html
>
> and for this example:
>
>    int test (int foo)
>    {
>      return foo * 100;
>             ^^^   ^^^
>    }
>
> we have access to the ranges of "foo" and "100" during C parsing via
> the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
> INTEGER_CST (the former's location is at the top of the
> function, and the latter has no location).
>
> gcc/ChangeLog:
> 	* Makefile.in (OBJS): Add gcc-rich-location.o.
> 	* gcc-rich-location.c: New file.
> 	* gcc-rich-location.h: New file.
> 	* print-tree.c (print_node): Print any source range information.
> 	* tree.c (set_source_range): New functions.
> 	* tree.h (CAN_HAVE_RANGE_P): New.
> 	(EXPR_LOCATION_RANGE): New.
> 	(EXPR_HAS_RANGE): New.
> 	(get_expr_source_range): New inline function.
> 	(DECL_LOCATION_RANGE): New.
> 	(set_source_range): New decls.
> 	(get_decl_source_range): New inline function.
>
> gcc/c-family/ChangeLog:
> 	* c-common.c (c_fully_fold_internal): Capture existing source_range,
> 	and store it on the result.
>
> gcc/c/ChangeLog:
> 	* c-parser.c (set_c_expr_source_range): New functions.
> 	(c_token::get_range): New method.
> 	(c_token::get_finish): New method.
> 	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
> 	based on the range from the start of the LHS to the end of the
> 	RHS.
> 	(c_parser_conditional_expression): Likewise, based on the range
> 	from the start of the cond.value to the end of exp2.value.
> 	(c_parser_binary_expression): Call set_c_expr_source_range on
> 	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
> 	(c_parser_cast_expression): Call set_c_expr_source_range on ret
> 	based on the cast_loc through to the end of the expr.
> 	(c_parser_unary_expression): Likewise, based on the
> 	op_loc through to the end of op.
> 	(c_parser_sizeof_expression): Likewise, based on the start of the
> 	sizeof token through to either the closing paren or the end of
> 	expr.
> 	(c_parser_postfix_expression): Likewise, using the token range,
> 	or from the open paren through to the close paren for
> 	parenthesized expressions.
> 	(c_parser_postfix_expression_after_primary): Likewise, for
> 	various kinds of expression.
> 	* c-tree.h (struct c_expr): Add field "src_range".
> 	(c_expr::get_start): New method.
> 	(c_expr::get_finish): New method.
> 	(set_c_expr_source_range): New decls.
> 	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
> 	on ret for prefix unary ops.
> 	(parser_build_binary_op): Likewise, running from the start of
> 	arg1.value through to the end of arg2.value.
>
> gcc/testsuite/ChangeLog:
> 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
> 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> 	New file.
> 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> 	diagnostic_plugin_test_tree_expression_range.c and
> 	diagnostic-test-expressions-1.c.

>   /* Initialization routine for this file.  */
>
> @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
>     ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
>   				 code, exp_location, rhs.value,
>   				 rhs.original_type);
> +  set_c_expr_source_range (&ret,
> +			   lhs.get_start (),
> +			   rhs.get_finish ());
One line if it fits.


> @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
>   			   ? t1
>   			   : NULL);
>       }
> +  set_c_expr_source_range (&ret,
> +			   start,
> +			   exp2.get_finish ());
Here too.

> @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
>   	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
>         }
>         ret.value = c_cast_expr (cast_loc, type_name, expr.value);
> +      if (ret.value && expr.value)
> +	set_c_expr_source_range (&ret,
> +				 cast_loc,
> +				 expr.get_finish ());
And here?

With the nits fixed, this is OK.

I think that covers this iteration of the rich location work and that 
you'll continue working with Jason on extending this into the C++ front-end.

jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4b] diagnostic-show-locus.c changes: Insertions
  2015-10-30  4:53           ` Jeff Law
@ 2015-10-30 19:42             ` David Malcolm
  2015-11-06 19:59             ` David Malcolm
  1 sibling, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-10-30 19:42 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, 2015-10-29 at 22:49 -0600, Jeff Law wrote:
> On 10/28/2015 12:09 PM, David Malcolm wrote:
> > gcc/ChangeLog:
> > 	* diagnostic-show-locus.c (struct point_state): New struct.
> > 	(class colorizer): New class.
> > 	(class layout_point): New class.
> > 	(class layout_range): New class.
> > 	(class layout): New class.
> > 	(colorizer::colorizer): New ctor.
> > 	(colorizer::~colorizer): New dtor.
> > 	(layout::layout): New ctor.
> > 	(layout::print_line): New method.
> > 	(layout::get_state_at_point): New method.
> > 	(layout::get_x_bound_for_row): New method.
> > 	(show_ruler): New function.
> > 	(diagnostic_show_locus): Reimplement in terms of class layout.
> > ---
> > +};
> > +
> > +/* A class to inject colorization codes when printing the diagnostic locus.
> > +
> > +   It has one kind of colorization for each of:
> > +     - normal text
> > +     - range 0 (the "primary location")
> > +     - range 1
> > +     - range 2
> > +
> > +   The class caches the lookup of the color codes for the above.
> > +
> > +   The class also has responsibility for tracking which of the above is
> > +   active, filtering out unnecessary changes.  This allows layout::print_line
> > +   to simply request a colorization code for *every* character it prints
> > +   through this class, and have the filtering be done for it here.  */
> Not asking you to do anything here -- hopefully this isn't a huge burden 
> on the diagnostic performance.  Normally I wouldn't even notice except 
> that we're inserting colorization on every character.  That kind of 
> model can get expensive.  Something to watch out for -- though I doubt 
> we do the massive diagnostic spews we used to which is probably the only 
> place it'd be noticeable.

Maybe the comment for the class is unclear: we're *not* emitting color
codes on every character: in fact preventing that is the point of the
colorizer class.  The layout class does indeed request a color code for
each character, but the colorizer class filters the requests so that
colorization codes are only emitted when the state changes
(and they're not emitted at all if -fdiagnostics-color=never).

It's a separation of responsibilities, to keep the layout class simpler.

[it's even tested, in that
gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-locus-color.c
only contains status-change color codes in the 
dg-begin/end-multiline-output directives]
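
To illustrate the separation of responsibilities described above, here
is a minimal standalone sketch.  It is not the code from the patch: the
names sketch_colorizer, STATE_NORMAL_TEXT and render, and the particular
escape sequences, are assumptions made up for the example.  The caller
requests a state for every character; the filter only appends an escape
code when the state actually changes, and never when color is disabled.

```cpp
#include <cassert>
#include <string>

// Hypothetical state value for text outside any range; range N uses N.
const int STATE_NORMAL_TEXT = -1;

// Illustrative stand-in for the colorizer idea: remembers the active
// state and filters out requests that would not change it.
class sketch_colorizer
{
 public:
  explicit sketch_colorizer (bool use_color)
  : m_use_color (use_color),
    m_current_state (STATE_NORMAL_TEXT),
    m_codes_emitted (0) {}

  // Request STATE for the next character.  An escape code is appended
  // to OUT only on a change of state, and only if color is enabled.
  void set_state (int state, std::string &out)
  {
    if (state == m_current_state)
      return;
    m_current_state = state;
    if (!m_use_color)
      return;
    // Placeholder escape codes, chosen only for the example.
    out += (state == STATE_NORMAL_TEXT) ? "\033[m" : "\033[01;32m";
    m_codes_emitted++;
  }

  int codes_emitted () const { return m_codes_emitted; }

 private:
  bool m_use_color;
  int m_current_state;
  int m_codes_emitted;
};

// Print LINE, treating 0-based columns FIRST..LAST (inclusive) as
// range 0.  A state is requested for *every* character.
std::string
render (const std::string &line, int first, int last, sketch_colorizer &c)
{
  std::string out;
  for (int col = 0; col < (int) line.size (); col++)
    {
      bool in_range = (col >= first && col <= last);
      c.set_state (in_range ? 0 : STATE_NORMAL_TEXT, out);
      out += line[col];
    }
  // Restore normal text at the end of the line.
  c.set_state (STATE_NORMAL_TEXT, out);
  return out;
}
```

With a single range in the line, only two codes end up in the output
(one on entering the range, one on leaving it), however many characters
the line has.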

> > +
> > +/* A point within a layout_range; similar to an expanded_location,
> > +   but after filtering on file.  */
> > +
> > +class layout_point
> > +{
> > + public:
> > +  layout_point (const expanded_location &exploc)
> > +  : m_line (exploc.line),
> > +    m_column (exploc.column) {}
> > +
> > +  int m_line;
> > +  int m_column;
> > +};
> Is this even deserving of its own class?  If you pulled up 
> m_line/m_column you don't need the class, though I guess you need three 
> of each, one for the start, one for the finish & one for the caret, 
> which in turn bloats the layout_range's constructor.  So I guess this is OK.

Yes, it's to simplify layout_range.
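
As a rough sketch of why a small point type keeps the range type simple
(this is not the patch's implementation, which isn't quoted in this
thread; sketch_point, sketch_range and the inclusive treatment of the
finish are assumptions for the example):

```cpp
#include <cassert>

// Hypothetical analogue of layout_point: just a line and a column.
struct sketch_point
{
  int m_line;
  int m_column;
};

// Hypothetical analogue of layout_range: three points rather than six
// loose ints, so construction and comparisons stay readable.
struct sketch_range
{
  sketch_point m_start;
  sketch_point m_finish;
  sketch_point m_caret;

  // Return true if (ROW, COLUMN) lies between m_start and m_finish,
  // both ends assumed inclusive here.
  bool contains_point (int row, int column) const
  {
    if (row < m_start.m_line || row > m_finish.m_line)
      return false;
    if (row == m_start.m_line && column < m_start.m_column)
      return false;
    if (row == m_finish.m_line && column > m_finish.m_column)
      return false;
    return true;
  }
};
```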


> > +/* Given a source line LINE of length LINE_WIDTH, determine the width
> > +   without any trailing whitespace.  */
> > +
> > +static int
> > +get_line_width_without_trailing_whitespace (const char *line, int line_width)
> > +{
> > +  int result = line_width;
> > +  while (result > 0)
> > +    {
> > +      char ch = line[result - 1];
> > +      if (ch == ' ' || ch == '\t')
> > +	result--;
> > +      else
> > +	break;
> > +    }
> > +  gcc_assert (result >= 0);
> > +  gcc_assert (result <= line_width);
> > +  gcc_assert (result == 0 ||
> > +	      (line[result - 1] != ' '
> > +	       && line[result -1] != '\t'));
> > +  return result;
> > +}
> If you use an unsigned for the line width, don't all the asserts become 
> redundant & unnecessary?  I love the sanity checking and I could see how 
> it might be useful if someone were to reimplement this function at a 
> later date.  So maybe keep.

FWIW, the type for the line width ultimately comes from the 3rd param to
location_get_source_line:

extern const char *location_get_source_line (const char *file_path, int line,
					     int *line_size);

We could change that, since negative values are clearly bogus, but to do
so seems to me to be tangential to this patch kit.
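
For reference, a standalone sketch of what the unsigned variant might
look like (an illustration only, using std::size_t and plain assert
rather than the int and gcc_assert of the patch; the function name is
made up).  With an unsigned width the "result >= 0" check is trivially
true and can go, but the other two checks still state something about
the result:

```cpp
#include <cassert>
#include <cstddef>

// Given LINE of length LINE_WIDTH, return the width without any
// trailing spaces or tabs.  Hypothetical unsigned variant.
static std::size_t
line_width_without_trailing_ws (const char *line, std::size_t line_width)
{
  std::size_t result = line_width;
  while (result > 0
	 && (line[result - 1] == ' ' || line[result - 1] == '\t'))
    result--;
  // No "result >= 0" check: it cannot fail for an unsigned type.
  assert (result <= line_width);
  assert (result == 0
	  || (line[result - 1] != ' ' && line[result - 1] != '\t'));
  return result;
}
```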


> > +
> > +/* Implementation of class layout.  */
> > +
> > +/* Constructor for class layout.
> > +
> > +   Filter the ranges from the rich_location to those that we can
> > +   sanely print, populating m_layout_ranges.
> > +   Determine the range of lines that we will print.
> > +   Determine m_x_offset, to ensure that the primary caret
> > +   will fit within the max_width provided by the diagnostic_context.  */
> > +
> > +layout::layout (diagnostic_context * context,
> > +		const diagnostic_info *diagnostic)
> [ ... ]
> > +  if (0)
> > +    show_ruler (context, line_width, m_x_offset);
> Debugging code?  If it's if (0) you should probably delete it at this point.
> 
> 
> > +}
> > +
> > +/* Print text describing a line of source code.
> > +   This typically prints two lines:
> > +
> > +   (1) the source code itself, potentially colorized at any ranges, and
> > +   (2) an annotation line containing any carets/underlines
> > +   describing the ranges.  */
> > +
> > +void
> > +layout::print_line (int row)
> Consider breaking this into two functions.  One to print the source line 
> and another to print caret/underlines.

There's some state set up when printing the source line that's used when
printing the annotation line, so it's not trivial to break them,
although desirable; I may introduce a
  class line_layout
to encapsulate that, if that sounds reasonable.

>   +
> > +/* Return true if (ROW/COLUMN) is within a range of the layout.
> > +   If it returns true, OUT_STATE is written to, with the
> > +   range index, and whether we should draw the caret at
> > +   (ROW/COLUMN) (as opposed to an underline).  */
> > +
> > +bool
> > +layout::get_state_at_point (/* Inputs.  */
> > +			    int row, int column,
> > +			    int first_non_ws, int last_non_ws,
> > +			    /* Outputs.  */
> > +			    point_state *out_state)
> > +{
> > +  layout_range *range;
> > +  int i;
> > +  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
> > +    {
> > +      if (0)
> > +	fprintf (stderr,
> > +		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
> > +		 range->m_start.m_line,
> > +		 range->m_start.m_column,
> > +		 range->m_finish.m_line,
> > +		 range->m_finish.m_column,
> > +		 row,
> > +		 column,
> > +		 range->contains_point (row, column) ? "true" : "false");
> More old debugging code that needs to be removed?

Will remove.

> > +
> > +/* For debugging layout issues in diagnostic_show_locus and friends,
> > +   render a ruler giving column numbers (after the 1-column indent).  */
> > +
> > +static void
> > +show_ruler (diagnostic_context *context, int max_width, int x_offset)
> Seems like it ought to be DEBUG_FUNCTION or removed.  I believe its 
> only caller is in if (0) code in layout's ctor.

Will remove.

> Overall this looks good.  Take the actions you deem appropriate WRT the 
> debugging bits, breaking print_line into two functions and the signed vs 
> unsigned stuff in get_line_width_without_trailing_whitespace and it's 
> good for the trunk.
> 
> Jeff

Thanks
Dave


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend)
  2015-10-30  8:01     ` Jeff Law
@ 2015-11-02 19:14       ` David Malcolm
  2015-11-02 19:53         ` David Malcolm
                           ` (3 more replies)
  0 siblings, 4 replies; 83+ messages in thread
From: David Malcolm @ 2015-11-02 19:14 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Richard Biener, Dodji Seketeli

On Fri, 2015-10-30 at 00:15 -0600, Jeff Law wrote:
> On 10/23/2015 02:41 PM, David Malcolm wrote:
> > As in the previous version of this patch
> >   "Implement tree expression tracking in C FE (v2)"
> > the patch now captures ranges for all C expressions during parsing within
> > a new field of c_expr, and for all tree nodes with a location_t, it stores
> > them in ad-hoc locations for later use.
> >
> > Hence compound expressions get ranges; see:
> >    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html
> >
> > and for this example:
> >
> >    int test (int foo)
> >    {
> >      return foo * 100;
> >             ^^^   ^^^
> >    }
> >
> > we have access to the ranges of "foo" and "100" during C parsing via
> > the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
> > INTEGER_CST (the former's location is at the top of the
> > function, and the latter has no location).
> >
> > gcc/ChangeLog:
> > 	* Makefile.in (OBJS): Add gcc-rich-location.o.
> > 	* gcc-rich-location.c: New file.
> > 	* gcc-rich-location.h: New file.
> > 	* print-tree.c (print_node): Print any source range information.
> > 	* tree.c (set_source_range): New functions.
> > 	* tree.h (CAN_HAVE_RANGE_P): New.
> > 	(EXPR_LOCATION_RANGE): New.
> > 	(EXPR_HAS_RANGE): New.
> > 	(get_expr_source_range): New inline function.
> > 	(DECL_LOCATION_RANGE): New.
> > 	(set_source_range): New decls.
> > 	(get_decl_source_range): New inline function.
> >
> > gcc/c-family/ChangeLog:
> > 	* c-common.c (c_fully_fold_internal): Capture existing souce_range,
> > 	and store it on the result.
> >
> > gcc/c/ChangeLog:
> > 	* c-parser.c (set_c_expr_source_range): New functions.
> > 	(c_token::get_range): New method.
> > 	(c_token::get_finish): New method.
> > 	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
> > 	based on the range from the start of the LHS to the end of the
> > 	RHS.
> > 	(c_parser_conditional_expression): Likewise, based on the range
> > 	from the start of the cond.value to the end of exp2.value.
> > 	(c_parser_binary_expression): Call set_c_expr_source_range on
> > 	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
> > 	(c_parser_cast_expression): Call set_c_expr_source_range on ret
> > 	based on the cast_loc through to the end of the expr.
> > 	(c_parser_unary_expression): Likewise, based on the
> > 	op_loc through to the end of op.
> > 	(c_parser_sizeof_expression) Likewise, based on the start of the
> > 	sizeof token through to either the closing paren or the end of
> > 	expr.
> > 	(c_parser_postfix_expression): Likewise, using the token range,
> > 	or from the open paren through to the close paren for
> > 	parenthesized expressions.
> > 	(c_parser_postfix_expression_after_primary): Likewise, for
> > 	various kinds of expression.
> > 	* c-tree.h (struct c_expr): Add field "src_range".
> > 	(c_expr::get_start): New method.
> > 	(c_expr::get_finish): New method.
> > 	(set_c_expr_source_range): New decls.
> > 	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
> > 	on ret for prefix unary ops.
> > 	(parser_build_binary_op): Likewise, running from the start of
> > 	arg1.value through to the end of arg2.value.
> >
> > gcc/testsuite/ChangeLog:
> > 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
> > 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> > 	New file.
> > 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> > 	diagnostic_plugin_test_tree_expression_range.c and
> > 	diagnostic-test-expressions-1.c.
> 
> >   /* Initialization routine for this file.  */
> >
> > @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
> >     ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
> >   				 code, exp_location, rhs.value,
> >   				 rhs.original_type);
> > +  set_c_expr_source_range (&ret,
> > +			   lhs.get_start (),
> > +			   rhs.get_finish ());
> One line if it fits.
> 
> 
> > @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
> >   			   ? t1
> >   			   : NULL);
> >       }
> > +  set_c_expr_source_range (&ret,
> > +			   start,
> > +			   exp2.get_finish ());
> Here too.
> 
> > @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
> >   	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
> >         }
> >         ret.value = c_cast_expr (cast_loc, type_name, expr.value);
> > +      if (ret.value && expr.value)
> > +	set_c_expr_source_range (&ret,
> > +				 cast_loc,
> > +				 expr.get_finish ());
> And here?
> 
> With the nits fixed, this is OK.
> 
> I think that covers this iteration of the rich location work and that 
> you'll continue working with Jason on extending this into the C++ front-end.

Here's a summary of the current status of this work [1]:

Patches 1-4 of the kit: these Jeff has approved, with some pre-approved
nit fixes in 4.  I see these as relatively low risk, and plan to commit
these today/tomorrow.

Patches 5-10: Jeff approved these also (again with some nits). These
feel higher-risk to me, owing to the potential for performance
regressions; I haven't yet answered at least one of Richi's performance
questions (impact on time taken to generate the C++ PCH file); the last
performance testing I did can be seen here:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html
where the right-most column is this kit.

CCing Richi to keep him in the loop for the above.  Richi, is there any
other specific testing you'd want me to do for this?
Or is it OK to commit, and to see what impact it has on your daily
performance testing?  (and to revert if the impact is unacceptable).

Talking about risks: the reduction of the space for ordinary maps by a
factor of 32, by taking up 5 bits for the packed range information
optimization (patch 10):
 https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html
CCing Dodji: Dodji; is this reasonable?

I did some experiments back in July on this; instrumented builds of
openldap and openssl to see how much space we have in location_t:
https://dmalcolm.fedorapeople.org/gcc/2015-07-23/openldap.csv
https://dmalcolm.fedorapeople.org/gcc/2015-07-24/openldap.csv

(these are space-separated:
           SRPM name
           sourcefile
           maximal ordinary location_t
           minimal macro location_t)

openldap build #files: 906
maximal ordinary location_t was:
sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/bconfig.c'
          max_ordinary_location=0x0081bd1b
          (and min_macro_location=0x7ffe5903)
minimal macro location_t was:
sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/aclparse.c'
          min_macro_location=0x7ffe57e2
          (with max_ordinary_location=0x00719775)

openssl-1.0.1k-8.fc22.src.rpm.x86_64:
      #files: 1495
max_ordinary_location=0x00be3726
 (openssl-1.0.1k/apps/s_client.c)
 with min_macro_location=0x7ffe7b6b

min_macro_location=0x7ffdf069 
 (openssl-1.0.1k/apps/speed.c)
 with max_ordinary_location=0x00a1abdf

In all of the above cases, we had enough room to do the bit-packing
optimization, but this is just two projects (albeit real-world C code).

Comparing the gap between maximal ordinary map location and minimal
macro map location, and seeing how much we can expand the ordinary map
locations, the openldap build had:
  (0x7ffe57e2 - 0x0081bd1b) / 0x0081bd1b  == factor of 251 i.e.
7 bits of space available

openssl build had:
  (0x7ffdf069 - 0x00be3726) / 0x00be3726  == factor of 171 i.e. 7 bits
of space available

hence allocating 5 bits to packing ranges is (I hope) reasonable.
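
The arithmetic above can be reproduced with a few lines of standalone
code.  This is only a sketch: growth_factor and whole_bits are helper
names invented for the example, and the inputs are the figures quoted
above for the two builds.

```cpp
#include <cassert>
#include <cstdint>

// Whole-number factor by which the highest ordinary location seen
// (MAX_ORDINARY) could grow before reaching LIMIT.
static uint32_t
growth_factor (uint32_t max_ordinary, uint32_t limit)
{
  return (limit - max_ordinary) / max_ordinary;
}

// Number of whole bits of headroom implied by FACTOR,
// i.e. floor (log2 (FACTOR)).
static int
whole_bits (uint32_t factor)
{
  int bits = 0;
  while (factor > 1)
    {
      factor >>= 1;
      bits++;
    }
  return bits;
}
```

For openldap, growth_factor (0x0081bd1b, 0x7ffe57e2) gives 251, and for
openssl, growth_factor (0x00be3726, 0x7ffdf069) gives 171; whole_bits
gives 7 for both.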


Jeff: I'm working on expression ranges in the C++ FE; is that a
prerequisite for patches 5-10, or can 5-10 go ahead without the C++
work?  (assuming the other issues above are acceptable).

Hope this all makes sense and sounds sane
Dave

[1] Together the kit gives us:
* patch 4: infrastructure for printing underlines in
diagnostic_show_locus and for multiple ranges
* patches 5-10: the "source_location" (aka location_t) type becomes a
caret plus a range; the tokens coming from libcpp gain ranges, so
everything using libcpp gains at least underlines of tokens; the C
frontend generates sane ranges for expressions as it parses them, better
showing the user how the parser "sees" their code.

Hence we ought to get underlined ranges for many diagnostics in C and
C++ with this (e.g. input_location gives an underline covering the range
of the token starting at the caret).  The "caret" should remain
unchanged from the status quo, so e.g. debugging locations shouldn't be
affected by the addition of ranges.

I'm anticipating that we'd need some followup patches to pick better
ranges for some diagnostics, analogous to the way we convert "warning"
to "warning_at" for where input_location isn't the best location; I'd
expect these followup patches to be relatively simple and low-risk.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend)
  2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
@ 2015-11-02 19:53         ` David Malcolm
  2015-11-02 22:26         ` Jeff Law
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-11-02 19:53 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Richard Biener, Dodji Seketeli

On Mon, 2015-11-02 at 14:14 -0500, David Malcolm wrote:
> On Fri, 2015-10-30 at 00:15 -0600, Jeff Law wrote:
> > On 10/23/2015 02:41 PM, David Malcolm wrote:
> > > As in the previous version of this patch
> > >   "Implement tree expression tracking in C FE (v2)"
> > > the patch now captures ranges for all C expressions during parsing within
> > > a new field of c_expr, and for all tree nodes with a location_t, it stores
> > > them in ad-hoc locations for later use.
> > >
> > > Hence compound expressions get ranges; see:
> > >    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html
> > >
> > > and for this example:
> > >
> > >    int test (int foo)
> > >    {
> > >      return foo * 100;
> > >             ^^^   ^^^
> > >    }
> > >
> > > we have access to the ranges of "foo" and "100" during C parsing via
> > > the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
> > > INTEGER_CST (the former's location is at the top of the
> > > function, and the latter has no location).
> > >
> > > gcc/ChangeLog:
> > > 	* Makefile.in (OBJS): Add gcc-rich-location.o.
> > > 	* gcc-rich-location.c: New file.
> > > 	* gcc-rich-location.h: New file.
> > > 	* print-tree.c (print_node): Print any source range information.
> > > 	* tree.c (set_source_range): New functions.
> > > 	* tree.h (CAN_HAVE_RANGE_P): New.
> > > 	(EXPR_LOCATION_RANGE): New.
> > > 	(EXPR_HAS_RANGE): New.
> > > 	(get_expr_source_range): New inline function.
> > > 	(DECL_LOCATION_RANGE): New.
> > > 	(set_source_range): New decls.
> > > 	(get_decl_source_range): New inline function.
> > >
> > > gcc/c-family/ChangeLog:
> > > 	* c-common.c (c_fully_fold_internal): Capture existing souce_range,
> > > 	and store it on the result.
> > >
> > > gcc/c/ChangeLog:
> > > 	* c-parser.c (set_c_expr_source_range): New functions.
> > > 	(c_token::get_range): New method.
> > > 	(c_token::get_finish): New method.
> > > 	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
> > > 	based on the range from the start of the LHS to the end of the
> > > 	RHS.
> > > 	(c_parser_conditional_expression): Likewise, based on the range
> > > 	from the start of the cond.value to the end of exp2.value.
> > > 	(c_parser_binary_expression): Call set_c_expr_source_range on
> > > 	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
> > > 	(c_parser_cast_expression): Call set_c_expr_source_range on ret
> > > 	based on the cast_loc through to the end of the expr.
> > > 	(c_parser_unary_expression): Likewise, based on the
> > > 	op_loc through to the end of op.
> > > 	(c_parser_sizeof_expression) Likewise, based on the start of the
> > > 	sizeof token through to either the closing paren or the end of
> > > 	expr.
> > > 	(c_parser_postfix_expression): Likewise, using the token range,
> > > 	or from the open paren through to the close paren for
> > > 	parenthesized expressions.
> > > 	(c_parser_postfix_expression_after_primary): Likewise, for
> > > 	various kinds of expression.
> > > 	* c-tree.h (struct c_expr): Add field "src_range".
> > > 	(c_expr::get_start): New method.
> > > 	(c_expr::get_finish): New method.
> > > 	(set_c_expr_source_range): New decls.
> > > 	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
> > > 	on ret for prefix unary ops.
> > > 	(parser_build_binary_op): Likewise, running from the start of
> > > 	arg1.value through to the end of arg2.value.
> > >
> > > gcc/testsuite/ChangeLog:
> > > 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
> > > 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> > > 	New file.
> > > 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> > > 	diagnostic_plugin_test_tree_expression_range.c and
> > > 	diagnostic-test-expressions-1.c.
> > 
> > >   /* Initialization routine for this file.  */
> > >
> > > @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
> > >     ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
> > >   				 code, exp_location, rhs.value,
> > >   				 rhs.original_type);
> > > +  set_c_expr_source_range (&ret,
> > > +			   lhs.get_start (),
> > > +			   rhs.get_finish ());
> > One line if it fits.
> > 
> > 
> > > @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
> > >   			   ? t1
> > >   			   : NULL);
> > >       }
> > > +  set_c_expr_source_range (&ret,
> > > +			   start,
> > > +			   exp2.get_finish ());
> > Here too.
> > 
> > > @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
> > >   	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
> > >         }
> > >         ret.value = c_cast_expr (cast_loc, type_name, expr.value);
> > > +      if (ret.value && expr.value)
> > > +	set_c_expr_source_range (&ret,
> > > +				 cast_loc,
> > > +				 expr.get_finish ());
> > And here?
> > 
> > With the nits fixed, this is OK.
> > 
> > I think that covers this iteration of the rich location work and that 
> > you'll continue working with Jason on extending this into the C++ front-end.
> 
> Here's a summary of the current status of this work [1]:
> 
> Patches 1-4 of the kit: these Jeff has approved, with some pre-approved
> nit fixes in 4.  I see these as relatively low risk, and plan to commit
> these today/tomorrow.
> 
> Patches 5-10: Jeff approved these also (again with some nits). These
> feel higher-risk to me, owing to the potential for performance
> regressions; I haven't yet answered at least one of Richi's performance
> questions (impact on time taken to generate the C++ PCH file); the last
> performance testing I did can be seen here:
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html
> where the right-most column is this kit.
> 
> CCing Richi to keep him in the loop for the above.  Richi, is there any
> other specific testing you'd want me to do for this?
> Or is it OK to commit, and to see what impact it has on your daily
> performance testing?  (and to revert if the impact is unacceptable).
> 
> Talking about risks: the reduction of the space for ordinary maps by a
> factor of 32, by taking up 5 bits for the packed range information
> optimization (patch 10):
>  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html
> CCing Dodji: Dodji; is this reasonable?
> 
> I did some experiments back in July on this; instrumented builds of
> openldap and openssl to see how much space we have in location_t:
> https://dmalcolm.fedorapeople.org/gcc/2015-07-23/openldap.csv
> https://dmalcolm.fedorapeople.org/gcc/2015-07-24/openldap.csv
> 
> (these are space-separated:
>            SRPM name
>            sourcefile
>            maximal ordinary location_t
>            minimal macro location_t)
> 
> openldap build #files: 906
> maximal ordinary location_t was:
> sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/bconfig.c'
>           max_ordinary_location=0x0081bd1b
>           (and min_macro_location=0x7ffe5903
> minimal macro location_t was:
> sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/aclparse.c'
>           min_macro_location=0x7ffe57e2
>           (with max_ordinary_location=0x00719775)
> 
> openssl-1.0.1k-8.fc22.src.rpm.x86_64:
>       #files: 1495
> max_ordinary_location=0x00be3726
>  (openssl-1.0.1k/apps/s_client.c)
>  with min_macro_location=0x7ffe7b6b
> 
> min_macro_location=0x7ffdf069 
>  (openssl-1.0.1k/apps/speed.c)
>  with max_ordinary_location=0x00a1abdf
> 
> In all of the above cases, we had enough room to do the bit-packing
> optimization, but this is just two projects (albeit real-world C code).
> 
> Comparing the gap between maximal ordinary map location and minimal
> macro map location, and seeing how much we can expand the ordinary map
> locations, the openldap build had:
>   (0x7ffe57e2 - 0x0081bd1b) / 0x0081bd1b  == factor of 251 i.e.
> 7 bits of space available
> 
> openssl build had:
>   (0x7ffdf069 - 0x00be3726) / 0x00be3726  == factor of 171 i.e. 7 bits
> of space available
> 
> hence allocating 5 bits to packing ranges is (I hope) reasonable.

Actually, I realize now that a better upper limit to be concerned with
for ordinary line maps is LINE_MAP_MAX_LOCATION_WITH_COLS i.e.
0x60000000, since there'd be a user-visible impact if we hit that limit.

For the openldap build, we'd have:
  (0x60000000 - 0x0081bd1b) / 0x0081bd1b == factor of 188.
and for openssl:
  (0x60000000 - 0x00be3726) / 0x00be3726 == factor of 128.

So paying 5 bits (for a factor of 32) still seems a reasonable cost, for
these two builds.
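
Another way to look at the same numbers, as a sketch: assume (as a
simplification of the "factor of 32") that packing N range bits scales
the ordinary location values consumed by 2^N, and check whether the
scaled maximum stays below the limit.  The constant below is the
0x60000000 value mentioned above; the names SKETCH_MAX_LOCATION_WITH_COLS
and fits_after_packing are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// The LINE_MAP_MAX_LOCATION_WITH_COLS value quoted above.
static const uint64_t SKETCH_MAX_LOCATION_WITH_COLS = 0x60000000;

// Return true if the highest ordinary location seen in a build,
// scaled up by 2^RANGE_BITS, would still be below the limit.
// Assumes packing simply multiplies the space consumed.
static bool
fits_after_packing (uint32_t max_ordinary, unsigned range_bits)
{
  uint64_t scaled = (uint64_t) max_ordinary << range_bits;
  return scaled < SKETCH_MAX_LOCATION_WITH_COLS;
}
```

Under that assumption both builds fit with 5 bits, and would still fit
with 7, but not with 8.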


> Jeff: I'm working on expression ranges in the C++ FE; is that a
> prerequisite for patches 5-10, or can 5-10 go ahead without the C++
> work?  (assuming the other issues above are acceptable).
> 
> Hope this all makes sense and sounds sane
> Dave
> 
> [1] Together the kit gives us:
> * patch 4: infrastructure for printing underlines in
> diagnostic_show_locus and for multiple ranges
> * patches 5-10: the "source_location" (aka location_t) type becomes a
> caret plus a range; the tokens coming from libcpp gain ranges, so
> everything using libcpp gains at least underlines of tokens; the C
> frontend generates sane ranges for expressions as it parses them, better
> showing the user how the parser "sees" their code.
> 
> Hence we ought to get underlined ranges for many diagnostics in C and
> C++ with this (e.g. input_location gives an underline covering the range
> of the token starting at the caret).  The "caret" should remain
> unchanged from the status quo, so e.g. debugging locations shouldn't be
> affected by the addition of ranges.
> 
> I'm anticipating that we'd need some followup patches to pick better
> ranges for some diagnostics, analogous to the way we convert "warning"
> to "warning_at" for where input_location isn't the best location; I'd
> expect these followup patches to be relatively simple and low-risk.
> 
> 


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend)
  2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
  2015-11-02 19:53         ` David Malcolm
@ 2015-11-02 22:26         ` Jeff Law
  2015-11-06  7:12         ` Dodji Seketeli
  2015-11-13 16:37         ` libcpp/C FE source range patch committed (r230331) David Malcolm
  3 siblings, 0 replies; 83+ messages in thread
From: Jeff Law @ 2015-11-02 22:26 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches, Richard Biener, Dodji Seketeli

On 11/02/2015 12:14 PM, David Malcolm wrote:

>
>
> Jeff: I'm working on expression ranges in the C++ FE; is that a
> prerequisite for patches 5-10, or can 5-10 go ahead without the C++
> work?  (assuming the other issues above are acceptable).
>
> Hope this all makes sense and sounds sane
I think 5-10 can go in now given you're already in discussions with 
Jason on how to wire this into the C++ front-end.

jeff

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 10/10] Compress short ranges into source_location
  2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
  2015-10-30  6:07     ` Jeff Law
@ 2015-11-04 20:42     ` Dodji Seketeli
  1 sibling, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-11-04 20:42 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc-patches

[...]

> diff --git a/libcpp/line-map.c b/libcpp/line-map.c

[...]

> +
> +  /* Any ordinary locations ought to be "pure" at this point: no
> +     compressed ranges.  */
> +  linemap_assert (locus < RESERVED_LOCATION_COUNT
> +		  || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
> +		  || locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)
> +		  || pure_location_p (set, locus));

Just for my own education, why aren't the tests

    locus < RESERVED_LOCATION_COUNT
    || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
    || locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)

part of pure_location_p()?  I mean, would it make sense to say that
a locus that satisfies that condition is pure?

By the way, I like this great piece of code of yours, kudos!

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend)
  2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
  2015-11-02 19:53         ` David Malcolm
  2015-11-02 22:26         ` Jeff Law
@ 2015-11-06  7:12         ` Dodji Seketeli
  2015-11-13 16:37         ` libcpp/C FE source range patch committed (r230331) David Malcolm
  3 siblings, 0 replies; 83+ messages in thread
From: Dodji Seketeli @ 2015-11-06  7:12 UTC (permalink / raw)
  To: David Malcolm; +Cc: Jeff Law, gcc-patches, Richard Biener

> Talking about risks: the reduction of the space for ordinary maps by a
> factor of 32, by taking up 5 bits for the packed range information
> optimization (patch 10):
>  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html
> CCing Dodji: Dodji; is this reasonable?

FWIW, I am definitely in favor of getting this (patch 10/10 of the
series) in if others agree.  I just have some minor questions about that
patch, which I asked in a reply to it.

As for the "reduction of the space for ordinary maps by a factor of 32,
by taking up 5 bits for the packed range information" that you mention,
I think it's a trade off I'd live with.
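
To make the arithmetic concrete, here is a minimal sketch of stashing a
short range in the low 5 bits of a 32-bit location.  It is a simplified
model for illustration only, not the code from patch 10: RANGE_BITS,
pack_short_range and the unpack helpers are hypothetical names.  Each
ordinary location then occupies 1 << 5 = 32 values, hence the factor of 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; a simplified model, not libcpp's implementation.  */
enum { RANGE_BITS = 5 };
#define RANGE_MASK ((UINT32_C (1) << RANGE_BITS) - 1)

/* Number of location values each ordinary location occupies once the
   low RANGE_BITS bits are reserved for packed range information.  */
uint32_t
values_per_location (void)
{
  return UINT32_C (1) << RANGE_BITS;
}

/* Try to store EXTRA_COLS (the number of columns the range extends past
   its start) in the low bits of PURE_LOC.  Return 1 and write *OUT on
   success.  Return 0 if PURE_LOC already has low bits set or the range
   is too long; in this model the caller would then fall back to a
   lookaside table.  */
int
pack_short_range (uint32_t pure_loc, uint32_t extra_cols, uint32_t *out)
{
  assert (out != 0);
  if ((pure_loc & RANGE_MASK) != 0 || extra_cols > RANGE_MASK)
    return 0;
  *out = pure_loc | extra_cols;
  return 1;
}

/* Recover the location with the packed range bits cleared.  */
uint32_t
unpack_pure_location (uint32_t loc)
{
  return loc & ~RANGE_MASK;
}

/* Recover the packed range length.  */
uint32_t
unpack_extra_cols (uint32_t loc)
{
  return loc & RANGE_MASK;
}
```

In this model a range of up to 31 extra columns fits inline; anything
longer has to be stored elsewhere.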

Ultimately, if it turns out that we really run out of space with this,
we should probably explore the impact of just moving to a 64-bit size
for source_location.

Until then, a possible mitigation strategy could be to add an option to
disable the range tracking altogether (even at the preprocessor's lexer
level), to provide an escape path for users running low on resources.  A
bit like what we do with -ftrack-macro-expansion=0.

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 4b] diagnostic-show-locus.c changes: Insertions
  2015-10-30  4:53           ` Jeff Law
  2015-10-30 19:42             ` David Malcolm
@ 2015-11-06 19:59             ` David Malcolm
  1 sibling, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-11-06 19:59 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 6481 bytes --]

On Thu, 2015-10-29 at 22:49 -0600, Jeff Law wrote:
> On 10/28/2015 12:09 PM, David Malcolm wrote:
> > gcc/ChangeLog:
> > 	* diagnostic-show-locus.c (struct point_state): New struct.
> > 	(class colorizer): New class.
> > 	(class layout_point): New class.
> > 	(class layout_range): New class.
> > 	(class layout): New class.
> > 	(colorizer::colorizer): New ctor.
> > 	(colorizer::~colorizer): New dtor.
> > 	(layout::layout): New ctor.
> > 	(layout::print_line): New method.
> > 	(layout::get_state_at_point): New method.
> > 	(layout::get_x_bound_for_row): New method.
> > 	(show_ruler): New function.
> > 	(diagnostic_show_locus): Reimplement in terms of class layout.
> > ---
> > +};
> > +
> > +/* A class to inject colorization codes when printing the diagnostic locus.
> > +
> > +   It has one kind of colorization for each of:
> > +     - normal text
> > +     - range 0 (the "primary location")
> > +     - range 1
> > +     - range 2
> > +
> > +   The class caches the lookup of the color codes for the above.
> > +
> > +   The class also has responsibility for tracking which of the above is
> > +   active, filtering out unnecessary changes.  This allows layout::print_line
> > +   to simply request a colorization code for *every* character it prints
> > +   through this class, and have the filtering be done for it here.  */
> Not asking you to do anything here -- hopefully this isn't a huge burden 
> on the diagnostic performance.  Normally I wouldn't even notice except 
> that we're inserting colorization on every character.  That kind of 
> model can get expensive.  Something to watch out for -- though I doubt 
> we do the massive diagnostic spews we used to, which is probably the only 
> place it'd be noticeable.
> 
> 
> 
> > +
> > +/* A point within a layout_range; similar to an expanded_location,
> > +   but after filtering on file.  */
> > +
> > +class layout_point
> > +{
> > + public:
> > +  layout_point (const expanded_location &exploc)
> > +  : m_line (exploc.line),
> > +    m_column (exploc.column) {}
> > +
> > +  int m_line;
> > +  int m_column;
> > +};
> Is this even deserving of its own class?  If you pulled up 
> m_line/m_column you don't need the class, though I guess you need three 
> of each, one for the start, one for the finish & one for the caret, 
> which in turn bloats the layout_range's constructor.  So I guess this is OK.
> 
> 
> 
> 
> 
> 
> > +/* Given a source line LINE of length LINE_WIDTH, determine the width
> > +   without any trailing whitespace.  */
> > +
> > +static int
> > +get_line_width_without_trailing_whitespace (const char *line, int line_width)
> > +{
> > +  int result = line_width;
> > +  while (result > 0)
> > +    {
> > +      char ch = line[result - 1];
> > +      if (ch == ' ' || ch == '\t')
> > +	result--;
> > +      else
> > +	break;
> > +    }
> > +  gcc_assert (result >= 0);
> > +  gcc_assert (result <= line_width);
> > +  gcc_assert (result == 0 ||
> > +	      (line[result - 1] != ' '
> > +	       && line[result -1] != '\t'));
> > +  return result;
> > +}
> If you use an unsigned for the line width, don't all the asserts become 
> redundant & unnecessary?  I love the sanity checking and I could see how 
> it might be useful if someone were to reimplement this function at a 
> later date.  So maybe keep.
> 
> > +
> > +/* Implementation of class layout.  */
> > +
> > +/* Constructor for class layout.
> > +
> > +   Filter the ranges from the rich_location to those that we can
> > +   sanely print, populating m_layout_ranges.
> > +   Determine the range of lines that we will print.
> > +   Determine m_x_offset, to ensure that the primary caret
> > +   will fit within the max_width provided by the diagnostic_context.  */
> > +
> > +layout::layout (diagnostic_context * context,
> > +		const diagnostic_info *diagnostic)
> [ ... ]
> > +  if (0)
> > +    show_ruler (context, line_width, m_x_offset);
> Debugging code?  If it's if (0) you should probably delete it at this point.
> 
> 
> > +}
> > +
> > +/* Print text describing a line of source code.
> > +   This typically prints two lines:
> > +
> > +   (1) the source code itself, potentially colorized at any ranges, and
> > +   (2) an annotation line containing any carets/underlines
> > +   describing the ranges.  */
> > +
> > +void
> > +layout::print_line (int row)
> Consider breaking this into two functions.  One to print the source line 
> and another to print caret/underlines.
> 
> 
>   +
> > +/* Return true if (ROW/COLUMN) is within a range of the layout.
> > +   If it returns true, OUT_STATE is written to, with the
> > +   range index, and whether we should draw the caret at
> > +   (ROW/COLUMN) (as opposed to an underline).  */
> > +
> > +bool
> > +layout::get_state_at_point (/* Inputs.  */
> > +			    int row, int column,
> > +			    int first_non_ws, int last_non_ws,
> > +			    /* Outputs.  */
> > +			    point_state *out_state)
> > +{
> > +  layout_range *range;
> > +  int i;
> > +  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
> > +    {
> > +      if (0)
> > +	fprintf (stderr,
> > +		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
> > +		 range->m_start.m_line,
> > +		 range->m_start.m_column,
> > +		 range->m_finish.m_line,
> > +		 range->m_finish.m_column,
> > +		 row,
> > +		 column,
> > +		 range->contains_point (row, column) ? "true" : "false");
> More old debugging code that needs to be removed?
> 
> 
> 
> > +
> > +/* For debugging layout issues in diagnostic_show_locus and friends,
> > +   render a ruler giving column numbers (after the 1-column indent).  */
> > +
> > +static void
> > +show_ruler (diagnostic_context *context, int max_width, int x_offset)
> Seems like it ought to be DEBUG_FUNCTION or removed.  I believe its 
> only caller is in if (0) code in layout's ctor.
> 
> 
> Overall this looks good.  Take the actions you deem appropriate WRT the 
> debugging bits, breaking print_line into two functions and the signed vs 
> unsigned stuff in get_line_width_without_trailing_whitespace and it's 
> good for the trunk.

Thanks.  I broke print_line into two methods, and eliminated the
debugging bits; I didn't touch the signedness within 
get_line_width_without_trailing_whitespace (since this signedness is
coming from input.c, as noted before).
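
For anyone wanting to experiment with the signedness question outside
the tree, here is a standalone sketch of the helper quoted above.  It
keeps the signed int interface and uses plain assert in place of
gcc_assert (an assumption made only so that it builds on its own).

```c
#include <assert.h>

/* Standalone copy of the quoted helper: given a source line LINE of
   length LINE_WIDTH, return the width without trailing spaces/tabs.
   assert stands in for gcc_assert so this compiles outside GCC.  */
int
get_line_width_without_trailing_whitespace (const char *line, int line_width)
{
  int result = line_width;
  while (result > 0)
    {
      char ch = line[result - 1];
      if (ch == ' ' || ch == '\t')
        result--;
      else
        break;
    }
  /* With a signed width, the first check also catches a negative
     LINE_WIDTH being passed in.  With an unsigned width it would be
     trivially true; the other two would still state postconditions.  */
  assert (result >= 0);
  assert (result <= line_width);
  assert (result == 0
          || (line[result - 1] != ' ' && line[result - 1] != '\t'));
  return result;
}
```

A line consisting only of whitespace yields 0, and a line with no
trailing whitespace is returned at its full width.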

I've committed the combination of 4a+4b+4c+cleanups to trunk as r229884
(having bootstrapped&regrtested).

For reference, I'm attaching the diffs (relative to 4b) for the cleanups
mentioned above.

Dave

[-- Attachment #2: 0001-Split-layout-print_line-into-two-methods.patch --]
[-- Type: text/x-patch, Size: 5965 bytes --]

From 221d5daa7c88d1776a8ea1dc7bd5b4b1bb460ce5 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Sat, 31 Oct 2015 04:38:15 -0400
Subject: [PATCH] Split layout::print_line into two methods.

gcc/ChangeLog:
	* diagnostic-show-locus.c (class colorizer): Update comment to
	reflect split of layout::print_line into layout::print_source_line
	and layout::print_annotation_line.
	(struct line_bounds): New.
	(class layout): Update comment to reflect split of
	layout::print_line.
	(layout::print_line): Delete, in favor of...
	(layout::print_source_line): ...this new method and...
	(layout::print_annotation_line): ...this new method.
	(diagnostic_show_locus): Update for split of layout::print_line
	into layout::print_source_line and layout::print_annotation_line.
---
 gcc/diagnostic-show-locus.c | 78 ++++++++++++++++++++++++++++++---------------
 1 file changed, 52 insertions(+), 26 deletions(-)

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 6865209..97f2853 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -67,9 +67,10 @@ struct point_state
    The class caches the lookup of the color codes for the above.
 
    The class also has responsibility for tracking which of the above is
-   active, filtering out unnecessary changes.  This allows layout::print_line
-   to simply request a colorization code for *every* character it prints
-   through this class, and have the filtering be done for it here.  */
+   active, filtering out unnecessary changes.  This allows
+   layout::print_source_line and layout::print_annotation_line
+   to simply request a colorization code for *every* character they print,
+   via this class, and have the filtering be done for them here.  */
 
 class colorizer
 {
@@ -128,12 +129,21 @@ class layout_range
   layout_point m_caret;
 };
 
+/* A struct for use by layout::print_source_line for telling
+   layout::print_annotation_line the extents of the source line that
+   it printed, so that underlines can be clipped appropriately.  */
+
+struct line_bounds
+{
+  int m_first_non_ws;
+  int m_last_non_ws;
+};
+
 /* A class to control the overall layout when printing a diagnostic.
 
    The layout is determined within the constructor.
-   It is then printed by repeatedly calling the "print_line" method.
-   Each such call can print two lines: one for the source line itself,
-   and potentially an "annotation" line, containing carets/underlines.
+   It is then printed by repeatedly calling the "print_source_line"
+   and "print_annotation_line" methods.
 
    We assume we have disjoint ranges.  */
 
@@ -146,7 +156,8 @@ class layout
   int get_first_line () const { return m_first_line; }
   int get_last_line () const { return m_last_line; }
 
-  void print_line (int row);
+  bool print_source_line (int row, line_bounds *lbounds_out);
+  void print_annotation_line (int row, const line_bounds lbounds);
 
  private:
   bool
@@ -477,32 +488,30 @@ layout::layout (diagnostic_context * context,
     show_ruler (context, line_width, m_x_offset);
 }
 
-/* Print text describing a line of source code.
-   This typically prints two lines:
+/* Attempt to print line ROW of source code, potentially colorized at any
+   ranges.
+   Return true if the line was printed, populating *LBOUNDS_OUT.
+   Return false if the source line could not be read, leaving *LBOUNDS_OUT
+   untouched.  */
 
-   (1) the source code itself, potentially colorized at any ranges, and
-   (2) an annotation line containing any carets/underlines
-   describing the ranges.  */
-
-void
-layout::print_line (int row)
+bool
+layout::print_source_line (int row, line_bounds *lbounds_out)
 {
   int line_width;
   const char *line = location_get_source_line (m_exploc.file, row,
 					       &line_width);
   if (!line)
-    return;
+    return false;
 
   line += m_x_offset;
 
   m_colorizer.set_normal_text ();
 
-  /* Step 1: print the source code line.  */
+  /* We will stop printing the source line at any trailing
+     whitespace.  */
+  line_width = get_line_width_without_trailing_whitespace (line,
+							   line_width);
 
-  /* We will stop printing at any trailing whitespace.  */
-  line_width
-    = get_line_width_without_trailing_whitespace (line,
-						  line_width);
   pp_space (m_pp);
   int first_non_ws = INT_MAX;
   int last_non_ws = 0;
@@ -547,10 +556,19 @@ layout::print_line (int row)
     }
   pp_newline (m_pp);
 
-  /* Step 2: print a line consisting of the caret/underlines for the
-     given source line.  */
+  lbounds_out->m_first_non_ws = first_non_ws;
+  lbounds_out->m_last_non_ws = last_non_ws;
+  return true;
+}
+
+/* Print a line consisting of the caret/underlines for the given
+   source line.  */
+
+void
+layout::print_annotation_line (int row, const line_bounds lbounds)
+{
   int x_bound = get_x_bound_for_row (row, m_exploc.column,
-				     last_non_ws);
+				     lbounds.m_last_non_ws);
 
   pp_space (m_pp);
   for (int column = 1 + m_x_offset; column < x_bound; column++)
@@ -558,7 +576,8 @@ layout::print_line (int row)
       bool in_range_p;
       point_state state;
       in_range_p = get_state_at_point (row, column,
-				       first_non_ws, last_non_ws,
+				       lbounds.m_first_non_ws,
+				       lbounds.m_last_non_ws,
 				       &state);
       if (in_range_p)
 	{
@@ -734,7 +753,14 @@ diagnostic_show_locus (diagnostic_context * context,
     for (int row = layout.get_first_line ();
 	 row <= last_line;
 	 row++)
-      layout.print_line (row);
+      {
+	/* Print the source line, followed by an annotation line
+	   consisting of any caret/underlines.  If the source line can't
+	   be read, print nothing.  */
+	line_bounds lbounds;
+	if (layout.print_source_line (row, &lbounds))
+	  layout.print_annotation_line (row, lbounds);
+      }
 
     /* The closing scope here leads to the dtor for layout and thus
        colorizer being called here, which affects the precise
-- 
1.8.5.3


[-- Attachment #3: 0002-Fix-nits-in-diagnostic-show-locus.c.patch --]
[-- Type: text/x-patch, Size: 3119 bytes --]

From 4b1113880023d404ebb52d7efab934b913ff3284 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Mon, 2 Nov 2015 15:35:34 -0500
Subject: [PATCH] Fix nits in diagnostic-show-locus.c

gcc/ChangeLog:
	* diagnostic-show-locus.c (layout::layout): Eliminate call to
	show_ruler.
	(layout::get_state_at_point): Eliminate debug code.
	(show_ruler): Delete.
---
 gcc/diagnostic-show-locus.c | 51 ---------------------------------------------
 1 file changed, 51 deletions(-)

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 97f2853..22203cd 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -36,9 +36,6 @@ along with GCC; see the file COPYING3.  If not see
 # include <sys/ioctl.h>
 #endif
 
-static void
-show_ruler (diagnostic_context *context, int max_width, int x_offset);
-
 /* Classes for rendering source code and diagnostics, within an
    anonymous namespace.
    The work is done by "class layout", which embeds and uses
@@ -483,9 +480,6 @@ layout::layout (diagnostic_context * context,
 	m_x_offset = column - right_margin;
       gcc_assert (m_x_offset >= 0);
     }
-
-  if (0)
-    show_ruler (context, line_width, m_x_offset);
 }
 
 /* Attempt to print line ROW of source code, potentially colorized at any
@@ -615,17 +609,6 @@ layout::get_state_at_point (/* Inputs.  */
   int i;
   FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
     {
-      if (0)
-	fprintf (stderr,
-		 "range ( (%i, %i), (%i, %i))->contains_point (%i, %i): %s\n",
-		 range->m_start.m_line,
-		 range->m_start.m_column,
-		 range->m_finish.m_line,
-		 range->m_finish.m_column,
-		 row,
-		 column,
-		 range->contains_point (row, column) ? "true" : "false");
-
       if (range->contains_point (row, column))
 	{
 	  out_state->range_idx = i;
@@ -694,40 +677,6 @@ layout::get_x_bound_for_row (int row, int caret_column,
 
 } /* End of anonymous namespace.  */
 
-/* For debugging layout issues in diagnostic_show_locus and friends,
-   render a ruler giving column numbers (after the 1-column indent).  */
-
-static void
-show_ruler (diagnostic_context *context, int max_width, int x_offset)
-{
-  /* Hundreds.  */
-  if (max_width > 99)
-    {
-      pp_space (context->printer);
-      for (int column = 1 + x_offset; column < max_width; column++)
-	if (0 == column % 10)
-	  pp_character (context->printer, '0' + (column / 100) % 10);
-	else
-	  pp_space (context->printer);
-      pp_newline (context->printer);
-    }
-
-  /* Tens.  */
-  pp_space (context->printer);
-  for (int column = 1 + x_offset; column < max_width; column++)
-    if (0 == column % 10)
-      pp_character (context->printer, '0' + (column / 10) % 10);
-    else
-      pp_space (context->printer);
-  pp_newline (context->printer);
-
-  /* Units.  */
-  pp_space (context->printer);
-  for (int column = 1 + x_offset; column < max_width; column++)
-    pp_character (context->printer, '0' + (column % 10));
-  pp_newline (context->printer);
-}
-
 /* Print the physical source code corresponding to the location of
    this diagnostic, with additional annotations.  */
 
-- 
1.8.5.3


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2))
  2015-10-14  9:00           ` Richard Biener
  2015-10-14 12:49             ` Michael Matz
  2015-10-16 15:57             ` David Malcolm
@ 2015-11-13 16:02             ` David Malcolm
  2 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-11-13 16:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, GCC Patches

On Wed, 2015-10-14 at 11:00 +0200, Richard Biener wrote:
> On Tue, Oct 13, 2015 at 5:32 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> > On Thu, 2015-09-24 at 10:15 +0200, Richard Biener wrote:
> >> On Thu, Sep 24, 2015 at 2:25 AM, David Malcolm <dmalcolm@redhat.com> wrote:
> >> > On Wed, 2015-09-23 at 15:36 +0200, Richard Biener wrote:
> >> >> On Wed, Sep 23, 2015 at 3:19 PM, Michael Matz <matz@suse.de> wrote:
> >> >> > Hi,
> >> >> >
> >> >> > On Tue, 22 Sep 2015, David Malcolm wrote:
> >> >> >
> >> >> >> The drawback is that it could bloat the ad-hoc table.  Can the ad-hoc
> >> >> >> table ever get smaller, or does it only ever get inserted into?
> >> >> >
> >> >> > It only ever grows.
> >> >> >
> >> >> >> An idea I had is that we could stash short ranges directly into the 32
> >> >> >> bits of location_t, by offsetting the per-column-bits somewhat.
> >> >> >
> >> >> > It's certainly worth an experiment: let's say you restrict yourself to
> >> >> > tokens less than 8 characters, you need an additional 3 bits (using one
> >> >> > value, e.g. zero, as the escape value).  That leaves 20 bits for the line
> >> >> > numbers (for the normal 8 bit columns), which might be enough for most
> >> >> > single-file compilations.  For LTO compilation this often won't be enough.
> >> >> >
> >> >> >> My plan is to investigate the impact these patches have on the time and
> >> >> >> memory consumption of the compiler,
> >> >> >
> >> >> > When you do so, make sure you're also measuring an LTO compilation with
> >> >> > debug info of something big (firefox).  I know that we already had issues
> >> >> > with the size of the linemap data in the past for these cases (probably
> >> >> > when we added columns).
> >> >>
> >> >> The issue we have with LTO is that the linemap gets populated in quite
> >> >> random order and thus we repeatedly switch files (we've mitigated this
> >> >> somewhat for GCC 5).  We also considered dropping column info
> >> >> (and would drop range info) as diagnostics are from optimizers only
> >> >> with LTO and we keep locations merely for debug info.
> >> >
> >> > Thanks.  Presumably the mitigation you're referring to is the
> >> > lto_location_cache class in lto-streamer-in.c?
> >> >
> >> > Am I right in thinking that, right now, the LTO code doesn't support
> >> > ad-hoc locations? (presumably the block pointers only need to exist
> >> > during optimization, which happens after the serialization)
> >>
> >> LTO code does support ad-hoc locations but they are "restored" only
> >> when reading function bodies and stmts (by means of COMBINE_LOCATION_DATA).
> >>
> >> > The obvious simplification would be, as you suggest, to not bother
> >> > storing range information with LTO, falling back to just the existing
> >> > representation.  Then there's no need to extend LTO to serialize ad-hoc
> >> > data; simply store the underlying locus into the bit stream.  I think
> >> > that this happens already: lto-streamer-out.c calls expand_location and
> >> > stores the result, so presumably any ad-hoc location_t values made by
> >> > the v2 patches would have dropped their range data there when I ran the
> >> > test suite.
> >>
> >> Yep.  We only preserve BLOCKs, so if you don't add extra code to
> >> preserve ranges they'll be "dropped".
> >>
> >> > If it's acceptable to not bother with ranges for LTO, one way to do the
> >> > "stashing short ranges into the location_t" idea might be for the
> >> > bits-per-range of location_t values to be a property of the line_table
> >> > (or possibly the line map), set up when the struct line_maps is created.
> >> > For non-LTO it could be some tuned value (maybe from a param?); for LTO
> >> > it could be zero, so that we have as many bits as before for line/column
> >> > data.
> >>
> >> That could be a possibility (likewise for column info?)
> >>
> >> Richard.
> >>
> >> > Hope this sounds sane
> >> > Dave
> >
> > I did some crude benchmarking of the patchkit, using these scripts:
> >   https://github.com/davidmalcolm/gcc-benchmarking
> > (specifically, bb0222b455df8cefb53bfc1246eb0a8038256f30),
> > using the "big-code.c" and "kdecore.cc" files Michael posted as:
> >   https://gcc.gnu.org/ml/gcc-patches/2013-09/msg00062.html
> > and "influence.i", a preprocessed version of SPEC2006's 445.gobmk
> > engine/influence.c (as an example of a moderate-sized pure C source
> > file).
> >
> > This doesn't yet cover very large autogenerated C files, and the .cc
> > file is only being measured to see the effect on the ad-hoc table (and
> > tokenization).
> >
> > "control" was r227977.
> > "experiment" was the same revision with the v2 patchkit applied.
> >
> > Recall that this patchkit captures ranges for tokens as an extra field
> > within tokens within libcpp and the C FE, and adds ranges to the ad-hoc
> > location lookaside, storing them for all tree nodes within the C FE that
> > have a location_t, and passing them around within c_expr for all C
> > expressions (including those that don't have a location_t).
> >
> > Both control and experiment were built with
> >   --enable-checking=release \
> >   --disable-bootstrap \
> >   --disable-multilib \
> >   --enable-languages=c,ada,c++,fortran,go,java,lto,objc,obj-c++
> >
> > The script measures:
> >
> > (a) wallclock time for "xgcc -S" so it's measuring the driver, parsing,
> > optimization, etc, rather than attempting to directly measure parsing.
> > This is without -ftime-report, since Mikhail indicated it's sufficiently
> > expensive to skew timings in this post:
> >   https://gcc.gnu.org/ml/gcc/2015-07/msg00165.html
> >
> > (b) memory usage: by performing a separate build with -ftime-report,
> > extracting the "TOTAL" ggc value (actually 3 builds, but it's the same
> > each time).
> >
> > Is this a fair way to measure things?  It could be argued that by
> > measuring totals I'm hiding the extra parsing cost in the overall cost.
> 
> Overall cost is what matters.   Time to build the libstdc++ PCHs
> would be interesting as well ;)  (and their size)

I measured the time taken for libstdc++ PCH generation for the latest
version of the kit (using the bit-packing idea for short ranges), vs a
control build (r230270).

This is without the C++ FE changes that I've posted elsewhere: just
tracking of token ranges (via bit-packing, falling back to an expanded
ad-hoc lookaside table; also C FE expressions, but that shouldn't affect
cc1plus):

Wallclock time:
{'control': [15.664, 15.669, 15.75, 15.671, 16.406, 15.692, 15.642,
16.325, 15.702], 'experiment': [15.852, 18.092, 15.876, 15.857, 15.883,
15.873, 17.18, 15.887, 16.646]}
Min: 15.642000 -> 15.852000: 1.01x slower
Avg: 15.835667 -> 16.349556: 1.03x slower
Stddev: 0.30258 -> 0.80520: 2.6611x larger
Timeline: http://preview.tinyurl.com/Wallclock-time-for-pch-rebuild
aka:
http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,14.642,19.092&chco=FF0000,0000FF&chdl=control|experiment&chds=14.642,19.092&chd=t:15.66,15.67,15.75,15.67,16.41,15.69,15.64,16.32,15.7|15.85,18.09,15.88,15.86,15.88,15.87,17.18,15.89,16.65&chxl=0:|1|2|3|4|5|6|7|8|9|2:||Iteration|3:||Time+(secs)&chtt=Wallclock+time+for+pch+rebuild

User time:
{'control': [14.477, 14.393, 14.445, 14.458, 14.487, 14.432, 14.394,
14.399, 14.454], 'experiment': [14.628, 14.655, 14.665, 14.683, 14.627,
14.658, 14.575, 14.637, 14.746]}
Min: 14.393000 -> 14.575000: 1.01x slower
Avg: 14.437667 -> 14.652667: 1.01x slower
Stddev: 0.03561 -> 0.04659: 1.3083x larger
Timeline:
http://preview.tinyurl.com/user-time-for-pch-rebuild
aka:
http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,13.393,15.746&chco=FF0000,0000FF&chdl=control|experiment&chds=13.393,15.746&chd=t:14.48,14.39,14.45,14.46,14.49,14.43,14.39,14.4,14.45|14.63,14.65,14.66,14.68,14.63,14.66,14.57,14.64,14.75&chxl=0:|1|2|3|4|5|6|7|8|9|2:||Iteration|3:||Time+(secs)&chtt=user+time+for+pch+rebuild

So about 1% slower.  Rerunning under perf, and looking at "perf diff",
the slowdown appears to be due to the extra memory taken by the
lookaside table.
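
As a cross-check of the summary lines, the min and average figures can
be recomputed from the raw samples.  A minimal sketch using the
user-time samples listed above; the function names are illustrative and
are not taken from the benchmarking scripts.

```c
#include <assert.h>
#include <stddef.h>

/* User-time samples (seconds), copied from the lists above.  */
const double control_user[] = {
  14.477, 14.393, 14.445, 14.458, 14.487, 14.432, 14.394, 14.399, 14.454
};
const double experiment_user[] = {
  14.628, 14.655, 14.665, 14.683, 14.627, 14.658, 14.575, 14.637, 14.746
};

/* Arithmetic mean of the N samples in V.  */
double
mean_of (const double *v, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += v[i];
  return sum / (double) n;
}

/* Smallest of the N samples in V; N must be nonzero.  */
double
min_of (const double *v, size_t n)
{
  assert (n > 0);
  double best = v[0];
  for (size_t i = 1; i < n; i++)
    if (v[i] < best)
      best = v[i];
  return best;
}

/* Ratio of experiment to control; above 1.0 means the experiment
   is slower.  */
double
slowdown (double control, double experiment)
{
  return experiment / control;
}
```

With these samples the ratio of the means comes out at roughly 1.015,
consistent with the "about 1% slower" summary.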

The PCH files themselves aren't significantly different in size:
                               control   experiment  ratio
extc++.h.gch/O2g.gch         113781104    113846640  1.000576
stdc++.h.gch/O2g.gch          76789968     76826832  1.000480
stdc++.h.gch/O2ggnu++0x.gch   74647696     74684560  1.000494
stdtr1c++.h.gch/O2g.gch       83996240     84024912  1.000341

so much less than 1%.


> One could have argued you should have used -fsyntax-only.
> 
> > Full logs can be seen at:
> >   https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2.txt
> > (v2 of the patchkit)
> >
> > I also investigated a version of the patchkit with the token tracking
> > rewritten to build ad-hoc ranges for *every token*, without attempting
> > any kind of optimization (e.g. for short ranges).
> > A log of this can be seen at:
> > https://dmalcolm.fedorapeople.org/gcc/2015-09-25/bmark-v2-plus-adhoc-ranges-for-tokens.txt
> > (v2 of the patchkit, with token tracking rewritten to build ad-hoc
> > ranges for *every token*).
> > The nice thing about this approach is that lots of token-related
> > diagnostics gain underlining of the relevant token "for free" simply
> > from the location_t, without having to individually patch them.  Without
> > any optimization, the memory consumed by this approach is clearly
> > larger.
> >
> > A summary comparing the two logs:
> >
> > Minimal wallclock time (s) over 10 iterations
> >                           Control -> v2                                 Control -> v2+adhocloc+at+every+token
> > kdecore.cc -g -O0          10.306548 -> 10.268712: 1.00x faster          10.247160 -> 10.444528: 1.02x slower
> > kdecore.cc -g -O1          27.026285 -> 27.220654: 1.01x slower          27.280681 -> 27.622676: 1.01x slower
> > kdecore.cc -g -O2          43.791668 -> 44.020270: 1.01x slower          43.904934 -> 44.248477: 1.01x slower
> > kdecore.cc -g -O3          47.471836 -> 47.651101: 1.00x slower          47.645985 -> 48.005495: 1.01x slower
> > kdecore.cc -g -Os          31.678652 -> 31.802829: 1.00x slower          31.741484 -> 32.033478: 1.01x slower
> >    empty.c -g -O0            0.012662 -> 0.011932: 1.06x faster            0.012888 -> 0.013143: 1.02x slower
> >    empty.c -g -O1            0.012685 -> 0.012558: 1.01x faster            0.013164 -> 0.012790: 1.03x faster
> >    empty.c -g -O2            0.012694 -> 0.012846: 1.01x slower            0.012912 -> 0.013175: 1.02x slower
> >    empty.c -g -O3            0.012654 -> 0.012699: 1.00x slower            0.012596 -> 0.012792: 1.02x slower
> >    empty.c -g -Os            0.013057 -> 0.012766: 1.02x faster            0.012691 -> 0.012885: 1.02x slower
> > big-code.c -g -O0            3.292680 -> 3.325748: 1.01x slower            3.292948 -> 3.303049: 1.00x slower
> > big-code.c -g -O1          15.701810 -> 15.765014: 1.00x slower          15.714116 -> 15.759254: 1.00x slower
> > big-code.c -g -O2          22.575615 -> 22.620187: 1.00x slower          22.567406 -> 22.605435: 1.00x slower
> > big-code.c -g -O3          52.423586 -> 52.590075: 1.00x slower          52.421460 -> 52.703835: 1.01x slower
> > big-code.c -g -Os          21.153980 -> 21.253598: 1.00x slower          21.146266 -> 21.260138: 1.01x slower
> > influence.i -g -O0            0.148229 -> 0.149518: 1.01x slower            0.148672 -> 0.156262: 1.05x slower
> > influence.i -g -O1            0.387397 -> 0.389930: 1.01x slower            0.387734 -> 0.396655: 1.02x slower
> > influence.i -g -O2            0.587514 -> 0.589604: 1.00x slower            0.588064 -> 0.596510: 1.01x slower
> > influence.i -g -O3            1.273561 -> 1.280514: 1.01x slower            1.274599 -> 1.287596: 1.01x slower
> > influence.i -g -Os            0.526045 -> 0.527579: 1.00x slower            0.526827 -> 0.535635: 1.02x slower
> >
> >
> > Maximal ggc memory (kb)
> >                      Control -> v2                                 Control -> v2+adhocloc+at+every+token
> > kdecore.cc -g -O0      650337.000 -> 654435.000: 1.0063x larger      650337.000 -> 711775.000: 1.0945x larger
> > kdecore.cc -g -O1      931966.000 -> 940144.000: 1.0088x larger      931951.000 -> 989384.000: 1.0616x larger
> > kdecore.cc -g -O2    1125325.000 -> 1133514.000: 1.0073x larger    1125318.000 -> 1182384.000: 1.0507x larger
> > kdecore.cc -g -O3    1221408.000 -> 1229596.000: 1.0067x larger    1221410.000 -> 1278658.000: 1.0469x larger
> > kdecore.cc -g -Os      867140.000 -> 871235.000: 1.0047x larger      867141.000 -> 928700.000: 1.0710x larger
> >    empty.c -g -O0          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O1          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O2          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -O3          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> >    empty.c -g -Os          1189.000 -> 1192.000: 1.0025x larger          1189.000 -> 1193.000: 1.0034x larger
> > big-code.c -g -O0      166584.000 -> 172731.000: 1.0369x larger      166584.000 -> 172726.000: 1.0369x larger
> > big-code.c -g -O1      279793.000 -> 285940.000: 1.0220x larger      279793.000 -> 285935.000: 1.0220x larger
> > big-code.c -g -O2      400058.000 -> 406194.000: 1.0153x larger      400058.000 -> 406189.000: 1.0153x larger
> > big-code.c -g -O3      903648.000 -> 909750.000: 1.0068x larger      903906.000 -> 910001.000: 1.0067x larger
> > big-code.c -g -Os      357060.000 -> 363010.000: 1.0167x larger      357060.000 -> 363005.000: 1.0166x larger
> > influence.i -g -O0          9273.000 -> 9719.000: 1.0481x larger         9273.000 -> 13303.000: 1.4346x larger
> > influence.i -g -O1        12968.000 -> 13414.000: 1.0344x larger        12968.000 -> 16998.000: 1.3108x larger
> > influence.i -g -O2        16386.000 -> 16768.000: 1.0233x larger        16386.000 -> 20352.000: 1.2420x larger
> > influence.i -g -O3        35508.000 -> 35763.000: 1.0072x larger        35508.000 -> 39346.000: 1.1081x larger
> > influence.i -g -Os        14287.000 -> 14669.000: 1.0267x larger        14287.000 -> 18253.000: 1.2776x larger
> >
> > Thoughts?
> 
> The compile-time and memory-usage impact for the adhocloc at every
> token patchkit is quite big.  Remember
> that gaining 1% in compile-time is hard and 20-40% memory increase for
> influence.i looks too much.
> 
> I also wonder why you see differences in memory usage change for
> different -O levels.  I think we should
> have a pretty "static" line table after parsing?  Thus rather than
> percentages I'd like to see absolute changes
> (which I'd expect to be the same for all -O levels).
> 
> Richard.
> 
> > Dave
> >
> >
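
Richard's request for absolute rather than relative changes can be
answered directly from the table above.  A minimal sketch in C, using
the influence.i rows of the "v2+adhocloc+at+every+token" column; the
struct and function names are made up for illustration and are not part
of the patch kit:

```c
#include <assert.h>
#include <stddef.h>

/* One row of the "Maximal ggc memory (kb)" table (hypothetical type).  */
struct ggc_row
{
  const char *flags;   /* optimization flags used for the run */
  long control_kb;     /* peak ggc memory without the patch */
  long patched_kb;     /* peak ggc memory with the patch */
};

/* influence.i, Control -> v2+adhocloc+at+every+token, copied from the
   table in this message.  */
const struct ggc_row influence_adhoc_rows[] = {
  { "-g -O0",  9273, 13303 },
  { "-g -O1", 12968, 16998 },
  { "-g -O2", 16386, 20352 },
  { "-g -O3", 35508, 39346 },
  { "-g -Os", 14287, 18253 },
};

const size_t influence_adhoc_row_count
  = sizeof influence_adhoc_rows / sizeof influence_adhoc_rows[0];

/* Absolute growth in kb for one row.  */
long
ggc_delta_kb (const struct ggc_row *row)
{
  assert (row != NULL);
  return row->patched_kb - row->control_kb;
}
```

For these rows the deltas are 4030, 4030, 3966, 3838 and 3966 kb, so the
absolute growth is close to constant across -O levels even though the
relative figures range from 1.11x to 1.43x.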


^ permalink raw reply	[flat|nested] 83+ messages in thread

* libcpp/C FE source range patch committed (r230331).
  2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
                           ` (2 preceding siblings ...)
  2015-11-06  7:12         ` Dodji Seketeli
@ 2015-11-13 16:37         ` David Malcolm
  3 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2015-11-13 16:37 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Richard Biener, Dodji Seketeli

[-- Attachment #1: Type: text/plain, Size: 9857 bytes --]

On Mon, 2015-11-02 at 14:14 -0500, David Malcolm wrote:
> On Fri, 2015-10-30 at 00:15 -0600, Jeff Law wrote:
> > On 10/23/2015 02:41 PM, David Malcolm wrote:
> > > As in the previous version of this patch
> > >   "Implement tree expression tracking in C FE (v2)"
> > > the patch now captures ranges for all C expressions during parsing within
> > > a new field of c_expr, and for all tree nodes with a location_t, it stores
> > > them in ad-hoc locations for later use.
> > >
> > > Hence compound expressions get ranges; see:
> > >    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html
> > >
> > > and for this example:
> > >
> > >    int test (int foo)
> > >    {
> > >      return foo * 100;
> > >             ^^^   ^^^
> > >    }
> > >
> > > we have access to the ranges of "foo" and "100" during C parsing via
> > > the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
> > > INTEGER_CST (the former's location is at the top of the
> > > function, and the latter has no location).
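
The way a compound expression gets its range from its operands (start
of the left operand through to the end of the right operand, as the
ChangeLog below describes for parser_build_binary_op) can be sketched
in isolation.  This is a simplified model, not the patch's code: real
ranges hold location_t values rather than column numbers, and the type
and function names here are hypothetical:

```c
#include <assert.h>

/* Simplified stand-in for a source range: a pair of column numbers on
   one line (assumption; the patch stores location_t values).  */
struct col_range
{
  unsigned int start;
  unsigned int finish;
};

/* Range of a single token occupying columns START..FINISH inclusive.  */
struct col_range
make_token_range (unsigned int start, unsigned int finish)
{
  struct col_range r;
  assert (start <= finish);
  r.start = start;
  r.finish = finish;
  return r;
}

/* Range of "LHS op RHS": from the start of LHS to the end of RHS.  */
struct col_range
range_for_binary_op (struct col_range lhs, struct col_range rhs)
{
  struct col_range r;
  r.start = lhs.start;
  r.finish = rhs.finish;
  return r;
}
```

With made-up columns, if "foo" spans 10-12 and "100" spans 16-18, the
range for "foo * 100" comes out as 10-18, covering the operator as well
as both operands.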
> > >
> > > gcc/ChangeLog:
> > > 	* Makefile.in (OBJS): Add gcc-rich-location.o.
> > > 	* gcc-rich-location.c: New file.
> > > 	* gcc-rich-location.h: New file.
> > > 	* print-tree.c (print_node): Print any source range information.
> > > 	* tree.c (set_source_range): New functions.
> > > 	* tree.h (CAN_HAVE_RANGE_P): New.
> > > 	(EXPR_LOCATION_RANGE): New.
> > > 	(EXPR_HAS_RANGE): New.
> > > 	(get_expr_source_range): New inline function.
> > > 	(DECL_LOCATION_RANGE): New.
> > > 	(set_source_range): New decls.
> > > 	(get_decl_source_range): New inline function.
> > >
> > > gcc/c-family/ChangeLog:
> > > 	* c-common.c (c_fully_fold_internal): Capture existing source_range,
> > > 	and store it on the result.
> > >
> > > gcc/c/ChangeLog:
> > > 	* c-parser.c (set_c_expr_source_range): New functions.
> > > 	(c_token::get_range): New method.
> > > 	(c_token::get_finish): New method.
> > > 	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
> > > 	based on the range from the start of the LHS to the end of the
> > > 	RHS.
> > > 	(c_parser_conditional_expression): Likewise, based on the range
> > > 	from the start of the cond.value to the end of exp2.value.
> > > 	(c_parser_binary_expression): Call set_c_expr_source_range on
> > > 	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
> > > 	(c_parser_cast_expression): Call set_c_expr_source_range on ret
> > > 	based on the cast_loc through to the end of the expr.
> > > 	(c_parser_unary_expression): Likewise, based on the
> > > 	op_loc through to the end of op.
> > > 	(c_parser_sizeof_expression) Likewise, based on the start of the
> > > 	sizeof token through to either the closing paren or the end of
> > > 	expr.
> > > 	(c_parser_postfix_expression): Likewise, using the token range,
> > > 	or from the open paren through to the close paren for
> > > 	parenthesized expressions.
> > > 	(c_parser_postfix_expression_after_primary): Likewise, for
> > > 	various kinds of expression.
> > > 	* c-tree.h (struct c_expr): Add field "src_range".
> > > 	(c_expr::get_start): New method.
> > > 	(c_expr::get_finish): New method.
> > > 	(set_c_expr_source_range): New decls.
> > > 	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
> > > 	on ret for prefix unary ops.
> > > 	(parser_build_binary_op): Likewise, running from the start of
> > > 	arg1.value through to the end of arg2.value.
> > >
> > > gcc/testsuite/ChangeLog:
> > > 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
> > > 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> > > 	New file.
> > > 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> > > 	diagnostic_plugin_test_tree_expression_range.c and
> > > 	diagnostic-test-expressions-1.c.
> > 
> > >   /* Initialization routine for this file.  */
> > >
> > > @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
> > >     ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
> > >   				 code, exp_location, rhs.value,
> > >   				 rhs.original_type);
> > > +  set_c_expr_source_range (&ret,
> > > +			   lhs.get_start (),
> > > +			   rhs.get_finish ());
> > One line if it fits.
> > 
> > 
> > > @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
> > >   			   ? t1
> > >   			   : NULL);
> > >       }
> > > +  set_c_expr_source_range (&ret,
> > > +			   start,
> > > +			   exp2.get_finish ());
> > Here too.
> > 
> > > @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
> > >   	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
> > >         }
> > >         ret.value = c_cast_expr (cast_loc, type_name, expr.value);
> > > +      if (ret.value && expr.value)
> > > +	set_c_expr_source_range (&ret,
> > > +				 cast_loc,
> > > +				 expr.get_finish ());
> > And here?
> > 
> > With the nits fixed, this is OK.
> > 
> > I think that covers this iteration of the rich location work and that 
> > you'll continue working with Jason on extending this into the C++ front-end.
> 
> Here's a summary of the current status of this work [1]:
> 
> Patches 1-4 of the kit: these Jeff has approved, with some pre-approved
> nit fixes in 4.  I see these as relatively low risk, and plan to commit
> these today/tomorrow.
> 
> Patches 5-10: Jeff approved these also (again with some nits). These
> feel higher-risk to me, owing to the potential for performance
> regressions; I haven't yet answered at least one of Richi's performance
> questions (impact on time taken to generate the C++ PCH file); the last
> performance testing I did can be seen here:
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html
> where the right-most column is this kit.
> 
> CCing Richi to keep him in the loop for the above.  Richi, is there any
> other specific testing you'd want me to do for this?
> Or is it OK to commit, and to see what impact it has on your daily
> performance testing?  (and to revert if the impact is unacceptable).
> 
> Talking about risks: the reduction of the space for ordinary maps by a
> factor of 32, by taking up 5 bits for the packed range information
> optimization (patch 10):
>  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html
> CCing Dodji: Dodji; is this reasonable?
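
The "factor of 32" follows from the 5 bits: 1 << 5 == 32 values are
spent on range information for every position.  A rough sketch of the
idea, assuming a simplified model in which the low bits of a value hold
a short range length and the remaining bits hold the position; the
names are hypothetical and the real encoding in libcpp/line-map.c is
more involved (ranges that do not fit go to the ad-hoc lookaside
table):

```c
#include <assert.h>
#include <stdint.h>

/* Number of low bits reserved for a short range (the "5 bits" above).  */
enum { RANGE_BITS = 5 };

/* Nonzero if LEN is small enough to be stored in the low bits.  */
int
fits_compactly (uint32_t len)
{
  return len < (1u << RANGE_BITS);
}

/* Pack POSITION and a short range length LEN into one value.  */
uint32_t
pack_short_range (uint32_t position, uint32_t len)
{
  assert (fits_compactly (len));
  assert (position <= (UINT32_MAX >> RANGE_BITS));
  return (position << RANGE_BITS) | len;
}

/* Recover the position by shifting the range bits away.  */
uint32_t
packed_position (uint32_t packed)
{
  return packed >> RANGE_BITS;
}

/* Recover the range length from the low bits.  */
uint32_t
packed_length (uint32_t packed)
{
  return packed & ((1u << RANGE_BITS) - 1);
}
```

In this model each position consumes 32 consecutive values, which is
where the reduction in available space comes from.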
> 
> I did some experiments back in July on this; instrumented builds of
> openldap and openssl to see how much space we have in location_t:
> https://dmalcolm.fedorapeople.org/gcc/2015-07-23/openldap.csv
> https://dmalcolm.fedorapeople.org/gcc/2015-07-24/openldap.csv
> 
> (these are space-separated:
>            SRPM name
>            sourcefile
>            maximal ordinary location_t
>            minimal macro location_t)
> 
> openldap build #files: 906
> maximal ordinary location_t was:
> sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/bconfig.c'
>           max_ordinary_location=0x0081bd1b
>           (and min_macro_location=0x7ffe5903)
> minimal macro location_t was:
> sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/aclparse.c'
>           min_macro_location=0x7ffe57e2
>           (with max_ordinary_location=0x00719775)
> 
> openssl-1.0.1k-8.fc22.src.rpm.x86_64:
>       #files: 1495
> max_ordinary_location=0x00be3726
>  (openssl-1.0.1k/apps/s_client.c)
>  with min_macro_location=0x7ffe7b6b
> 
> min_macro_location=0x7ffdf069 
>  (openssl-1.0.1k/apps/speed.c)
>  with max_ordinary_location=0x00a1abdf
> 
> In all of the above cases, we had enough room to do the bit-packing
> optimization, but this is just two projects (albeit real-world C code).
> 
> Comparing the gap between maximal ordinary map location and minimal
> macro map location, and seeing how much we can expand the ordinary map
> locations, the openldap build had:
>   (0x7ffe57e2 - 0x0081bd1b) / 0x0081bd1b  == factor of 251 i.e.
> 7 bits of space available
> 
> openssl build had:
>   (0x7ffdf069 - 0x00be3726) / 0x00be3726  == factor of 171 i.e. 7 bits
> of space available
> 
> hence allocating 5 bits to packing ranges is (I hope) reasonable.
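
The headroom arithmetic above can be reproduced with a few lines of C.
This is only a sketch of the calculation as described in the text
(integer division of the gap by the maximal ordinary location, then the
number of whole bits in that factor); the function names are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* How many times the ordinary-location space could grow before it
   reaches the lowest macro location.  */
uint32_t
headroom_factor (uint32_t max_ordinary, uint32_t min_macro)
{
  assert (max_ordinary > 0);
  assert (min_macro > max_ordinary);
  return (min_macro - max_ordinary) / max_ordinary;
}

/* Number of whole bits in X, i.e. floor (log2 (X)).  */
unsigned int
floor_log2_u32 (uint32_t x)
{
  unsigned int bits = 0;
  assert (x > 0);
  while (x >>= 1)
    bits++;
  return bits;
}
```

Feeding in the openldap figures (0x0081bd1b, 0x7ffe57e2) gives a factor
of 251, and the openssl figures (0x00be3726, 0x7ffdf069) give 171; both
have 7 whole bits, matching the numbers quoted.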
> 
> 
> Jeff: I'm working on expression ranges in the C++ FE; is that a
> prerequisite for patches 5-10, or can 5-10 go ahead without the C++
> work?  (assuming the other issues above are acceptable).
> 
> Hope this all makes sense and sounds sane
> Dave
> 
> [1] Together the kit gives us:
> * patch 4: infrastructure for printing underlines in
> diagnostic_show_locus and for multiple ranges
> * patches 5-10: the "source_location" (aka location_t) type becomes a
> caret plus a range; the tokens coming from libcpp gain ranges, so
> everything using libcpp gains at least underlines of tokens; the C
> frontend generates sane ranges for expressions as it parses them, better
> showing the user how the parser "sees" their code.
> 
> Hence we ought to get underlined ranges for many diagnostics in C and
> C++ with this (e.g. input_location gives an underline covering the range
> of the token starting at the caret).  The "caret" should remain
> unchanged from the status quo, so e.g. debugging locations shouldn't be
> affected by the addition of ranges.
> 
> I'm anticipating that we'd need some followup patches to pick better
> ranges for some diagnostics, analogous to the way we convert "warning"
> to "warning_at" for where input_location isn't the best location; I'd
> expect these followup patches to be relatively simple and low-risk.

I've addressed the nits raised by Jeff in review, combined patches 5-10
plus "libcpp: add examples to source_location description" into one
patch, verified bootstrap&regrtest (x86_64-pc-linux-gnu) and committed
it to trunk as r230331.

Patch attached for reference.

(I have a separate patch for the C++ FE which bootstraps, but it shows 4
testsuite regressions; hope to fix that today)


[-- Attachment #2: source-range-tracking-in-libcpp-and-C-FE-with-bit-pa.patch --]
[-- Type: text/x-patch, Size: 134371 bytes --]

From 89d6a22e3c7d9e670eec48f614368b986a2d0132 Mon Sep 17 00:00:00 2001
From: David Malcolm <dmalcolm@redhat.com>
Date: Fri, 23 Oct 2015 12:52:13 -0400
Subject: [PATCH] Source range tracking in libcpp and C FE, with bit-packing

This patch combines:
  [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized)
  [PATCH 06/10] Track expression ranges in C frontend
  [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2)
  [PATCH 08/10] Wire things up so that libcpp users get token underlines
  [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges
  [PATCH 10/10] Compress short ranges into source_location
  [PATCH] libcpp: add examples to source_location description
along with fixes for the nits identified by Jeff.

In particular, LINE_MAP_MAX_COLUMN_NUMBER is now reduced from 1U << 17
to 1U << 12:
  -const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17);
  +const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 12);

rather than to 1U << 9:
  +const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 9);

gcc/ChangeLog:
	* Makefile.in (OBJS): Add gcc-rich-location.o.
	* diagnostic.c (diagnostic_append_note): Pass line_table to
	rich_location ctor.
	(emit_diagnostic): Likewise.
	(inform): Likewise.
	(inform_n): Likewise.
	(warning): Likewise.
	(warning_at): Likewise.
	(warning_n): Likewise.
	(pedwarn): Likewise.
	(permerror): Likewise.
	(error): Likewise.
	(error_n): Likewise.
	(error_at): Likewise.
	(sorry): Likewise.
	(fatal_error): Likewise.
	(internal_error): Likewise.
	(internal_error_no_backtrace): Likewise.
	(source_range::debug): Likewise.
	* gcc-rich-location.c: New file.
	* gcc-rich-location.h: New file.
	* genmatch.c (fatal_at): Pass line_table to rich_location ctor.
	(warning_at): Likewise.
	* gimple.h (gimple_set_block): Use set_block function.
	* input.c (dump_line_table_statistics): Dump stats on how many
	ranges were optimized vs how many needed ad-hoc table.
	(write_digit_row): Add "map" param; use its range_bits
	to calculate the per-character offset.
	(dump_location_info): Print the range and column bits for each
	ordinary map.  Use the range bits to calculate the per-character
	offset.  Pass the map as a new param to the various calls to
	write_digit_row.  Eliminate uses of
	ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
	* print-tree.c (print_node): Print any source range information.
	* rtl-error.c (diagnostic_for_asm): Likewise.
	* toplev.c (general_init): Initialize line_table's
	default_range_bits.
	* tree-cfg.c (move_block_to_fn): Likewise.
	* tree-inline.c (copy_phis_for_bb): Likewise.
	* tree.c (tree_set_block): Likewise.
	(get_pure_location): New function.
	(set_source_range): New functions.
	(set_block): New function.
	* tree.h (CAN_HAVE_RANGE_P): New.
	(EXPR_LOCATION_RANGE): New.
	(EXPR_HAS_RANGE): New.
	(get_expr_source_range): New inline function.
	(DECL_LOCATION_RANGE): New.
	(set_source_range): New decls.
	(get_decl_source_range): New inline function.

gcc/ada/ChangeLog:
	* gcc-interface/trans.c (Sloc_to_locus): Add line_table param when
	calling linemap_position_for_line_and_column.

gcc/c-family/ChangeLog:
	* c-common.c (c_fully_fold_internal): Capture existing source_range,
	and store it on the result.
	* c-opts.c (c_common_init_options): Set
	global_dc->colorize_source_p.

gcc/c/ChangeLog:
	* c-decl.c (warn_defaults_to): Pass line_table to
	rich_location ctor.
	* c-errors.c (pedwarn_c99): Likewise.
	(pedwarn_c90): Likewise.
	* c-parser.c (set_c_expr_source_range): New functions.
	(c_token::get_range): New method.
	(c_token::get_finish): New method.
	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
	based on the range from the start of the LHS to the end of the
	RHS.
	(c_parser_conditional_expression): Likewise, based on the range
	from the start of the cond.value to the end of exp2.value.
	(c_parser_binary_expression): Call set_c_expr_source_range on
	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
	(c_parser_cast_expression): Call set_c_expr_source_range on ret
	based on the cast_loc through to the end of the expr.
	(c_parser_unary_expression): Likewise, based on the
	op_loc through to the end of op.
	(c_parser_sizeof_expression): Likewise, based on the start of the
	sizeof token through to either the closing paren or the end of
	expr.
	(c_parser_postfix_expression): Likewise, using the token range,
	or from the open paren through to the close paren for
	parenthesized expressions.
	(c_parser_postfix_expression_after_primary): Likewise, for
	various kinds of expression.
	* c-tree.h (struct c_expr): Add field "src_range".
	(c_expr::get_start): New method.
	(c_expr::get_finish): New method.
	(set_c_expr_source_range): New decls.
	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
	on ret for prefix unary ops.
	(parser_build_binary_op): Likewise, running from the start of
	arg1.value through to the end of arg2.value.

gcc/cp/ChangeLog:
	* error.c (pedwarn_cxx98): Pass line_table to rich_location ctor.

gcc/fortran/ChangeLog:
	* error.c (gfc_warning): Pass line_table to rich_location ctor.
	(gfc_warning_now_at): Likewise.
	(gfc_warning_now): Likewise.
	(gfc_error_now): Likewise.
	(gfc_fatal_error): Likewise.
	(gfc_error): Likewise.
	(gfc_internal_error): Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/diagnostic-token-ranges.c: New file.
	* gcc.dg/diagnostic-tree-expr-ranges-2.c: New file.
	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
	* gcc.dg/plugin/diagnostic-test-show-trees-1.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_show_trees.c: New file.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (get_loc): Add
	line_table param when calling
	linemap_position_for_line_and_column.
	(test_show_locus): Pass line_table to rich_location ctors.
	(plugin_init): Remove setting of global_dc->colorize_source_p.
	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
	New file.
	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
	diagnostic_plugin_test_tree_expression_range.c,
	diagnostic-test-expressions-1.c, diagnostic_plugin_show_trees.c,
	and diagnostic-test-show-trees-1.c.

libcpp/ChangeLog:
	* errors.c (cpp_diagnostic): Pass pfile->line_table to
	rich_location ctor.
	(cpp_diagnostic_with_line): Likewise.
	* include/cpplib.h (struct cpp_token): Update comment for src_loc
	to indicate that the range of the token is "baked into" the
	source_location.
	* include/line-map.h (source_location): Update the descriptive
	comment to reflect the packing scheme for short ranges, adding
	worked examples of location encoding.
	(struct line_map_ordinary): Drop field "column_bits" in favor
	of field "m_column_and_range_bits"; add field "m_range_bits".
	(ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
	(location_adhoc_data): Add source_range field.
	(struct line_maps): Add fields "default_range_bits",
	"num_optimized_ranges" and "num_unoptimized_ranges".
	(get_combined_adhoc_loc): Add source_range param.
	(get_range_from_loc): New declaration.
	(pure_location_p): New prototype.
	(COMBINE_LOCATION_DATA):  Add source_range param.
	(SOURCE_LINE): Update for renaming of column_bits.
	(SOURCE_COLUMN): Likewise.  Shift the column right by the map's
	range_bits.
	(LAST_SOURCE_LINE_LOCATION): Update for renaming of column_bits.
	(linemap_position_for_line_and_column): Add line_maps * params.
	(rich_location::rich_location): Likewise.
	* lex.c (_cpp_lex_direct): Capture the range of the token, baking
	it into token->src_loc via a call to COMBINE_LOCATION_DATA.
	* line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): Reduce from 1U << 17 to
	1U << 12.
	(location_adhoc_data_hash): Add the src_range into
	the hash value.
	(location_adhoc_data_eq): Require equality of the src_range
	values.
	(can_be_stored_compactly_p): New function.
	(get_combined_adhoc_loc): Add src_range param, and store it,
	via a bit-packing scheme for short ranges, otherwise within the
	lookaside table.  Remove the requirement that data is non-NULL.
	(get_range_from_adhoc_loc): New function.
	(get_range_from_loc): New function.
	(pure_location_p): New function.
	(linemap_add): Ensure that start_location has zero for the
	range_bits, unless we're past LINE_MAP_MAX_LOCATION_WITH_COLS.
	Initialize range_bits to zero.  Assert that the start_location
	is "pure".
	(linemap_line_start): Assert that the
	column_and_range_bits >= range_bits.
	Update determination of whether we need to start a new map
	using the effective column bits, without the range bits.
	Use the set's default_range_bits in new maps, apart from
	those with column_bits == 0, which should also have 0 range_bits.
	Increase the column bits for new maps by the range bits.
	When adding lines to an existing map, use set->highest_line
	directly rather than offsetting highest by SOURCE_COLUMN.
	Add assertions to sanity-check the return value.
	(linemap_position_for_column): Offset to_column by range_bits.
	Update set->highest_location if necessary.
	(linemap_position_for_line_and_column): Add line_maps * param.
	Update the calculation to offset the column by range_bits, and
	conditionalize it on being <= LINE_MAP_MAX_LOCATION_WITH_COLS.
	Bound it by LINEMAPS_MACRO_LOWEST_LOCATION.  Update
	set->highest_location if necessary.
	(linemap_position_for_loc_and_offset): Handle ad-hoc locations;
	pass "set" to linemap_position_for_line_and_column.
	(linemap_macro_map_loc_unwind_toward_spelling): Add line_maps
	param.  Handle ad-hoc locations.
	(linemap_location_in_system_header_p): Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_macro_loc_to_spelling_point): Retain ad-hoc locations.
	Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_resolve_location): Retain ad-hoc locations.  Pass on
	"set" to call to linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_unwind_toward_expansion):  Pass on "set" to call to
	linemap_macro_map_loc_unwind_toward_spelling.
	(linemap_expand_location): Extract the data pointer before
	extracting the location.
	(rich_location::rich_location): Add line_maps param; use it to
	extract the range from the source_location.
	* location-example.txt: Regenerate, showing new representation.
---
 gcc/Makefile.in                                    |   1 +
 gcc/ada/gcc-interface/trans.c                      |   3 +-
 gcc/c-family/c-common.c                            |  10 +-
 gcc/c-family/c-opts.c                              |   2 +
 gcc/c/c-decl.c                                     |   2 +-
 gcc/c/c-errors.c                                   |   4 +-
 gcc/c/c-parser.c                                   |  92 ++++-
 gcc/c/c-tree.h                                     |  19 +
 gcc/c/c-typeck.c                                   |  10 +
 gcc/cp/error.c                                     |   2 +-
 gcc/diagnostic.c                                   |  34 +-
 gcc/fortran/error.c                                |  14 +-
 gcc/gcc-rich-location.c                            |  86 +++++
 gcc/gcc-rich-location.h                            |  47 +++
 gcc/genmatch.c                                     |   8 +-
 gcc/gimple.h                                       |   6 +-
 gcc/input.c                                        |  28 +-
 gcc/print-tree.c                                   |  21 +
 gcc/rtl-error.c                                    |   2 +-
 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c     | 120 ++++++
 .../gcc.dg/diagnostic-tree-expr-ranges-2.c         |  23 ++
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 422 +++++++++++++++++++++
 .../gcc.dg/plugin/diagnostic-test-show-trees-1.c   |  65 ++++
 .../gcc.dg/plugin/diagnostic_plugin_show_trees.c   | 174 +++++++++
 .../plugin/diagnostic_plugin_test_show_locus.c     |  24 +-
 .../diagnostic_plugin_test_tree_expression_range.c |  98 +++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp             |   4 +
 gcc/toplev.c                                       |   1 +
 gcc/tree-cfg.c                                     |   9 +-
 gcc/tree-inline.c                                  |   5 +-
 gcc/tree.c                                         |  60 ++-
 gcc/tree.h                                         |  33 ++
 libcpp/errors.c                                    |   4 +-
 libcpp/include/cpplib.h                            |   3 +-
 libcpp/include/line-map.h                          | 219 +++++++++--
 libcpp/lex.c                                       |  13 +
 libcpp/line-map.c                                  | 274 +++++++++++--
 libcpp/location-example.txt                        | 188 ++++-----
 38 files changed, 1888 insertions(+), 242 deletions(-)
 create mode 100644 gcc/gcc-rich-location.c
 create mode 100644 gcc/gcc-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
 create mode 100644 gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 34d2356..bd6f484 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1263,6 +1263,7 @@ OBJS = \
 	fold-const-call.o \
 	function.o \
 	fwprop.o \
+	gcc-rich-location.o \
 	gcse.o \
 	gcse-common.o \
 	ggc-common.o \
diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index ca66a03..eeb2aac 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -9650,7 +9650,8 @@ Sloc_to_locus (Source_Ptr Sloc, location_t *locus, bool clear_column)
     line = 1;
 
   /* Translate the location.  */
-  *locus = linemap_position_for_line_and_column (map, line, column);
+  *locus = linemap_position_for_line_and_column (line_table, map,
+						 line, column);
 
   return true;
 }
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 6e2ce0a..89e978d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -1187,6 +1187,7 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
   bool nowarning = TREE_NO_WARNING (expr);
   bool unused_p;
+  source_range old_range;
 
   /* This function is not relevant to C++ because C++ folds while
      parsing, and may need changes to be correct for C++ when C++
@@ -1202,6 +1203,9 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       || code == SAVE_EXPR)
     return expr;
 
+  if (IS_EXPR_CODE_CLASS (kind))
+    old_range = EXPR_LOCATION_RANGE (expr);
+
   /* Operands of variable-length expressions (function calls) have
      already been folded, as have __builtin_* function calls, and such
      expressions cannot occur in constant expressions.  */
@@ -1626,7 +1630,11 @@ c_fully_fold_internal (tree expr, bool in_init, bool *maybe_const_operands,
       TREE_NO_WARNING (ret) = 1;
     }
   if (ret != expr)
-    protected_set_expr_location (ret, loc);
+    {
+      protected_set_expr_location (ret, loc);
+      if (IS_EXPR_CODE_CLASS (kind))
+	set_source_range (ret, old_range.m_start, old_range.m_finish);
+    }
   return ret;
 }
 
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 4da6f31..9ae181f 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -245,6 +245,8 @@ c_common_init_options (unsigned int decoded_options_count,
 	    break;
 	  }
     }
+
+  global_dc->colorize_source_p = true;
 }
 
 /* Handle switch SCODE with argument ARG.  VALUE is true, unless no-
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index a636474..9a222d8 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5278,7 +5278,7 @@ warn_defaults_to (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
diff --git a/gcc/c/c-errors.c b/gcc/c/c-errors.c
index ef0f9a2..ee9c2b5 100644
--- a/gcc/c/c-errors.c
+++ b/gcc/c/c-errors.c
@@ -37,7 +37,7 @@ pedwarn_c99 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool warned = false;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   /* If desired, issue the C99/C11 compat warning, which is more specific
@@ -76,7 +76,7 @@ pedwarn_c90 (location_t location, int opt, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   /* Warnings such as -Wvla are the most specific ones.  */
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 2484b92..24a7010 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -59,6 +59,23 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-expr.h"
 #include "context.h"
 
+void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish)
+{
+  expr->src_range.m_start = start;
+  expr->src_range.m_finish = finish;
+  set_source_range (expr->value, start, finish);
+}
+
+void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range)
+{
+  expr->src_range = src_range;
+  set_source_range (expr->value, src_range);
+}
+
 \f
 /* Initialization routine for this file.  */
 
@@ -164,6 +181,16 @@ struct GTY (()) c_token {
   location_t location;
   /* The value associated with this token, if any.  */
   tree value;
+
+  source_range get_range () const
+  {
+    return get_range_from_loc (line_table, location);
+  }
+
+  location_t get_finish () const
+  {
+    return get_range ().m_finish;
+  }
 };
 
 /* A parser structure recording information about the state and
@@ -6101,6 +6128,7 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
   ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
 				 code, exp_location, rhs.value,
 				 rhs.original_type);
+  set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
   if (code == NOP_EXPR)
     ret.original_code = MODIFY_EXPR;
   else
@@ -6131,7 +6159,7 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 				 tree omp_atomic_lhs)
 {
   struct c_expr cond, exp1, exp2, ret;
-  location_t cond_loc, colon_loc, middle_loc;
+  location_t start, cond_loc, colon_loc, middle_loc;
 
   gcc_assert (!after || c_dialect_objc ());
 
@@ -6139,6 +6167,10 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 
   if (c_parser_next_token_is_not (parser, CPP_QUERY))
     return cond;
+  if (cond.value != error_mark_node)
+    start = cond.get_start ();
+  else
+    start = UNKNOWN_LOCATION;
   cond_loc = c_parser_peek_token (parser)->location;
   cond = convert_lvalue_to_rvalue (cond_loc, cond, true, true);
   c_parser_consume_token (parser);
@@ -6214,6 +6246,7 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
 			   ? t1
 			   : NULL);
     }
+  set_c_expr_source_range (&ret, start, exp2.get_finish ());
   return ret;
 }
 
@@ -6366,6 +6399,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
     {
       enum c_parser_prec oprec;
       enum tree_code ocode;
+      source_range src_range;
       if (parser->error)
 	goto out;
       switch (c_parser_peek_token (parser)->type)
@@ -6454,6 +6488,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
       switch (ocode)
 	{
 	case TRUTH_ANDIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6461,8 +6496,10 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_false_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	case TRUTH_ORIF_EXPR:
+	  src_range = stack[sp].expr.src_range;
 	  stack[sp].expr
 	    = convert_lvalue_to_rvalue (stack[sp].loc,
 					stack[sp].expr, true, true);
@@ -6470,6 +6507,7 @@ c_parser_binary_expression (c_parser *parser, struct c_expr *after,
 	    (stack[sp].loc, default_conversion (stack[sp].expr.value));
 	  c_inhibit_evaluation_warnings += (stack[sp].expr.value
 					    == truthvalue_true_node);
+	  set_c_expr_source_range (&stack[sp].expr, src_range);
 	  break;
 	default:
 	  break;
@@ -6538,6 +6576,8 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
 	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
       }
       ret.value = c_cast_expr (cast_loc, type_name, expr.value);
+      if (ret.value && expr.value)
+	set_c_expr_source_range (&ret, cast_loc, expr.get_finish ());
       ret.original_code = ERROR_MARK;
       ret.original_type = NULL;
       return ret;
@@ -6587,6 +6627,7 @@ c_parser_unary_expression (c_parser *parser)
   struct c_expr ret, op;
   location_t op_loc = c_parser_peek_token (parser)->location;
   location_t exp_loc;
+  location_t finish;
   ret.original_code = ERROR_MARK;
   ret.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
@@ -6626,8 +6667,10 @@ c_parser_unary_expression (c_parser *parser)
       c_parser_consume_token (parser);
       exp_loc = c_parser_peek_token (parser)->location;
       op = c_parser_cast_expression (parser, NULL);
+      finish = op.get_finish ();
       op = convert_lvalue_to_rvalue (exp_loc, op, true, true);
       ret.value = build_indirect_ref (op_loc, op.value, RO_UNARY_STAR);
+      set_c_expr_source_range (&ret, op_loc, finish);
       return ret;
     case CPP_PLUS:
       if (!c_dialect_objc () && !in_system_header_at (input_location))
@@ -6715,8 +6758,15 @@ static struct c_expr
 c_parser_sizeof_expression (c_parser *parser)
 {
   struct c_expr expr;
+  struct c_expr result;
   location_t expr_loc;
   gcc_assert (c_parser_next_token_is_keyword (parser, RID_SIZEOF));
+
+  location_t start;
+  location_t finish = UNKNOWN_LOCATION;
+
+  start = c_parser_peek_token (parser)->location;
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   in_sizeof++;
@@ -6730,6 +6780,7 @@ c_parser_sizeof_expression (c_parser *parser)
       expr_loc = c_parser_peek_token (parser)->location;
       type_name = c_parser_type_name (parser);
       c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
+      finish = parser->tokens_buf[0].location;
       if (type_name == NULL)
 	{
 	  struct c_expr ret;
@@ -6745,17 +6796,19 @@ c_parser_sizeof_expression (c_parser *parser)
 	  expr = c_parser_postfix_expression_after_paren_type (parser,
 							       type_name,
 							       expr_loc);
+	  finish = expr.get_finish ();
 	  goto sizeof_expr;
 	}
       /* sizeof ( type-name ).  */
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
-      return c_expr_sizeof_type (expr_loc, type_name);
+      result = c_expr_sizeof_type (expr_loc, type_name);
     }
   else
     {
       expr_loc = c_parser_peek_token (parser)->location;
       expr = c_parser_unary_expression (parser);
+      finish = expr.get_finish ();
     sizeof_expr:
       c_inhibit_evaluation_warnings--;
       in_sizeof--;
@@ -6763,8 +6816,11 @@ c_parser_sizeof_expression (c_parser *parser)
       if (TREE_CODE (expr.value) == COMPONENT_REF
 	  && DECL_C_BIT_FIELD (TREE_OPERAND (expr.value, 1)))
 	error_at (expr_loc, "%<sizeof%> applied to a bit-field");
-      return c_expr_sizeof_expr (expr_loc, expr);
+      result = c_expr_sizeof_expr (expr_loc, expr);
     }
+  if (finish != UNKNOWN_LOCATION)
+    set_c_expr_source_range (&result, start, finish);
+  return result;
 }
 
 /* Parse an alignof expression.  */
@@ -7184,12 +7240,14 @@ c_parser_postfix_expression (c_parser *parser)
   struct c_expr expr, e1;
   struct c_type_name *t1, *t2;
   location_t loc = c_parser_peek_token (parser)->location;;
+  source_range tok_range = c_parser_peek_token (parser)->get_range ();
   expr.original_code = ERROR_MARK;
   expr.original_type = NULL;
   switch (c_parser_peek_token (parser)->type)
     {
     case CPP_NUMBER:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       loc = c_parser_peek_token (parser)->location;
       c_parser_consume_token (parser);
       if (TREE_CODE (expr.value) == FIXED_CST
@@ -7204,6 +7262,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_CHAR32:
     case CPP_WCHAR:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_STRING:
@@ -7212,6 +7271,7 @@ c_parser_postfix_expression (c_parser *parser)
     case CPP_WSTRING:
     case CPP_UTF8STRING:
       expr.value = c_parser_peek_token (parser)->value;
+      set_c_expr_source_range (&expr, tok_range);
       expr.original_code = STRING_CST;
       c_parser_consume_token (parser);
       break;
@@ -7219,6 +7279,7 @@ c_parser_postfix_expression (c_parser *parser)
       gcc_assert (c_dialect_objc ());
       expr.value
 	= objc_build_string_object (c_parser_peek_token (parser)->value);
+      set_c_expr_source_range (&expr, tok_range);
       c_parser_consume_token (parser);
       break;
     case CPP_NAME:
@@ -7232,6 +7293,7 @@ c_parser_postfix_expression (c_parser *parser)
 					     (c_parser_peek_token (parser)->type
 					      == CPP_OPEN_PAREN),
 					     &expr.original_type);
+	    set_c_expr_source_range (&expr, tok_range);
 	    break;
 	  }
 	case C_ID_CLASSNAME:
@@ -7320,6 +7382,7 @@ c_parser_postfix_expression (c_parser *parser)
       else
 	{
 	  /* A parenthesized expression.  */
+	  location_t loc_open_paren = c_parser_peek_token (parser)->location;
 	  c_parser_consume_token (parser);
 	  expr = c_parser_expression (parser);
 	  if (TREE_CODE (expr.value) == MODIFY_EXPR)
@@ -7327,6 +7390,8 @@ c_parser_postfix_expression (c_parser *parser)
 	  if (expr.original_code != C_MAYBE_CONST_EXPR)
 	    expr.original_code = ERROR_MARK;
 	  /* Don't change EXPR.ORIGINAL_TYPE.  */
+	  location_t loc_close_paren = c_parser_peek_token (parser)->location;
+	  set_c_expr_source_range (&expr, loc_open_paren, loc_close_paren);
 	  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
 				     "expected %<)%>");
 	}
@@ -7917,6 +7982,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
   vec<tree, va_gc> *exprlist;
   vec<tree, va_gc> *origtypes = NULL;
   vec<location_t> arg_loc = vNULL;
+  location_t start;
+  location_t finish;
 
   while (true)
     {
@@ -7953,7 +8020,10 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 		{
 		  c_parser_skip_until_found (parser, CPP_CLOSE_SQUARE,
 					     "expected %<]%>");
+		  start = expr.get_start ();
+		  finish = parser->tokens_buf[0].location;
 		  expr.value = build_array_ref (op_loc, expr.value, idx);
+		  set_c_expr_source_range (&expr, start, finish);
 		}
 	    }
 	  expr.original_code = ERROR_MARK;
@@ -7996,9 +8066,13 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 			"%<memset%> used with constant zero length parameter; "
 			"this could be due to transposed parameters");
 
+	  start = expr.get_start ();
+	  finish = parser->tokens_buf[0].get_finish ();
 	  expr.value
 	    = c_build_function_call_vec (expr_loc, arg_loc, expr.value,
 					 exprlist, origtypes);
+	  set_c_expr_source_range (&expr, start, finish);
+
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) == INTEGER_CST
 	      && TREE_CODE (orig_expr.value) == FUNCTION_DECL
@@ -8027,8 +8101,11 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
               expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc, expr.value, ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8056,12 +8133,15 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.original_type = NULL;
 	      return expr;
 	    }
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  expr.value = build_component_ref (op_loc,
 					    build_indirect_ref (op_loc,
 								expr.value,
 								RO_ARROW),
 					    ident);
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  if (TREE_CODE (expr.value) != COMPONENT_REF)
 	    expr.original_type = NULL;
@@ -8077,6 +8157,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	  break;
 	case CPP_PLUS_PLUS:
 	  /* Postincrement.  */
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8088,11 +8170,14 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTINCREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
 	case CPP_MINUS_MINUS:
 	  /* Postdecrement.  */
+	  start = expr.get_start ();
+	  finish = c_parser_peek_token (parser)->get_finish ();
 	  c_parser_consume_token (parser);
 	  /* If the expressions have array notations, we expand them.  */
 	  if (flag_cilkplus
@@ -8104,6 +8189,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 	      expr.value = build_unary_op (op_loc,
 					   POSTDECREMENT_EXPR, expr.value, 0);
 	    }
+	  set_c_expr_source_range (&expr, start, finish);
 	  expr.original_code = ERROR_MARK;
 	  expr.original_type = NULL;
 	  break;
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 04991f7..6bc216a 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -132,6 +132,17 @@ struct c_expr
      The type of an enum constant is a plain integer type, but this
      field will be the enum type.  */
   tree original_type;
+
+  /* The source range of this expression.  This is redundant
+     for node values that have locations, but not all node kinds
+     have locations (e.g. constants, and references to params, locals,
+     etc.), so we stash a copy here.  */
+  source_range src_range;
+
+  /* Access to the first and last locations within the source spelling
+     of this expression.  */
+  location_t get_start () const { return src_range.m_start; }
+  location_t get_finish () const { return src_range.m_finish; }
 };
 
 /* Type alias for struct c_expr. This allows to use the structure
@@ -708,4 +719,12 @@ extern void pedwarn_c90 (location_t, int opt, const char *, ...)
 extern bool pedwarn_c99 (location_t, int opt, const char *, ...)
     ATTRIBUTE_GCC_DIAG(3,4);
 
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 location_t start, location_t finish);
+
+extern void
+set_c_expr_source_range (c_expr *expr,
+			 source_range src_range);
+
 #endif /* ! GCC_C_TREE_H */
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 4335a87..0c3fa19 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -3388,6 +3388,12 @@ parser_build_unary_op (location_t loc, enum tree_code code, struct c_expr arg)
     overflow_warning (loc, result.value);
     }
 
+  /* We are typically called when parsing a prefix token at LOC acting on
+     ARG.  Reflect this by updating the source range of the result to
+     start at LOC and end at the end of ARG.  */
+  set_c_expr_source_range (&result,
+			   loc, arg.get_finish ());
+
   return result;
 }
 
@@ -3425,6 +3431,10 @@ parser_build_binary_op (location_t location, enum tree_code code,
   if (location != UNKNOWN_LOCATION)
     protected_set_expr_location (result.value, location);
 
+  set_c_expr_source_range (&result,
+			   arg1.get_start (),
+			   arg2.get_finish ());
+
   /* Check for cases such as x+y<<z which users are likely
      to misinterpret.  */
   if (warn_parentheses)
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 361e41a..38548c7 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -3673,7 +3673,7 @@ pedwarn_cxx98 (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index ee034e7..b4d3a7d 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -867,7 +867,7 @@ diagnostic_append_note (diagnostic_context *context,
   diagnostic_info diagnostic;
   va_list ap;
   const char *saved_prefix;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
@@ -925,7 +925,7 @@ emit_diagnostic (diagnostic_t kind, location_t location, int opt,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   if (kind == DK_PERMERROR)
@@ -952,7 +952,7 @@ inform (location_t location, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_NOTE);
@@ -981,7 +981,7 @@ inform_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1000,7 +1000,7 @@ warning (int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
@@ -1021,7 +1021,7 @@ warning_at (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_WARNING);
@@ -1059,7 +1059,7 @@ warning_n (location_t location, int opt, int n, const char *singular_gmsgid,
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1091,7 +1091,7 @@ pedwarn (location_t location, int opt, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,  DK_PEDWARN);
@@ -1114,7 +1114,7 @@ permerror (location_t location, const char *gmsgid, ...)
   diagnostic_info diagnostic;
   va_list ap;
   bool ret;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc,
@@ -1150,7 +1150,7 @@ error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
@@ -1166,7 +1166,7 @@ error_n (location_t location, int n, const char *singular_gmsgid,
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (location);
+  rich_location richloc (line_table, location);
 
   va_start (ap, plural_gmsgid);
   diagnostic_set_info_translated (&diagnostic,
@@ -1182,7 +1182,7 @@ error_at (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ERROR);
@@ -1213,7 +1213,7 @@ sorry (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_SORRY);
@@ -1237,7 +1237,7 @@ fatal_error (location_t loc, const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_FATAL);
@@ -1256,7 +1256,7 @@ internal_error (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE);
@@ -1274,7 +1274,7 @@ internal_error_no_backtrace (const char *gmsgid, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
-  rich_location richloc (input_location);
+  rich_location richloc (line_table, input_location);
 
   va_start (ap, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &ap, &richloc, DK_ICE_NOBT);
@@ -1351,7 +1351,7 @@ real_abort (void)
 DEBUG_FUNCTION void
 source_range::debug (const char *msg) const
 {
-  rich_location richloc (m_start);
+  rich_location richloc (line_table, m_start);
   richloc.add_range (m_start, m_finish, false);
   inform_at_rich_loc (&richloc, "%s", msg);
 }
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 4b3d31c..b4f7020 100644
--- a/gcc/fortran/error.c
+++ b/gcc/fortran/error.c
@@ -773,7 +773,7 @@ gfc_warning (int opt, const char *gmsgid, va_list ap)
   va_copy (argp, ap);
 
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1120,7 +1120,7 @@ gfc_warning_now_at (location_t loc, int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (loc);
+  rich_location rich_loc (line_table, loc);
   bool ret;
 
   va_start (argp, gmsgid);
@@ -1138,7 +1138,7 @@ gfc_warning_now (int opt, const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
   bool ret;
 
   va_start (argp, gmsgid);
@@ -1158,7 +1158,7 @@ gfc_error_now (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   error_buffer.flag = true;
 
@@ -1176,7 +1176,7 @@ gfc_fatal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_FATAL);
@@ -1242,7 +1242,7 @@ gfc_error (const char *gmsgid, va_list ap)
     }
 
   diagnostic_info diagnostic;
-  rich_location richloc (UNKNOWN_LOCATION);
+  rich_location richloc (line_table, UNKNOWN_LOCATION);
   bool fatal_errors = global_dc->fatal_errors;
   pretty_printer *pp = global_dc->printer;
   output_buffer *tmp_buffer = pp->buffer;
@@ -1288,7 +1288,7 @@ gfc_internal_error (const char *gmsgid, ...)
 {
   va_list argp;
   diagnostic_info diagnostic;
-  rich_location rich_loc (UNKNOWN_LOCATION);
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
 
   va_start (argp, gmsgid);
   diagnostic_set_info (&diagnostic, gmsgid, &argp, &rich_loc, DK_ICE);
diff --git a/gcc/gcc-rich-location.c b/gcc/gcc-rich-location.c
new file mode 100644
index 0000000..b0ec47b
--- /dev/null
+++ b/gcc/gcc-rich-location.c
@@ -0,0 +1,86 @@
+/* Implementation of gcc_rich_location class
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree-core.h"
+#include "tree.h"
+#include "diagnostic-core.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+#include "pretty-print.h"
+#include "intl.h"
+#include "cpplib.h"
+#include "diagnostic.h"
+
+/* Extract any source range information from EXPR and write it
+   to *R.  */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR.  */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* If T is an expression, add a range for it to the rich_location.  */
+
+void
+gcc_rich_location::maybe_add_expr (tree t)
+{
+  if (EXPR_P (t))
+    add_expr (t);
+}
diff --git a/gcc/gcc-rich-location.h b/gcc/gcc-rich-location.h
new file mode 100644
index 0000000..2f9291d
--- /dev/null
+++ b/gcc/gcc-rich-location.h
@@ -0,0 +1,47 @@
+/* Declarations relating to class gcc_rich_location
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_RICH_LOCATION_H
+#define GCC_RICH_LOCATION_H
+
+/* A gcc_rich_location is libcpp's rich_location with additional
+   helper methods for working with gcc's types.  */
+class gcc_rich_location : public rich_location
+{
+ public:
+  /* Constructors.  */
+
+  /* Constructing from a location.  */
+  gcc_rich_location (source_location loc) :
+    rich_location (line_table, loc) {}
+
+  /* Constructing from a source_range.  */
+  gcc_rich_location (source_range src_range) :
+    rich_location (src_range) {}
+
+
+  /* Methods for adding ranges via gcc entities.  */
+  void
+  add_expr (tree expr);
+
+  void
+  maybe_add_expr (tree t);
+};
+
+#endif /* GCC_RICH_LOCATION_H */
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 1eb8c24..9d74ed7 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -119,7 +119,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (const cpp_token *tk, const char *msg, ...)
 {
-  rich_location richloc (tk->src_loc);
+  rich_location richloc (line_table, tk->src_loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
@@ -132,7 +132,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 fatal_at (source_location loc, const char *msg, ...)
 {
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_FATAL, 0, &richloc, msg, &ap);
@@ -145,7 +145,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (const cpp_token *tk, const char *msg, ...)
 {
-  rich_location richloc (tk->src_loc);
+  rich_location richloc (line_table, tk->src_loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
@@ -158,7 +158,7 @@ __attribute__((format (printf, 2, 3)))
 #endif
 warning_at (source_location loc, const char *msg, ...)
 {
-  rich_location richloc (loc);
+  rich_location richloc (line_table, loc);
   va_list ap;
   va_start (ap, msg);
   error_cb (NULL, CPP_DL_WARNING, 0, &richloc, msg, &ap);
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 781801b..8c60c47 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1739,11 +1739,7 @@ gimple_block (const gimple *g)
 static inline void
 gimple_set_block (gimple *g, tree block)
 {
-  if (block)
-    g->location =
-	COMBINE_LOCATION_DATA (line_table, g->location, block);
-  else
-    g->location = LOCATION_LOCUS (g->location);
+  g->location = set_block (g->location, block);
 }
 
 
diff --git a/gcc/input.c b/gcc/input.c
index 0f6d448..ce84f10 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -887,6 +887,10 @@ dump_line_table_statistics (void)
 	   STAT_LABEL (s.adhoc_table_size));
   fprintf (stderr, "Ad-hoc table entries used:           %5ld\n",
 	   s.adhoc_table_entries_used);
+  fprintf (stderr, "optimized_ranges: %i\n",
+	   line_table->num_optimized_ranges);
+  fprintf (stderr, "unoptimized_ranges: %i\n",
+	   line_table->num_unoptimized_ranges);
 
   fprintf (stderr, "\n");
 }
@@ -917,13 +921,14 @@ write_digit (FILE *stream, int digit)
 
 static void
 write_digit_row (FILE *stream, int indent,
+		 const line_map_ordinary *map,
 		 source_location loc, int max_col, int divisor)
 {
   fprintf (stream, "%*c", indent, ' ');
   fprintf (stream, "|");
   for (int column = 1; column < max_col; column++)
     {
-      source_location column_loc = loc + column;
+      source_location column_loc = loc + (column << map->m_range_bits);
       write_digit (stream, column_loc / divisor);
     }
   fprintf (stream, "\n");
@@ -977,14 +982,20 @@ dump_location_info (FILE *stream)
       fprintf (stream, "  file: %s\n", ORDINARY_MAP_FILE_NAME (map));
       fprintf (stream, "  starting at line: %i\n",
 	       ORDINARY_MAP_STARTING_LINE_NUMBER (map));
+      fprintf (stream, "  column and range bits: %i\n",
+	       map->m_column_and_range_bits);
       fprintf (stream, "  column bits: %i\n",
-	       ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+	       map->m_column_and_range_bits - map->m_range_bits);
+      fprintf (stream, "  range bits: %i\n",
+	       map->m_range_bits);
 
       /* Render the span of source lines that this "map" covers.  */
       for (source_location loc = MAP_START_LOCATION (map);
 	   loc < end_location;
-	   loc++)
+	   loc += (1 << map->m_range_bits))
 	{
+	  gcc_assert (pure_location_p (line_table, loc));
+
 	  expanded_location exploc
 	    = linemap_expand_location (line_table, map, loc);
 
@@ -1008,8 +1019,7 @@ dump_location_info (FILE *stream)
 		 Render the locations *within* the line, by underlining
 		 it, showing the source_location numeric values
 		 at each column.  */
-	      int max_col
-		= (1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)) - 1;
+	      int max_col = (1 << map->m_column_and_range_bits) - 1;
 	      if (max_col > line_size)
 		max_col = line_size + 1;
 
@@ -1017,17 +1027,17 @@ dump_location_info (FILE *stream)
 
 	      /* Thousands.  */
 	      if (end_location > 999)
-		write_digit_row (stream, indent, loc, max_col, 1000);
+		write_digit_row (stream, indent, map, loc, max_col, 1000);
 
 	      /* Hundreds.  */
 	      if (end_location > 99)
-		write_digit_row (stream, indent, loc, max_col, 100);
+		write_digit_row (stream, indent, map, loc, max_col, 100);
 
 	      /* Tens.  */
-	      write_digit_row (stream, indent, loc, max_col, 10);
+	      write_digit_row (stream, indent, map, loc, max_col, 10);
 
 	      /* Units.  */
-	      write_digit_row (stream, indent, loc, max_col, 1);
+	      write_digit_row (stream, indent, map, loc, max_col, 1);
 	    }
 	}
       fprintf (stream, "\n");
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index 1b584b8..cb0f1fd 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -938,6 +938,27 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
       expanded_location xloc = expand_location (EXPR_LOCATION (node));
       indent_to (file, indent+4);
       fprintf (file, "%s:%d:%d", xloc.file, xloc.line, xloc.column);
+
+      /* Print the range, if any.  */
+      source_range r = EXPR_LOCATION_RANGE (node);
+      if (r.m_start)
+	{
+	  xloc = expand_location (r.m_start);
+	  fprintf (file, " start: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " start: unknown");
+	}
+      if (r.m_finish)
+	{
+	  xloc = expand_location (r.m_finish);
+	  fprintf (file, " finish: %s:%d:%d", xloc.file, xloc.line, xloc.column);
+	}
+      else
+	{
+	  fprintf (file, " finish: unknown");
+	}
     }
 
   fprintf (file, ">");
diff --git a/gcc/rtl-error.c b/gcc/rtl-error.c
index 96da2bd..088bb8a 100644
--- a/gcc/rtl-error.c
+++ b/gcc/rtl-error.c
@@ -67,7 +67,7 @@ diagnostic_for_asm (const rtx_insn *insn, const char *msg, va_list *args_ptr,
 		    diagnostic_t kind)
 {
   diagnostic_info diagnostic;
-  rich_location richloc (location_for_asm (insn));
+  rich_location richloc (line_table, location_for_asm (insn));
 
   diagnostic_set_info (&diagnostic, msg, args_ptr,
 		       &richloc, kind);
diff --git a/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c b/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
new file mode 100644
index 0000000..ac969e3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/diagnostic-token-ranges.c
@@ -0,0 +1,120 @@
+/* { dg-options "-fdiagnostics-show-caret -Wc++-compat" } */
+
+/* Verify that various diagnostics show source code ranges.  */
+
+/* These ones merely use token ranges; they don't use tree ranges.  */
+
+void undeclared_identifier (void)
+{
+  name; /* { dg-error "'name' undeclared" } */
+/*
+{ dg-begin-multiline-output "" }
+   name;
+   ^~~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void unknown_type_name (void)
+{
+  foo bar; /* { dg-error "unknown type name 'foo'" } */
+/*
+{ dg-begin-multiline-output "" }
+   foo bar;
+   ^~~
+{ dg-end-multiline-output "" }
+*/
+
+  qux *baz; /* { dg-error "unknown type name 'qux'" } */
+/*
+{ dg-begin-multiline-output "" }
+   qux *baz;
+   ^~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void test_identifier_conflicts_with_cplusplus (void)
+{
+  int new; /* { dg-warning "identifier 'new' conflicts with" } */
+/*
+{ dg-begin-multiline-output "" }
+   int new;
+       ^~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+extern void
+bogus_varargs (...); /* { dg-error "ISO C requires a named argument before '...'" } */
+/*
+{ dg-begin-multiline-output "" }
+ bogus_varargs (...);
+                ^~~
+{ dg-end-multiline-output "" }
+*/
+
+extern void
+foo (unknown_type param); /* { dg-error "unknown type name 'unknown_type'" } */
+/*
+{ dg-begin-multiline-output "" }
+ foo (unknown_type param);
+      ^~~~~~~~~~~~
+{ dg-end-multiline-output "" }
+*/
+
+void wide_string_literal_in_asm (void)
+{
+  asm (L"nop"); /* { dg-error "wide string literal in 'asm'" } */
+/*
+{ dg-begin-multiline-output "" }
+   asm (L"nop");
+        ^~~~~~
+{ dg-end-multiline-output "" }
+*/
+}
+
+void break_and_continue_in_wrong_places (void)
+{
+  if (0)
+    break; /* { dg-error "break statement not within loop or switch" } */
+/* { dg-begin-multiline-output "" }
+     break;
+     ^~~~~
+   { dg-end-multiline-output "" } */
+
+  if (1)
+    ;
+  else
+    continue; /* { dg-error "continue statement not within a loop" } */
+/* { dg-begin-multiline-output "" }
+     continue;
+     ^~~~~~~~
+    { dg-end-multiline-output "" } */
+}
+
+/* Various examples of bad type decls.  */
+
+int float bogus; /* { dg-error "two or more data types in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ int float bogus;
+     ^~~~~
+    { dg-end-multiline-output "" } */
+
+long long long bogus2; /* { dg-error "'long long long' is too long for GCC" } */
+/* { dg-begin-multiline-output "" }
+ long long long bogus2;
+           ^~~~
+    { dg-end-multiline-output "" } */
+
+long short bogus3; /* { dg-error "both 'long' and 'short' in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ long short bogus3;
+      ^~~~~
+    { dg-end-multiline-output "" } */
+
+signed unsigned bogus4; /* { dg-error "both 'signed' and 'unsigned' in declaration specifiers" } */
+/* { dg-begin-multiline-output "" }
+ signed unsigned bogus4;
+        ^~~~~~~~
+    { dg-end-multiline-output "" } */
diff --git a/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c b/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
new file mode 100644
index 0000000..302e233
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/diagnostic-tree-expr-ranges-2.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized -fdiagnostics-show-caret" } */
+
+int test_uninit_1 (void)
+{
+  int result;
+  return result;  /* { dg-warning "uninitialized" } */
+/* { dg-begin-multiline-output "" }
+   return result;
+          ^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_uninit_2 (void)
+{
+  int result;
+  result += 3; /* { dg-warning "uninitialized" } */
+/* { dg-begin-multiline-output "" }
+   result += 3;
+   ~~~~~~~^~~~
+   { dg-end-multiline-output "" } */
+  return result;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
new file mode 100644
index 0000000..5485aaf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -0,0 +1,422 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is a collection of unittests to verify that we're correctly
+   capturing the source code ranges of various kinds of expression.
+
+   It uses the various "diagnostic_plugin_test_*_expression_range"
+   plugins, which handle "__emit_expression_range" by generating a warning
+   at the given source range of the input argument.  Each of the
+   different plugins does this at a different phase of the internal
+   representation (tree, gimple, etc), so we can verify that the
+   source code range information is valid at each phase.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __emit_expression_range (int dummy, ...);
+
+int global;
+
+void test_parentheses (int a, int b)
+{
+  __emit_expression_range (0, (a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) );
+                               ~~~^~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, (a + b) * (a - b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a + b) * (a - b) );
+                               ~~~~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, !(a && b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !(a && b) );
+                               ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Postfix expressions.  ************************************************/
+
+void test_array_reference (int *arr)
+{
+  __emit_expression_range (0, arr[100] ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, arr[100] );
+                               ~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_function_call (int p, int q, int r)
+{
+  __emit_expression_range (0, test_function_call (p, q, r) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, test_function_call (p, q, r) );
+                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+  return 0;
+}
+
+struct test_struct
+{
+  int field;
+};
+
+int test_structure_references (struct test_struct *ptr)
+{
+  struct test_struct local;
+  local.field = 42;
+
+  __emit_expression_range (0, local.field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, local.field );
+                               ~~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, ptr->field ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ptr->field );
+                               ~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+int test_postfix_incdec (int i)
+{
+  __emit_expression_range (0, i++ ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i++ );
+                               ~^~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, i-- ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, i-- );
+                               ~^~
+   { dg-end-multiline-output "" } */
+}
+
+/* Unary operators.  ****************************************************/
+
+int test_prefix_incdec (int i)
+{
+  __emit_expression_range (0, ++i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ++i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, --i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, --i );
+                               ^~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_address_operator (void)
+{
+  __emit_expression_range (0, &global ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, &global );
+                               ^~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_indirection (int *ptr)
+{
+  __emit_expression_range (0, *ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, *ptr );
+                               ^~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_unary_minus (int i)
+{
+  __emit_expression_range (0, -i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, -i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_ones_complement (int i)
+{
+  __emit_expression_range (0, ~i ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, ~i );
+                               ^~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_negation (int flag)
+{
+  __emit_expression_range (0, !flag ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, !flag );
+                               ^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Casts.  ****************************************************/
+
+void test_cast (void *ptr)
+{
+  __emit_expression_range (0, (int *)ptr ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (int *)ptr );
+                               ^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
+
+/* Binary operators.  *******************************************/
+
+void test_multiplicative_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs * rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs * rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs / rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs / rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs % rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs % rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_additive_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs + rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs + rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs - rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs - rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_shift_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs << rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs << rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >> rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >> rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_relational_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs < rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs < rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs > rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs > rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs <= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs <= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs >= rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs >= rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_equality_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs == rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs == rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs != rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs != rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_bitwise_binary_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs & rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs & rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs ^ rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs ^ rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs | rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs | rhs );
+                               ~~~~^~~~~
+   { dg-end-multiline-output "" } */
+}
+
+void test_logical_operators (int lhs, int rhs)
+{
+  __emit_expression_range (0, lhs && rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs && rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, lhs || rhs ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, lhs || rhs );
+                               ~~~~^~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Conditional operator.  *******************************************/
+
+void test_conditional_operators (int flag, int on_true, int on_false)
+{
+  __emit_expression_range (0, flag ? on_true : on_false ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, flag ? on_true : on_false );
+                               ~~~~~~~~~~~~~~~^~~~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Assignment expressions.  *******************************************/
+
+void test_assignment_expressions (int dest, int other)
+{
+  __emit_expression_range (0, dest = other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest = other );
+                               ~~~~~^~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest *= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest *= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest /= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest /= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest %= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest %= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest += other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest += other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest -= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest -= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest <<= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest <<= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest >>= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest >>= other );
+                               ~~~~~^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest &= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest &= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest ^= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest ^= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0, dest |= other ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, dest |= other );
+                               ~~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Comma operator.  *******************************************/
+
+void test_comma_operator (int a, int b)
+{
+  __emit_expression_range (0, (a++, a + b) ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, (a++, a + b) );
+                               ~~~~^~~~~~~~
+   { dg-end-multiline-output "" } */
+}
+
+/* Examples of non-trivial expressions.  ****************************/
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __emit_expression_range (0, b * b - 4 * a * c ); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __emit_expression_range (0, b * b - 4 * a * c );
+                               ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a)); /* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
new file mode 100644
index 0000000..7473a07
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-show-trees-1.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdiagnostics-show-caret" } */
+
+/* This is an example file for use with
+   diagnostic_plugin_show_trees.c.
+
+   The plugin handles "__show_tree" by recursively dumping
+   the internal structure of the second input argument.
+
+   We want to accept an expression of any type.  To do this in C, we
+   use variadic arguments, but C requires at least one argument before
+   the ellipsis, so we have a dummy one.  */
+
+extern void __show_tree (int dummy, ...);
+
+extern double sqrt (double x);
+
+void test_quadratic (double a, double b, double c)
+{
+  __show_tree (0,
+     (-b + sqrt (b * b - 4 * a * c))
+     / (2 * a));
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+      / (2 * a));
+      ^~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+      ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+            ^~~~~~~~~~~~~~~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~~~~~^~~~~~~~~~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                  ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~~~~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      (-b + sqrt (b * b - 4 * a * c))
+                          ~~^~~
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      / (2 * a));
+        ~~~^~~~
+   { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
new file mode 100644
index 0000000..5a911c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_show_trees.c
@@ -0,0 +1,174 @@
+/* This plugin recursively dumps the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "gcc-rich-location.h"
+#include "print-tree.h"
+
+/*
+  Hack: fails with linker error:
+./diagnostic_plugin_show_trees.so: undefined symbol: _ZN17gcc_rich_location8add_exprEP9tree_node
+  since nothing in the tree is using gcc_rich_location::add_expr yet.
+
+  I've tried various workarounds (adding DEBUG_FUNCTION to the
+  method, taking its address), but can't seem to fix it that way.
+  So as a nasty workaround, the following material is copied and pasted
+  from gcc-rich-location.c: */
+
+static bool
+get_range_for_expr (tree expr, location_range *r)
+{
+  if (EXPR_HAS_RANGE (expr))
+    {
+      source_range sr = EXPR_LOCATION_RANGE (expr);
+
+      /* Do we have meaningful data?  */
+      if (sr.m_start && sr.m_finish)
+	{
+	  r->m_start = expand_location (sr.m_start);
+	  r->m_finish = expand_location (sr.m_finish);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* Add a range to the rich_location, covering expression EXPR.  */
+
+void
+gcc_rich_location::add_expr (tree expr)
+{
+  gcc_assert (expr);
+
+  location_range r;
+  r.m_show_caret_p = false;
+  if (get_range_for_expr (expr, &r))
+    add_range (&r);
+}
+
+/* FIXME: end of material taken from gcc-rich-location.c */
+
+int plugin_is_GPL_compatible;
+
+static void
+show_tree (tree node)
+{
+  if (!CAN_HAVE_RANGE_P (node))
+    return;
+
+  gcc_rich_location richloc (EXPR_LOCATION (node));
+  richloc.add_expr (node);
+
+  if (richloc.get_num_locations () < 2)
+    {
+      error_at_rich_loc (&richloc, "range not found");
+      return;
+    }
+
+  enum tree_code code = TREE_CODE (node);
+
+  location_range *range = richloc.get_range (1);
+  inform_at_rich_loc (&richloc,
+		      "%s at range %i:%i-%i:%i",
+		      get_tree_code_name (code),
+		      range->m_start.line,
+		      range->m_start.column,
+		      range->m_finish.line,
+		      range->m_finish.column);
+
+  /* Recurse.  */
+  int min_idx = 0;
+  int max_idx = TREE_OPERAND_LENGTH (node);
+  switch (code)
+    {
+    case CALL_EXPR:
+      min_idx = 3;
+      break;
+
+    default:
+      break;
+    }
+
+  for (int i = min_idx; i < max_idx; i++)
+    show_tree (TREE_OPERAND (node, i));
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__show_tree"))
+    return NULL_TREE;
+
+  /* Get arg 1 (arg 0 is the dummy) and dump its ranges.  */
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+
+  show_tree (arg);
+
+  return NULL_TREE;
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  tree fndecl = (tree)gcc_data;
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index 8f5724e..158c612 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -109,7 +109,8 @@ get_loc (unsigned int line_num, unsigned int col_num)
 
   /* Convert from 0-based column numbers to 1-based column numbers.  */
   source_location loc
-    = linemap_position_for_line_and_column (line_map,
+    = linemap_position_for_line_and_column (line_table,
+					    line_map,
 					    line_num, col_num + 1);
 
   return loc;
@@ -163,7 +164,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_simple"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line, 15));
+      rich_location richloc (line_table, get_loc (line, 15));
       richloc.add_range (get_loc (line, 10), get_loc (line, 14), false);
       richloc.add_range (get_loc (line, 16), get_loc (line, 16), false);
       warning_at_rich_loc (&richloc, 0, "test");
@@ -172,7 +173,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_simple_2"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line, 24));
+      rich_location richloc (line_table, get_loc (line, 24));
       richloc.add_range (get_loc (line, 6),
 			 get_loc (line, 22), false);
       richloc.add_range (get_loc (line, 26),
@@ -183,7 +184,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_multiline"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line + 1, 7));
+      rich_location richloc (line_table, get_loc (line + 1, 7));
       richloc.add_range (get_loc (line, 7),
 			 get_loc (line, 23), false);
       richloc.add_range (get_loc (line + 1, 9),
@@ -194,7 +195,7 @@ test_show_locus (function *fun)
   if (0 == strcmp (fnname, "test_many_lines"))
     {
       const int line = fnstart_line + 2;
-      rich_location richloc (get_loc (line + 5, 7));
+      rich_location richloc (line_table, get_loc (line + 5, 7));
       richloc.add_range (get_loc (line, 7),
 			 get_loc (line + 4, 65), false);
       richloc.add_range (get_loc (line + 5, 9),
@@ -223,7 +224,7 @@ test_show_locus (function *fun)
       source_range src_range;
       src_range.m_start = get_loc (line, 12);
       src_range.m_finish = get_loc (line, 20);
-      rich_location richloc (caret);
+      rich_location richloc (line_table, caret);
       richloc.set_range (0, src_range, true, false);
       warning_at_rich_loc (&richloc, 0, "test");
     }
@@ -237,7 +238,7 @@ test_show_locus (function *fun)
       source_range src_range;
       src_range.m_start = get_loc (line, 90);
       src_range.m_finish = get_loc (line, 98);
-      rich_location richloc (caret);
+      rich_location richloc (line_table, caret);
       richloc.set_range (0, src_range, true, false);
       warning_at_rich_loc (&richloc, 0, "test");
     }
@@ -248,7 +249,7 @@ test_show_locus (function *fun)
       const int line = fnstart_line + 2;
       location_t caret_a = get_loc (line, 7);
       location_t caret_b = get_loc (line, 11);
-      rich_location richloc (caret_a);
+      rich_location richloc (line_table, caret_a);
       richloc.add_range (caret_b, caret_b, true);
       global_dc->caret_chars[0] = 'A';
       global_dc->caret_chars[1] = 'B';
@@ -269,7 +270,7 @@ test_show_locus (function *fun)
       const int line = fnstart_line + 3;
       location_t caret_a = get_loc (line, 5);
       location_t caret_b = get_loc (line - 1, 19);
-      rich_location richloc (caret_a);
+      rich_location richloc (line_table, caret_a);
       richloc.add_range (caret_b, caret_b, true);
       global_dc->caret_chars[0] = '1';
       global_dc->caret_chars[1] = '2';
@@ -304,11 +305,6 @@ plugin_init (struct plugin_name_args *plugin_info,
   if (!plugin_default_version_check (version, &gcc_version))
     return 1;
 
-  /* For now, tell the dc to expect ranges and thus to colorize the source
-     lines, not just the carets/underlines.  This will be redundant
-     once the C frontend generates ranges.  */
-  global_dc->colorize_source_p = true;
-
   for (int i = 0; i < argc; i++)
     {
       if (0 == strcmp (argv[i].key, "color"))
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
new file mode 100644
index 0000000..89cc95a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -0,0 +1,98 @@
+/* This plugin verifies the source-code location ranges of
+   expressions, at the pre-gimplification tree stage.  */
+/* { dg-options "-O" } */
+
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "toplev.h"
+#include "basic-block.h"
+#include "hash-table.h"
+#include "vec.h"
+#include "ggc.h"
+#include "basic-block.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "tree.h"
+#include "tree-pass.h"
+#include "intl.h"
+#include "plugin-version.h"
+#include "diagnostic.h"
+#include "context.h"
+#include "print-tree.h"
+
+int plugin_is_GPL_compatible;
+
+static void
+emit_warning (location_t loc)
+{
+  source_range src_range = get_range_from_loc (line_table, loc);
+  warning_at (loc, 0,
+	      "tree range %i:%i-%i:%i",
+	      LOCATION_LINE (src_range.m_start),
+	      LOCATION_COLUMN (src_range.m_start),
+	      LOCATION_LINE (src_range.m_finish),
+	      LOCATION_COLUMN (src_range.m_finish));
+}
+
+tree
+cb_walk_tree_fn (tree * tp, int * walk_subtrees,
+		 void * data ATTRIBUTE_UNUSED)
+{
+  if (TREE_CODE (*tp) != CALL_EXPR)
+    return NULL_TREE;
+
+  tree call_expr = *tp;
+  tree fn = CALL_EXPR_FN (call_expr);
+  if (TREE_CODE (fn) != ADDR_EXPR)
+    return NULL_TREE;
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) != FUNCTION_DECL)
+    return NULL_TREE;
+  if (strcmp (IDENTIFIER_POINTER (DECL_NAME (fn)), "__emit_expression_range"))
+    return NULL_TREE;
+
+  /* Get arg 1 (arg 0 is the dummy) and warn at its range.  */
+  tree arg = CALL_EXPR_ARG (call_expr, 1);
+
+  emit_warning (EXPR_LOCATION (arg));
+
+  return NULL_TREE;
+}
+
+static void
+callback (void *gcc_data, void *user_data)
+{
+  tree fndecl = (tree)gcc_data;
+  walk_tree (&DECL_SAVED_TREE (fndecl), cb_walk_tree_fn, NULL, NULL);
+}
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+	     struct plugin_gcc_version *version)
+{
+  struct register_pass_info pass_info;
+  const char *plugin_name = plugin_info->base_name;
+  int argc = plugin_info->argc;
+  struct plugin_argument *argv = plugin_info->argv;
+
+  if (!plugin_default_version_check (version, &gcc_version))
+    return 1;
+
+  register_callback (plugin_name,
+		     PLUGIN_PRE_GENERICIZE,
+		     callback,
+		     NULL);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 941bccc..f1155ee 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -66,6 +66,10 @@ set plugin_test_list [list \
     { diagnostic_plugin_test_show_locus.c \
 	  diagnostic-test-show-locus-bw.c \
 	  diagnostic-test-show-locus-color.c } \
+    { diagnostic_plugin_test_tree_expression_range.c \
+	  diagnostic-test-expressions-1.c } \
+    { diagnostic_plugin_show_trees.c \
+	  diagnostic-test-show-trees-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 140e36f..588d89d 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1130,6 +1130,7 @@ general_init (const char *argv0, bool init_signals)
   linemap_init (line_table, BUILTINS_LOCATION);
   line_table->reallocator = realloc_for_line_map;
   line_table->round_alloc_size = ggc_round_alloc_size;
+  line_table->default_range_bits = 5;
   init_ttree ();
 
   /* Initialize register usage now so switches may override.  */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5d98eec..0c624aa 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6719,10 +6719,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	    continue;
 	  if (d->orig_block == NULL_TREE || block == d->orig_block)
 	    {
-	      if (d->new_block == NULL_TREE)
-		locus = LOCATION_LOCUS (locus);
-	      else
-		locus = COMBINE_LOCATION_DATA (line_table, locus, d->new_block);
+	      locus = set_block (locus, d->new_block);
 	      gimple_phi_arg_set_location (phi, i, locus);
 	    }
 	}
@@ -6782,9 +6779,7 @@ move_block_to_fn (struct function *dest_cfun, basic_block bb,
 	tree block = LOCATION_BLOCK (e->goto_locus);
 	if (d->orig_block == NULL_TREE
 	    || block == d->orig_block)
-	  e->goto_locus = d->new_block ?
-	      COMBINE_LOCATION_DATA (line_table, e->goto_locus, d->new_block) :
-	      LOCATION_LOCUS (e->goto_locus);
+	  e->goto_locus = set_block (e->goto_locus, d->new_block);
       }
 }
 
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 17d97a8..205c869 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2348,10 +2348,7 @@ copy_phis_for_bb (basic_block bb, copy_body_data *id)
 		  tree *n;
 		  n = id->decl_map->get (LOCATION_BLOCK (locus));
 		  gcc_assert (n);
-		  if (*n)
-		    locus = COMBINE_LOCATION_DATA (line_table, locus, *n);
-		  else
-		    locus = LOCATION_LOCUS (locus);
+		  locus = set_block (locus, *n);
 		}
 	      else
 		locus = LOCATION_LOCUS (locus);
diff --git a/gcc/tree.c b/gcc/tree.c
index 50e1db0..1d770c3 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -11789,10 +11789,7 @@ tree_set_block (tree t, tree b)
 
   if (IS_EXPR_CODE_CLASS (c))
     {
-      if (b)
-	t->exp.locus = COMBINE_LOCATION_DATA (line_table, t->exp.locus, b);
-      else
-	t->exp.locus = LOCATION_LOCUS (t->exp.locus);
+      t->exp.locus = set_block (t->exp.locus, b);
     }
   else
     gcc_unreachable ();
@@ -13813,5 +13810,60 @@ nonnull_arg_p (const_tree arg)
   return false;
 }
 
+/* Given location LOC, strip away any packed range information
+   or ad-hoc information.  */
+
+static location_t
+get_pure_location (location_t loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    loc
+      = line_table->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+
+  if (loc >= LINEMAPS_MACRO_LOWEST_LOCATION (line_table))
+    return loc;
+
+  if (loc < RESERVED_LOCATION_COUNT)
+    return loc;
+
+  const line_map *map = linemap_lookup (line_table, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  return loc & ~((1 << ordmap->m_range_bits) - 1);
+}
+
+/* Combine LOC and BLOCK to a combined adhoc loc, retaining any range
+   information.  */
+
+location_t
+set_block (location_t loc, tree block)
+{
+  location_t pure_loc = get_pure_location (loc);
+  source_range src_range = get_range_from_loc (line_table, loc);
+  return COMBINE_LOCATION_DATA (line_table, pure_loc, src_range, block);
+}
+
+void
+set_source_range (tree expr, location_t start, location_t finish)
+{
+  source_range src_range;
+  src_range.m_start = start;
+  src_range.m_finish = finish;
+  set_source_range (expr, src_range);
+}
+
+void
+set_source_range (tree expr, source_range src_range)
+{
+  if (!EXPR_P (expr))
+    return;
+
+  location_t pure_loc = get_pure_location (EXPR_LOCATION (expr));
+  location_t adhoc = COMBINE_LOCATION_DATA (line_table,
+					    pure_loc,
+					    src_range,
+					    NULL);
+  SET_EXPR_LOCATION (expr, adhoc);
+}
 
 #include "gt-tree.h"
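
The mask applied in get_pure_location in the tree.c hunk above, and the
matching decode step in get_range_from_loc, can be tried out in isolation.
What follows is only a minimal standalone sketch, not GCC code: loc_t,
strip_range_bits and packed_finish are hypothetical names, the location is
treated as a plain offset from the start of an ordinary map, and range_bits
is passed in explicitly (this patch sets the default to 5).

```c
#include <assert.h>

/* Hypothetical stand-in for source_location: an offset from the start
   of an ordinary line map.  */
typedef unsigned int loc_t;

/* Clear the low RANGE_BITS of LOC, giving the "pure" location of the
   start of the range.  This mirrors the mask in get_pure_location.  */
static loc_t
strip_range_bits (loc_t loc, unsigned int range_bits)
{
  assert (range_bits < 32);
  return loc & ~((1u << range_bits) - 1);
}

/* Recover the "pure" location of the finish of a packed range.  The low
   RANGE_BITS hold a column offset, and adjacent columns are
   (1 << RANGE_BITS) apart, so the offset is shifted back up.  */
static loc_t
packed_finish (loc_t loc, unsigned int range_bits)
{
  loc_t offset = loc & ((1u << range_bits) - 1);
  return strip_range_bits (loc, range_bits) + (offset << range_bits);
}
```

For the "foo" token used as the worked example in line-map.h (offset
0x17162 with range_bits == 5), strip_range_bits gives 0x17160, the "f", and
packed_finish gives 0x171a0, which is column 13, the final "o".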
diff --git a/gcc/tree.h b/gcc/tree.h
index 1bb59f2..0b9c3b9 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1096,10 +1096,25 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 #define EXPR_FILENAME(NODE) LOCATION_FILE (EXPR_CHECK ((NODE))->exp.locus)
 #define EXPR_LINENO(NODE) LOCATION_LINE (EXPR_CHECK (NODE)->exp.locus)
 
+#define CAN_HAVE_RANGE_P(NODE) (CAN_HAVE_LOCATION_P (NODE))
+#define EXPR_LOCATION_RANGE(NODE) (get_expr_source_range (EXPR_CHECK ((NODE))))
+
+#define EXPR_HAS_RANGE(NODE) \
+    (CAN_HAVE_RANGE_P (NODE) \
+     ? EXPR_LOCATION_RANGE (NODE).m_start != UNKNOWN_LOCATION \
+     : false)
+
 /* True if a tree is an expression or statement that can have a
    location.  */
 #define CAN_HAVE_LOCATION_P(NODE) ((NODE) && EXPR_P (NODE))
 
+static inline source_range
+get_expr_source_range (tree expr)
+{
+  location_t loc = EXPR_LOCATION (expr);
+  return get_range_from_loc (line_table, loc);
+}
+
 extern void protected_set_expr_location (tree, location_t);
 
 /* In a TARGET_EXPR node.  */
@@ -2172,6 +2187,9 @@ extern machine_mode element_mode (const_tree t);
 #define DECL_IS_BUILTIN(DECL) \
   (LOCATION_LOCUS (DECL_SOURCE_LOCATION (DECL)) <= BUILTINS_LOCATION)
 
+#define DECL_LOCATION_RANGE(NODE) \
+  (get_decl_source_range (DECL_MINIMAL_CHECK (NODE)))
+
 /*  For FIELD_DECLs, this is the RECORD_TYPE, UNION_TYPE, or
     QUAL_UNION_TYPE node that the field is a member of.  For VAR_DECL,
     PARM_DECL, FUNCTION_DECL, LABEL_DECL, RESULT_DECL, and CONST_DECL
@@ -5277,10 +5295,25 @@ type_with_alias_set_p (const_tree t)
   return false;
 }
 
+extern location_t set_block (location_t loc, tree block);
+
 extern void gt_ggc_mx (tree &);
 extern void gt_pch_nx (tree &);
 extern void gt_pch_nx (tree &, gt_pointer_operator, void *);
 
 extern bool nonnull_arg_p (const_tree);
 
+extern void
+set_source_range (tree expr, location_t start, location_t finish);
+
+extern void
+set_source_range (tree expr, source_range src_range);
+
+static inline source_range
+get_decl_source_range (tree decl)
+{
+  location_t loc = DECL_SOURCE_LOCATION (decl);
+  return get_range_from_loc (line_table, loc);
+}
+
 #endif  /* GCC_TREE_H  */
diff --git a/libcpp/errors.c b/libcpp/errors.c
index c351c11..8790e10 100644
--- a/libcpp/errors.c
+++ b/libcpp/errors.c
@@ -57,7 +57,7 @@ cpp_diagnostic (cpp_reader * pfile, int level, int reason,
 
   if (!pfile->cb.error)
     abort ();
-  rich_location richloc (src_loc);
+  rich_location richloc (pfile->line_table, src_loc);
   ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
   return ret;
@@ -140,7 +140,7 @@ cpp_diagnostic_with_line (cpp_reader * pfile, int level, int reason,
   
   if (!pfile->cb.error)
     abort ();
-  rich_location richloc (src_loc);
+  rich_location richloc (pfile->line_table, src_loc);
   richloc.override_column (column);
   ret = pfile->cb.error (pfile, level, reason, &richloc, _(msgid), ap);
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index a2bdfa0..f5c2a21 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -237,7 +237,8 @@ struct GTY(()) cpp_identifier {
 /* A preprocessing token.  This has been carefully packed and should
    occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts.  */
 struct GTY(()) cpp_token {
-  source_location src_loc;	/* Location of first char of token.  */
+  source_location src_loc;	/* Location of first char of token,
+				   together with range of full token.  */
   ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT;  /* token type */
   unsigned short flags;		/* flags - see above */
 
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index c9340a6..e7608f1 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -47,7 +47,8 @@ enum lc_reason
 typedef unsigned int linenum_type;
 
 /* The typedef "source_location" is a key within the location database,
-   identifying a source location or macro expansion.
+   identifying a source location or macro expansion, along with range
+   information, and (optionally) a pointer for use by gcc.
 
    This key only has meaning in relation to a line_maps instance.  Within
    gcc there is a single line_maps instance: "line_table", declared in
@@ -69,13 +70,48 @@ typedef unsigned int linenum_type;
              |  ordmap[0]->start_location)   | first line in ordmap 0
   -----------+-------------------------------+-------------------------------
              | ordmap[1]->start_location     | First line in ordmap 1
-             | ordmap[1]->start_location+1   | First column in that line
-             | ordmap[1]->start_location+2   | 2nd column in that line
-             |                               | Subsequent lines are offset by
-             |                               | (1 << column_bits),
-             |                               | e.g. 128 for 7 bits, with a
-             |                               | column value of 0 representing
-             |                               | "the whole line".
+             | ordmap[1]->start_location+32  | First column in that line
+             |   (assuming range_bits == 5)  |
+             | ordmap[1]->start_location+64  | 2nd column in that line
+             | ordmap[1]->start_location+4096| Second line in ordmap 1
+             |   (assuming column_bits == 12)
+             |
+             |   Subsequent lines are offset by (1 << column_bits),
+             |   e.g. 4096 for 12 bits, with a column value of 0 representing
+             |   "the whole line".
+             |
+             |   Within a line, the low "range_bits" (typically 5) are used for
+             |   storing short ranges, so that there's an offset of
+             |     (1 << range_bits) between individual columns within a line,
+             |   typically 32.
+             |   The low range_bits store the offset of the end point from the
+             |   start point, and the start point is found by masking away
+             |   the range bits.
+             |
+             |   For example:
+             |      ordmap[1]->start_location+64    "2nd column in that line"
+             |   above means a caret at that location, with a range
+             |   starting and finishing at the same place (the range bits
+             |   are 0), a range of length 1.
+             |
+             |   By contrast:
+             |      ordmap[1]->start_location+68
+             |   has range bits 0x4, meaning a caret with a range starting at
+             |   that location, but with endpoint 4 columns further on: a range
+             |   of length 5.
+             |
+             |   Ranges that have caret != start, or have an endpoint too
+             |   far away to fit in range_bits are instead stored as ad-hoc
+             |   locations.  Hence for range_bits == 5 we can compactly store
+             |   tokens of length <= 32 without needing to use the ad-hoc
+             |   table.
+             |
+             |   This packing scheme means we effectively have
+             |     (column_bits - range_bits)
+             |   of bits for the columns, typically (12 - 5) = 7, for 128
+             |   columns; longer line widths are accommodated by starting a
+             |   new ordmap with a higher column_bits.
+             |
              | ordmap[2]->start_location-1   | Final location in ordmap 1
   -----------+-------------------------------+-------------------------------
              | ordmap[2]->start_location     | First line in ordmap 2
@@ -127,8 +163,101 @@ typedef unsigned int linenum_type;
   0xffffffff | UINT_MAX                      |
   -----------+-------------------------------+-------------------------------
 
-  To see how this works in practice, see the worked example in
-  libcpp/location-example.txt.  */
+   Examples of location encoding.
+
+   Packed ranges
+   =============
+
+   Consider encoding the location of a token "foo", seen underlined here
+   on line 523, within an ordinary line_map that starts at line 500:
+
+                 11111111112
+        12345678901234567890
+     522
+     523   return foo + bar;
+                  ^~~
+     524
+
+   The location's caret and start are both at line 523, column 11; the
+   location's finish is on the same line, at column 13 (an offset of 2
+   columns, for length 3).
+
+   Line 523 is offset 23 from the starting line of the ordinary line_map.
+
+   caret == start, and the offset of the finish fits within 5 bits, so
+   this can be stored as a packed range.
+
+   This is encoded as:
+      ordmap->start
+         + (line_offset << ordmap->m_column_and_range_bits)
+         + (column << ordmap->m_range_bits)
+         + (range_offset);
+   i.e. (for line offset 23, column 11, range offset 2):
+      ordmap->start
+         + (23 << 12)
+         + (11 << 5)
+         + 2;
+   i.e.:
+      ordmap->start + 0x17162
+   assuming that the line_map uses the default of 7 bits for columns and
+   5 bits for packed range (giving 12 bits for m_column_and_range_bits).
+
+
+   "Pure" locations
+   ================
+
+   These are a special case of the above, where
+      caret == start == finish
+   They are stored as packed ranges with offset == 0.
+   For example, the location of the "f" of "foo" could be stored
+   as above, but with range offset 0, giving:
+      ordmap->start
+         + (23 << 12)
+         + (11 << 5)
+         + 0;
+   i.e.:
+      ordmap->start + 0x17160
+
+
+   Unoptimized ranges
+   ==================
+
+   Consider encoding the location of the binary expression
+   below:
+
+                 11111111112
+        12345678901234567890
+     522
+     523   return foo + bar;
+                  ~~~~^~~~~
+     524
+
+   The location's caret is at the "+", line 523 column 15, but starts
+   earlier, at the "f" of "foo" at column 11.  The finish is at the "r"
+   of "bar" at column 19.
+
+   This can't be stored as a packed range since start != caret.
+   Hence it is stored as an ad-hoc location e.g. 0x80000003.
+
+   Stripping off the top bit gives us an index into the ad-hoc
+   lookaside table:
+
+     line_table->location_adhoc_data_map.data[0x3]
+
+   from which the caret, start and finish can be looked up,
+   encoded as "pure" locations:
+
+     start  == ordmap->start + (23 << 12) + (11 << 5)
+            == ordmap->start + 0x17160  (as above; the "f" of "foo")
+
+     caret  == ordmap->start + (23 << 12) + (15 << 5)
+            == ordmap->start + 0x171e0
+
+     finish == ordmap->start + (23 << 12) + (19 << 5)
+            == ordmap->start + 0x17260
+
+   To further see how source_location works in practice, see the
+   worked example in libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
 /* A range of source locations.
@@ -217,8 +346,9 @@ struct GTY((tag ("0"), desc ("%h.reason == LC_ENTER_MACRO ? 2 : 1"))) line_map {
    
    Physical source file TO_FILE at line TO_LINE at column 0 is represented
    by the logical START_LOCATION.  TO_LINE+L at column C is represented by
-   START_LOCATION+(L*(1<<column_bits))+C, as long as C<(1<<column_bits),
-   and the result_location is less than the next line_map's start_location.
+   START_LOCATION+(L*(1<<m_column_and_range_bits))+(C*(1<<m_range_bits)), as
+   long as C<(1<<effective column bits), and the result_location is less than
+   the next line_map's start_location.
    (The top line is line 1 and the leftmost column is column 1; line/column 0
    means "entire file/line" or "unknown line/column" or "not applicable".)
 
@@ -238,8 +368,24 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
      cpp_buffer.  */
   unsigned char sysp;
 
-  /* Number of the low-order source_location bits used for a column number.  */
-  unsigned int column_bits : 8;
+  /* Number of the low-order source_location bits used for column numbers
+     and ranges.  */
+  unsigned int m_column_and_range_bits : 8;
+
+  /* Number of the low-order "column" bits used for storing short ranges
+     inline, rather than in the ad-hoc table.
+     MSB                                                                 LSB
+     31                                                                    0
+     +-------------------------+-------------------------------------------+
+     |                         |<---map->column_and_range_bits (e.g. 12)-->|
+     +-------------------------+-----------------------+-------------------+
+     |                         | column_and_range_bits | map->range_bits   |
+     |                         |   - range_bits        |                   |
+     +-------------------------+-----------------------+-------------------+
+     | row bits                | effective column bits | short range bits  |
+     |                         |    (e.g. 7)           |   (e.g. 5)        |
+     +-------------------------+-----------------------+-------------------+ */
+  unsigned int m_range_bits : 8;
 };
 
 /* This is the highest possible source location encoded within an
@@ -435,15 +581,6 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
-/* Get the number of the low-order source_location bits used for a
-   column number within ordinary map MAP.  */
-
-inline unsigned char
-ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (const line_map_ordinary *ord_map)
-{
-  return ord_map->column_bits;
-}
-
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -524,9 +661,11 @@ struct GTY(()) maps_info_macro {
   unsigned int cache;
 };
 
-/* Data structure to associate an arbitrary data to a source location.  */
+/* Data structure to associate a source_range and an arbitrary data
+   pointer with a source location.  */
 struct GTY(()) location_adhoc_data {
   source_location locus;
+  source_range src_range;
   void * GTY((skip)) data;
 };
 
@@ -588,6 +727,12 @@ struct GTY(()) line_maps {
 
   /* True if we've seen a #line or # 44 "file" directive.  */
   bool seen_line_directive;
+
+  /* The default value of range_bits in ordinary line maps.  */
+  unsigned int default_range_bits;
+
+  unsigned int num_optimized_ranges;
+  unsigned int num_unoptimized_ranges;
 };
 
 /* Returns the number of allocated maps so far. MAP_KIND shall be TRUE
@@ -825,11 +970,15 @@ LINEMAPS_LAST_ALLOCATED_MACRO_MAP (const line_maps *set)
 
 extern void location_adhoc_data_fini (struct line_maps *);
 extern source_location get_combined_adhoc_loc (struct line_maps *,
-					       source_location, void *);
+					       source_location,
+					       source_range,
+					       void *);
 extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
 extern source_location get_location_from_adhoc_loc (struct line_maps *,
 						    source_location);
 
+extern source_range get_range_from_loc (line_maps *set, source_location loc);
+
 /* Get whether location LOC is an ad-hoc location.  */
 
 inline bool
@@ -838,14 +987,21 @@ IS_ADHOC_LOC (source_location loc)
   return (loc & MAX_SOURCE_LOCATION) != loc;
 }
 
+/* Return true if location LOC is a "pure" location; return false if
+   it is an ad-hoc location or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc);
+
 /* Combine LOC and BLOCK, giving a combined adhoc location.  */
 
 inline source_location
 COMBINE_LOCATION_DATA (struct line_maps *set,
 		       source_location loc,
+		       source_range src_range,
 		       void *block)
 {
-  return get_combined_adhoc_loc (set, loc, block);
+  return get_combined_adhoc_loc (set, loc, src_range, block);
 }
 
 extern void rebuild_location_adhoc_htab (struct line_maps *);
@@ -931,7 +1087,7 @@ inline linenum_type
 SOURCE_LINE (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-	  >> ord_map->column_bits) + ord_map->to_line;
+	  >> ord_map->m_column_and_range_bits) + ord_map->to_line;
 }
 
 /* Convert a map and source_location to source column number.  */
@@ -939,7 +1095,7 @@ inline linenum_type
 SOURCE_COLUMN (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-	  & ((1 << ord_map->column_bits) - 1));
+	  & ((1 << ord_map->m_column_and_range_bits) - 1)) >> ord_map->m_range_bits;
 }
 
 /* Return the location of the last source line within an ordinary
@@ -949,7 +1105,7 @@ LAST_SOURCE_LINE_LOCATION (const line_map_ordinary *map)
 {
   return (((map[1].start_location - 1
 	    - map->start_location)
-	   & ~((1 << map->column_bits) - 1))
+	   & ~((1 << map->m_column_and_range_bits) - 1))
 	  + map->start_location);
 }
 
@@ -999,7 +1155,8 @@ linemap_position_for_column (struct line_maps *, unsigned int);
 /* Encode and return a source location from a given line and
    column.  */
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *,
+linemap_position_for_line_and_column (line_maps *set,
+				      const line_map_ordinary *,
 				      linenum_type, unsigned int);
 
 /* Encode and return a source_location starting from location LOC and
@@ -1187,7 +1344,7 @@ class rich_location
   /* Constructors.  */
 
   /* Constructing from a location.  */
-  rich_location (source_location loc);
+  rich_location (line_maps *set, source_location loc);
 
   /* Constructing from a source_range.  */
   rich_location (source_range src_range);
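
The arithmetic in the "Examples of location encoding" comment in line-map.h
above can be checked with a small standalone program.  This is a hedged
sketch rather than the libcpp implementation: struct toy_ordmap and the
toy_* functions are hypothetical names, and values are offsets relative to
the map's start location instead of real source_location values.

```c
#include <assert.h>

/* Hypothetical stand-in for an ordinary line map; only the two bit
   counts introduced by the patch are modelled.  */
struct toy_ordmap
{
  unsigned int column_and_range_bits;	/* e.g. 12 */
  unsigned int range_bits;		/* e.g. 5 */
};

/* Encode a packed range as an offset from the map's start location.  */
static unsigned int
toy_pack (const struct toy_ordmap *map, unsigned int line_offset,
	  unsigned int column, unsigned int range_offset)
{
  unsigned int column_bits = map->column_and_range_bits - map->range_bits;
  assert (column < (1u << column_bits));
  assert (range_offset < (1u << map->range_bits));
  return ((line_offset << map->column_and_range_bits)
	  + (column << map->range_bits)
	  + range_offset);
}

/* Line offset within the map: like SOURCE_LINE, minus to_line.  */
static unsigned int
toy_line_offset (const struct toy_ordmap *map, unsigned int offset)
{
  return offset >> map->column_and_range_bits;
}

/* Column number: like the patched SOURCE_COLUMN, which shifts the
   packed range bits away after masking.  */
static unsigned int
toy_column (const struct toy_ordmap *map, unsigned int offset)
{
  return ((offset & ((1u << map->column_and_range_bits) - 1))
	  >> map->range_bits);
}
```

With column_and_range_bits == 12 and range_bits == 5, toy_pack (&map, 23,
11, 2) reproduces the 0x17162 from the comment, and toy_line_offset and
toy_column recover 23 and 11 from it; the packed range offset does not
disturb the line or column lookup.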
diff --git a/libcpp/lex.c b/libcpp/lex.c
index 7e97bc2..d9b428a 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -2723,6 +2723,19 @@ _cpp_lex_direct (cpp_reader *pfile)
       break;
     }
 
+  source_range tok_range;
+  tok_range.m_start = result->src_loc;
+  if (result->src_loc >= RESERVED_LOCATION_COUNT)
+    tok_range.m_finish
+      = linemap_position_for_column (pfile->line_table,
+				     CPP_BUF_COLUMN (buffer, buffer->cur));
+  else
+    tok_range.m_finish = tok_range.m_start;
+
+  result->src_loc = COMBINE_LOCATION_DATA (pfile->line_table,
+					   result->src_loc,
+					   tok_range, NULL);
+
   return result;
 }
 
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 3c19f93..c5aa422 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -27,9 +27,9 @@ along with this program; see the file COPYING3.  If not see
 #include "hashtab.h"
 
 /* Do not track column numbers higher than this one.  As a result, the
-   range of column_bits is [7, 18] (or 0 if column numbers are
+   range of column_bits is [12, 18] (or 0 if column numbers are
    disabled).  */
-const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17);
+const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 12);
 
 /* Do not track column numbers if locations get higher than this.  */
 const source_location LINE_MAP_MAX_LOCATION_WITH_COLS = 0x60000000;
@@ -46,7 +46,7 @@ static const line_map_macro* linemap_macro_map_lookup (struct line_maps *,
 static source_location linemap_macro_map_loc_to_def_point
 (const line_map_macro *, source_location);
 static source_location linemap_macro_map_loc_unwind_toward_spelling
-(const line_map_macro *, source_location);
+(line_maps *set, const line_map_macro *, source_location);
 static source_location linemap_macro_map_loc_to_exp_point
 (const line_map_macro *, source_location);
 static source_location linemap_macro_loc_to_spelling_point
@@ -69,7 +69,10 @@ location_adhoc_data_hash (const void *l)
 {
   const struct location_adhoc_data *lb =
       (const struct location_adhoc_data *) l;
-  return (hashval_t) lb->locus + (size_t) lb->data;
+  return ((hashval_t) lb->locus
+	  + (hashval_t) lb->src_range.m_start
+	  + (hashval_t) lb->src_range.m_finish
+	  + (size_t) lb->data);
 }
 
 /* Compare function for location_adhoc_data hashtable.  */
@@ -81,7 +84,10 @@ location_adhoc_data_eq (const void *l1, const void *l2)
       (const struct location_adhoc_data *) l1;
   const struct location_adhoc_data *lb2 =
       (const struct location_adhoc_data *) l2;
-  return lb1->locus == lb2->locus && lb1->data == lb2->data;
+  return (lb1->locus == lb2->locus
+	  && lb1->src_range.m_start == lb2->src_range.m_start
+	  && lb1->src_range.m_finish == lb2->src_range.m_finish
+	  && lb1->data == lb2->data);
 }
 
 /* Update the hashtable when location_adhoc_data is reallocated.  */
@@ -106,23 +112,103 @@ rebuild_location_adhoc_htab (struct line_maps *set)
 		    set->location_adhoc_data_map.data + i, INSERT);
 }
 
+/* Helper function for get_combined_adhoc_loc.
+   Return true if the given LOCUS + SRC_RANGE and DATA pointer can be stored
+   compactly within a source_location, without needing an ad-hoc location.  */
+
+static bool
+can_be_stored_compactly_p (struct line_maps *set,
+			   source_location locus,
+			   source_range src_range,
+			   void *data)
+{
+  /* If there's an ad-hoc pointer, we can't store it directly in the
+     source_location, we need the lookaside.  */
+  if (data)
+    return false;
+
+  /* We only store ranges that begin at the locus and that are sufficiently
+     "sane".  */
+  if (src_range.m_start != locus)
+    return false;
+
+  if (src_range.m_finish < src_range.m_start)
+    return false;
+
+  if (src_range.m_start < RESERVED_LOCATION_COUNT)
+    return false;
+
+  if (locus >= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return false;
+
+  /* All 3 locations must be within ordinary maps, typically, the same
+     ordinary map.  */
+  source_location lowest_macro_loc = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (locus >= lowest_macro_loc)
+    return false;
+  if (src_range.m_start >= lowest_macro_loc)
+    return false;
+  if (src_range.m_finish >= lowest_macro_loc)
+    return false;
+
+  /* Passed all tests.  */
+  return true;
+}
+
 /* Combine LOCUS and DATA to a combined adhoc loc.  */
 
 source_location
 get_combined_adhoc_loc (struct line_maps *set,
-			source_location locus, void *data)
+			source_location locus,
+			source_range src_range,
+			void *data)
 {
   struct location_adhoc_data lb;
   struct location_adhoc_data **slot;
 
-  linemap_assert (data);
-
   if (IS_ADHOC_LOC (locus))
     locus
       = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
     return 0;
+
+  /* Any ordinary locations ought to be "pure" at this point: no
+     compressed ranges.  */
+  linemap_assert (locus < RESERVED_LOCATION_COUNT
+		  || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
+		  || locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)
+		  || pure_location_p (set, locus));
+
+  /* Consider short-range optimization.  */
+  if (can_be_stored_compactly_p (set, locus, src_range, data))
+    {
+      /* The low bits ought to be clear.  */
+      linemap_assert (pure_location_p (set, locus));
+      const line_map *map = linemap_lookup (set, locus);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      unsigned int int_diff = src_range.m_finish - src_range.m_start;
+      unsigned int col_diff = (int_diff >> ordmap->m_range_bits);
+      if (col_diff < (1U << ordmap->m_range_bits))
+	{
+	  source_location packed = locus | col_diff;
+	  set->num_optimized_ranges++;
+	  return packed;
+	}
+    }
+
+  /* We can also compactly store the reserved locations
+     when locus == start == finish (and data is NULL).  */
+  if (locus < RESERVED_LOCATION_COUNT
+      && locus == src_range.m_start
+      && locus == src_range.m_finish
+      && !data)
+    return locus;
+
+  if (!data)
+    set->num_unoptimized_ranges++;
+
   lb.locus = locus;
+  lb.src_range = src_range;
   lb.data = data;
   slot = (struct location_adhoc_data **)
       htab_find_slot (set->location_adhoc_data_map.htab, &lb, INSERT);
@@ -177,6 +263,60 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 }
 
+/* Return the source_range for adhoc location LOC.  */
+
+static source_range
+get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
+{
+  linemap_assert (IS_ADHOC_LOC (loc));
+  return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
+}
+
+/* Get the source_range of location LOC, either from the ad-hoc
+   lookaside table, or embedded inside LOC itself.  */
+
+source_range
+get_range_from_loc (struct line_maps *set,
+		    source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (set, loc);
+
+  /* For ordinary maps, extract packed range.  */
+  if (loc >= RESERVED_LOCATION_COUNT
+      && loc < LINEMAPS_MACRO_LOWEST_LOCATION (set)
+      && loc <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      const line_map *map = linemap_lookup (set, loc);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      source_range result;
+      int offset = loc & ((1 << ordmap->m_range_bits) - 1);
+      result.m_start = loc - offset;
+      result.m_finish = result.m_start + (offset << ordmap->m_range_bits);
+      return result;
+    }
+
+  return source_range::from_location (loc);
+}
+
+/* Return true if location LOC is a "pure" location; return false if
+   it is an ad-hoc location or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return false;
+
+  const line_map *map = linemap_lookup (set, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  if (loc & ((1U << ordmap->m_range_bits) - 1))
+    return false;
+
+  return true;
+}
+
 /* Finalize the location_adhoc_data structure.  */
 void
 location_adhoc_data_fini (struct line_maps *set)
@@ -319,7 +459,19 @@ const struct line_map *
 linemap_add (struct line_maps *set, enum lc_reason reason,
 	     unsigned int sysp, const char *to_file, linenum_type to_line)
 {
-  source_location start_location = set->highest_location + 1;
+  /* Generate a start_location above the current highest_location.
+     If possible, make the low range bits be zero.  */
+  source_location start_location;
+  if (set->highest_location < LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      start_location = set->highest_location + (1 << set->default_range_bits);
+      if (set->default_range_bits)
+	start_location &= ~((1 << set->default_range_bits) - 1);
+      linemap_assert (0 == (start_location
+			    & ((1 << set->default_range_bits) - 1)));
+    }
+  else
+    start_location = set->highest_location + 1;
 
   linemap_assert (!(LINEMAPS_ORDINARY_USED (set)
 		    && (start_location
@@ -398,11 +550,18 @@ linemap_add (struct line_maps *set, enum lc_reason reason,
   map->to_file = to_file;
   map->to_line = to_line;
   LINEMAPS_ORDINARY_CACHE (set) = LINEMAPS_ORDINARY_USED (set) - 1;
-  map->column_bits = 0;
+  map->m_column_and_range_bits = 0;
+  map->m_range_bits = 0;
   set->highest_location = start_location;
   set->highest_line = start_location;
   set->max_column_hint = 0;
 
+  /* This assertion is placed after set->highest_location has
+     been updated, since the latter affects
+     linemap_location_from_macro_expansion_p, which ultimately affects
+     pure_location_p.  */
+  linemap_assert (pure_location_p (set, start_location));
+
   if (reason == LC_ENTER)
     {
       map->included_from =
@@ -549,13 +708,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
     SOURCE_LINE (map, set->highest_line);
   int line_delta = to_line - last_line;
   bool add_map = false;
+  linemap_assert (map->m_column_and_range_bits >= map->m_range_bits);
+  int effective_column_bits = map->m_column_and_range_bits - map->m_range_bits;
 
   if (line_delta < 0
       || (line_delta > 10
-	  && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
-      || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
-      || (max_column_hint <= 80
-	  && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
+	  && line_delta * map->m_column_and_range_bits > 1000)
+      || (max_column_hint >= (1U << effective_column_bits))
+      || (max_column_hint <= 80 && effective_column_bits >= 10)
       || (highest > LINE_MAP_MAX_LOCATION_WITH_COLS
 	  && (set->max_column_hint || highest >= LINE_MAP_MAX_SOURCE_LOCATION)))
     add_map = true;
@@ -564,22 +724,27 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (add_map)
     {
       int column_bits;
+      int range_bits;
       if (max_column_hint > LINE_MAP_MAX_COLUMN_NUMBER
 	  || highest > LINE_MAP_MAX_LOCATION_WITH_COLS)
 	{
 	  /* If the column number is ridiculous or we've allocated a huge
-	     number of source_locations, give up on column numbers. */
+	     number of source_locations, give up on column numbers
+	     (and on packed ranges).  */
 	  max_column_hint = 0;
 	  column_bits = 0;
+	  range_bits = 0;
 	  if (highest > LINE_MAP_MAX_SOURCE_LOCATION)
 	    return 0;
 	}
       else
 	{
 	  column_bits = 7;
+	  range_bits = set->default_range_bits;
 	  while (max_column_hint >= (1U << column_bits))
 	    column_bits++;
 	  max_column_hint = 1U << column_bits;
+	  column_bits += range_bits;
 	}
       /* Allocate the new line_map.  However, if the current map only has a
 	 single line we can sometimes just increase its column_bits instead. */
@@ -592,14 +757,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
 				ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
 				ORDINARY_MAP_FILE_NAME (map),
 				to_line)));
-      map->column_bits = column_bits;
+      map->m_column_and_range_bits = column_bits;
+      map->m_range_bits = range_bits;
       r = (MAP_START_LOCATION (map)
 	   + ((to_line - ORDINARY_MAP_STARTING_LINE_NUMBER (map))
 	      << column_bits));
     }
   else
-    r = highest - SOURCE_COLUMN (map, highest)
-      + (line_delta << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+    r = set->highest_line + (line_delta << map->m_column_and_range_bits);
 
   /* Locations of ordinary tokens are always lower than locations of
      macro tokens.  */
@@ -610,6 +775,18 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (r > set->highest_location)
     set->highest_location = r;
   set->max_column_hint = max_column_hint;
+
+  /* At this point, we expect one of:
+     (a) the normal case: a "pure" location with 0 range bits, or
+     (b) we've gone past LINE_MAP_MAX_LOCATION_WITH_COLS so can't track
+        columns anymore (or ranges), or
+     (c) we're in a region with a column hint exceeding
+        LINE_MAP_MAX_COLUMN_NUMBER, so column-tracking is off,
+	with column_bits == 0.  */
+  linemap_assert (pure_location_p (set, r)
+		  || r >= LINE_MAP_MAX_LOCATION_WITH_COLS
+		  || map->m_column_and_range_bits == 0);
+  linemap_assert (SOURCE_LINE (map, r) == to_line);
   return r;
 }
 
@@ -640,7 +817,8 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
 	  r = linemap_line_start (set, SOURCE_LINE (map, r), to_column + 50);
 	}
     }
-  r = r + to_column;
+  line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (set);
+  r = r + (to_column << map->m_range_bits);
   if (r >= set->highest_location)
     set->highest_location = r;
   return r;
@@ -650,16 +828,25 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
    column.  */
 
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *ord_map,
+linemap_position_for_line_and_column (line_maps *set,
+				      const line_map_ordinary *ord_map,
 				      linenum_type line,
 				      unsigned column)
 {
   linemap_assert (ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map) <= line);
 
-  return (MAP_START_LOCATION (ord_map)
-	  + ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
-	     << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map))
-	  + (column & ((1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map)) - 1)));
+  source_location r = MAP_START_LOCATION (ord_map);
+  r += ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
+	<< ord_map->m_column_and_range_bits);
+  if (r <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    r += ((column & ((1 << ord_map->m_column_and_range_bits) - 1))
+	  << ord_map->m_range_bits);
+  source_location upper_limit = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (r >= upper_limit)
+    r = upper_limit - 1;
+  if (r > set->highest_location)
+    set->highest_location = r;
+  return r;
 }
 
 /* Encode and return a source_location starting from location LOC and
@@ -673,6 +860,9 @@ linemap_position_for_loc_and_offset (struct line_maps *set,
 {
   const line_map_ordinary * map = NULL;
 
+  if (IS_ADHOC_LOC (loc))
+    loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+
   /* This function does not support virtual locations yet.  */
   if (linemap_assert_fails
       (!linemap_location_from_macro_expansion_p (set, loc)))
@@ -711,11 +901,11 @@ linemap_position_for_loc_and_offset (struct line_maps *set,
     }
 
   offset += column;
-  if (linemap_assert_fails (offset < (1u << map->column_bits)))
+  if (linemap_assert_fails (offset < (1u << map->m_column_and_range_bits)))
     return loc;
 
   source_location r = 
-    linemap_position_for_line_and_column (map, line, offset);
+    linemap_position_for_line_and_column (set, map, line, offset);
   if (linemap_assert_fails (r <= set->highest_location)
       || linemap_assert_fails (map == linemap_lookup (set, r)))
     return loc;
@@ -893,14 +1083,19 @@ linemap_macro_map_loc_to_def_point (const line_map_macro *map,
    In other words, this returns the xI location presented in the
    comments of line_map_macro above.  */
 source_location
-linemap_macro_map_loc_unwind_toward_spelling (const line_map_macro* map,
+linemap_macro_map_loc_unwind_toward_spelling (line_maps *set,
+					      const line_map_macro* map,
 					      source_location location)
 {
   unsigned token_no;
 
+  if (IS_ADHOC_LOC (location))
+    location = get_location_from_adhoc_loc (set, location);
+
   linemap_assert (linemap_macro_expansion_map_p (map)
 		  && location >= MAP_START_LOCATION (map));
   linemap_assert (location >= RESERVED_LOCATION_COUNT);
+  linemap_assert (!IS_ADHOC_LOC (location));
 
   token_no = location - MAP_START_LOCATION (map);
   linemap_assert (token_no < MACRO_MAP_NUM_MACRO_TOKENS (map));
@@ -1010,7 +1205,7 @@ linemap_location_in_system_header_p (struct line_maps *set,
 
 	      /* It's a token resulting from a macro expansion.  */
 	      source_location loc =
-		linemap_macro_map_loc_unwind_toward_spelling (macro_map, location);
+		linemap_macro_map_loc_unwind_toward_spelling (set, macro_map, location);
 	      if (loc < RESERVED_LOCATION_COUNT)
 		/* This token might come from a built-in macro.  Let's
 		   look at where that macro got expanded.  */
@@ -1183,11 +1378,6 @@ linemap_macro_loc_to_spelling_point (struct line_maps *set,
 				     const line_map_ordinary **original_map)
 {
   struct line_map *map;
-
-  if (IS_ADHOC_LOC (location))
-    location = set->location_adhoc_data_map.data[location
-						 & MAX_SOURCE_LOCATION].locus;
-
   linemap_assert (set && location >= RESERVED_LOCATION_COUNT);
 
   while (true)
@@ -1198,7 +1388,7 @@ linemap_macro_loc_to_spelling_point (struct line_maps *set,
 
       location
 	= linemap_macro_map_loc_unwind_toward_spelling
-	    (linemap_check_macro (map),
+	    (set, linemap_check_macro (map),
 	     location);
     }
 
@@ -1341,10 +1531,11 @@ linemap_resolve_location (struct line_maps *set,
 			  enum location_resolution_kind lrk,
 			  const line_map_ordinary **map)
 {
+  source_location locus = loc;
   if (IS_ADHOC_LOC (loc))
-    loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+    locus = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 
-  if (loc < RESERVED_LOCATION_COUNT)
+  if (locus < RESERVED_LOCATION_COUNT)
     {
       /* A reserved location wasn't encoded in a map.  Let's return a
 	 NULL map here, just like what linemap_ordinary_map_lookup
@@ -1396,7 +1587,7 @@ linemap_unwind_toward_expansion (struct line_maps *set,
     loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 
   resolved_location =
-    linemap_macro_map_loc_unwind_toward_spelling (macro_map, loc);
+    linemap_macro_map_loc_unwind_toward_spelling (set, macro_map, loc);
   resolved_map = linemap_lookup (set, resolved_location);
 
   if (!linemap_macro_expansion_map_p (resolved_map))
@@ -1478,9 +1669,9 @@ linemap_expand_location (struct line_maps *set,
   memset (&xloc, 0, sizeof (xloc));
   if (IS_ADHOC_LOC (loc))
     {
-      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
       xloc.data
 	= set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].data;
+      loc = set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
     }
 
   if (loc < RESERVED_LOCATION_COUNT)
@@ -1760,13 +1951,14 @@ line_table_dump (FILE *stream, struct line_maps *set, unsigned int num_ordinary,
 
 /* Construct a rich_location with location LOC as its initial range.  */
 
-rich_location::rich_location (source_location loc) :
+rich_location::rich_location (line_maps *set, source_location loc) :
   m_loc (loc),
   m_num_ranges (0),
   m_have_expanded_location (false)
 {
-  /* Set up the 0th range: */
-  add_range (loc, loc, true);
+  /* Set up the 0th range, extracting any range from LOC.  */
+  source_range src_range = get_range_from_loc (set, loc);
+  add_range (src_range, true);
   m_ranges[0].m_caret = lazily_expand_location ();
 }
 
diff --git a/libcpp/location-example.txt b/libcpp/location-example.txt
index a5f95b2..14b5c2e 100644
--- a/libcpp/location-example.txt
+++ b/libcpp/location-example.txt
@@ -30,142 +30,154 @@ RESERVED LOCATIONS
   source_location interval: 0 <= loc < 2
 
 ORDINARY MAP: 0
-  source_location interval: 2 <= loc < 3
+  source_location interval: 32 <= loc < 64
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc:    2|#include "test.h"
-                    |00000001111111111
-                    |34567890123456789
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:   32|#include "test.h"
+                    |69269258258148147
+                    |46802468024680246
 
 ORDINARY MAP: 1
-  source_location interval: 3 <= loc < 4
+  source_location interval: 64 <= loc < 96
   file: <built-in>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 2
-  source_location interval: 4 <= loc < 5
+  source_location interval: 96 <= loc < 128
   file: <command-line>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 3
-  source_location interval: 5 <= loc < 5005
+  source_location interval: 128 <= loc < 160128
   file: /usr/include/stdc-predef.h
   starting at line: 1
-  column bits: 7
+  column bits: 12
+  range bits: 5
 (contents of /usr/include/stdc-predef.h snipped for brevity)
 
 ORDINARY MAP: 4
-  source_location interval: 5005 <= loc < 5006
+  source_location interval: 160128 <= loc < 160160
   file: <command-line>
-  starting at line: 1
-  column bits: 7
+  starting at line: 32
+  column bits: 12
+  range bits: 5
 
 ORDINARY MAP: 5
-  source_location interval: 5006 <= loc < 5134
+  source_location interval: 160160 <= loc < 164256
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc: 5006|#include "test.h"
-                    |55555555555555555
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:160160|#include "test.h"
                     |00000000000000000
-                    |00011111111112222
-                    |78901234567890123
+                    |12223334445556667
+                    |92582581481470470
+                    |24680246802468024
 
 ORDINARY MAP: 6
-  source_location interval: 5134 <= loc < 5416
+  source_location interval: 164256 <= loc < 173280
   file: test.h
   starting at line: 1
-  column bits: 7
-test.h:  1|loc: 5134|extern int foo ();
-                    |555555555555555555
-                    |111111111111111111
-                    |333334444444444555
-                    |567890123456789012
-test.h:  2|loc: 5262|
+  column bits: 12
+  range bits: 5
+test.h:  1|loc:164256|extern int foo ();
+                    |444444444444444444
+                    |233344455566677788
+                    |825814814704703603
+                    |802468024680246802
+test.h:  2|loc:168352|
                     |
                     |
                     |
                     |
-test.h:  3|loc: 5390|#define PLUS(A, B) A + B
-                    |555555555555555555555555
-                    |333333333444444444444444
-                    |999999999000000000011111
-                    |123456789012345678901234
+test.h:  3|loc:172448|#define PLUS(A, B) A + B
+                    |222222222222222223333333
+                    |455566677788889990001112
+                    |814704703603692692582581
+                    |024680246802468024680246
 
 ORDINARY MAP: 7
-  source_location interval: 5416 <= loc < 6314
+  source_location interval: 173280 <= loc < 202016
   file: test.c
   starting at line: 2
-  column bits: 7
-test.c:  2|loc: 5416|
+  column bits: 12
+  range bits: 5
+test.c:  2|loc:173280|
                     |
                     |
                     |
                     |
-test.c:  3|loc: 5544|int
-                    |555
-                    |555
+test.c:  3|loc:177376|int
+                    |777
                     |444
-                    |567
-test.c:  4|loc: 5672|main (int argc, char **argv)
-                    |5555555555555555555555555555
-                    |6666666666666666666666666667
-                    |7777777888888888899999999990
-                    |3456789012345678901234567890
-test.c:  5|loc: 5800|{
+                    |047
+                    |802
+test.c:  4|loc:181472|main (int argc, char **argv)
+                    |1111111111111111222222222222
+                    |5556666777888999000111222333
+                    |0360369269258258148147047036
+                    |4680246802468024680246802468
+test.c:  5|loc:185568|{
                     |5
-                    |8
-                    |0
-                    |1
-test.c:  6|loc: 5928|  int a = PLUS (1,2);
-                    |555555555555555555555
-                    |999999999999999999999
-                    |233333333334444444444
-                    |901234567890123456789
-test.c:  7|loc: 6056|  int b = PLUS (3,4);
-                    |666666666666666666666
-                    |000000000000000000000
-                    |555666666666677777777
-                    |789012345678901234567
-test.c:  8|loc: 6184|  return 0;
-                    |66666666666
-                    |11111111111
-                    |88888999999
-                    |56789012345
-test.c:  9|loc: 6312|}
                     |6
-                    |3
+                    |0
+                    |0
+test.c:  6|loc:189664|  int a = PLUS (1,2);
+                    |999999999900000000000
+                    |677788899900011122233
+                    |926925825814814704703
+                    |680246802468024680246
+test.c:  7|loc:193760|  int b = PLUS (3,4);
+                    |333333344444444444444
+                    |788899900011122233344
+                    |925825814814704703603
+                    |246802468024680246802
+test.c:  8|loc:197856|  return 0;
+                    |77778888888
+                    |89990001112
+                    |82581481470
+                    |80246802468
+test.c:  9|loc:201952|}
                     |1
-                    |3
+                    |9
+                    |8
+                    |4
 
 UNALLOCATED LOCATIONS
-  source_location interval: 6314 <= loc < 2147483633
+  source_location interval: 202016 <= loc < 2147483633
 
 MACRO 1: PLUS (7 tokens)
   source_location interval: 2147483633 <= loc < 2147483640
-test.c:7:11: note: expansion point is location 6067
+test.c:7:11: note: expansion point is location 194115
    int b = PLUS (3,4);
-           ^
+           ^~~~
+
   map->start_location: 2147483633
   macro_locations:
-    0: 6073, 5410
-test.c:7:17: note: token 0 has x-location == 6073
+    0: 194304, 173088
+test.c:7:17: note: token 0 has x-location == 194304
    int b = PLUS (3,4);
                  ^
-test.c:7:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:7:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 6075, 5414
-test.c:7:19: note: token 2 has x-location == 6075
+
+    2: 194368, 173216
+test.c:7:19: note: token 2 has x-location == 194368
    int b = PLUS (3,4);
                    ^
-test.c:7:19: note: token 2 has y-location == 5414
+
+test.c:7:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
@@ -178,26 +190,30 @@ x-location == y-location == 2947526575 encodes token # 800042942
 
 MACRO 0: PLUS (7 tokens)
   source_location interval: 2147483640 <= loc < 2147483647
-test.c:6:11: note: expansion point is location 5939
+test.c:6:11: note: expansion point is location 190019
    int a = PLUS (1,2);
-           ^
+           ^~~~
+
   map->start_location: 2147483640
   macro_locations:
-    0: 5945, 5410
-test.c:6:17: note: token 0 has x-location == 5945
+    0: 190208, 173088
+test.c:6:17: note: token 0 has x-location == 190208
    int a = PLUS (1,2);
                  ^
-test.c:6:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:6:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 5947, 5414
-test.c:6:19: note: token 2 has x-location == 5947
+
+    2: 190272, 173216
+test.c:6:19: note: token 2 has x-location == 190272
    int a = PLUS (1,2);
                    ^
-test.c:6:19: note: token 2 has y-location == 5414
+
+test.c:6:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
-- 
1.8.5.3



* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
  2015-09-25 20:42       ` Manuel López-Ibáñez
  2015-09-27 14:19       ` Dodji Seketeli
@ 2015-12-29 20:55       ` Mike Stump
  2016-01-06 15:37         ` David Malcolm
  2 siblings, 1 reply; 83+ messages in thread
From: Mike Stump @ 2015-12-29 20:55 UTC (permalink / raw)
  To: David Malcolm; +Cc: GCC Patches

On Sep 25, 2015, at 1:11 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> +layout::layout (diagnostic_context * context,
>> +		const diagnostic_info *diagnostic)
>> 
>> [...]
>> 
>> +      if (loc_range->m_finish.file != m_exploc.file)
>> +	continue;
>> +      if (loc_range->m_show_caret_p)
>> +	if (loc_range->m_finish.file != m_exploc.file)
>> +	  continue;
>> 
>> I think the second if clause is redundant.
> 
> Good catch; thanks.

And one other nit.  You don’t validate that the range finishes on or after the start.  Later in the code, you require this to be the case:

bool
layout_range::contains_point (int row, int column) const
{
  gcc_assert (m_start.m_line <= m_finish.m_line);

This code cannot tolerate a range that finishes before it starts.  So either such a range should never be generated or, if it is generated, we should at least declare it awkward.  The patch below declares it awkward.  Without this, we ICE on completely sane and normal code:

  #define max(i, j) sel((j), (i), (i) < (j))
  yu = max(a2, a3);

giving a valid warning.  In the code, we start on the last line, and finish on the first line.  The underlying problem is that we don’t track macro contexts properly.  sel is a compiler built-in, so it might be funnier than just a normal function call.  This is from a trunk compiler from 20151201.

So, I was wondering whether the problem has been fixed, whether we should put the below in now, or whether you would prefer to work up the changes to better track macro expansions?




diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 9e51b95..bea8423 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -455,6 +455,9 @@ layout::layout (diagnostic_context * context,
       if (loc_range->m_show_caret_p)
        if (loc_range->m_caret.file != m_exploc.file)
          continue;
+      /* A range that finishes before it starts is awkward.  */
+      if (loc_range->m_start.line > loc_range->m_finish.line)
+       continue;
 
       /* Passed all the tests; add the range to m_layout_ranges so that
         it will be printed.  */


* Re: [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2))
  2015-12-29 20:55       ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) Mike Stump
@ 2016-01-06 15:37         ` David Malcolm
  0 siblings, 0 replies; 83+ messages in thread
From: David Malcolm @ 2016-01-06 15:37 UTC (permalink / raw)
  To: Mike Stump; +Cc: GCC Patches

On Tue, 2015-12-29 at 12:53 -0800, Mike Stump wrote:
> On Sep 25, 2015, at 1:11 PM, David Malcolm <dmalcolm@redhat.com>
> wrote:
> > +layout::layout (diagnostic_context * context,
> >> +		const diagnostic_info *diagnostic)
> >> 
> >> [...]
> >> 
> >> +      if (loc_range->m_finish.file != m_exploc.file)
> >> +	continue;
> >> +      if (loc_range->m_show_caret_p)
> >> +	if (loc_range->m_finish.file != m_exploc.file)
> >> +	  continue;
> >> 
> >> I think the second if clause is redundant.
> > 
> > Good catch; thanks.
> 
> And one other nit.  You don’t validate that the range finishes on or
> after the start.  Later in the code, you require this to be the case:
> 
> bool
> layout_range::contains_point (int row, int column) const
> {
>   gcc_assert (m_start.m_line <= m_finish.m_line);
> 
> this code cannot tolerate a range with that property.  So, either,
> such a range should never be generated, or, if it is to be generated,
> at least we should declare it awkward.  The below patch declares it to
> be awkward.  Without this, we ICE on completely sane and normal code:
> 
>   #define max(i, j) sel((j), (i), (i) < (j))
>   yu = max(a2, a3);
> 
> giving a valid warning.  In the code, we start on the last line, and
> finish on the first line.  The underlying problem is that we don’t
> track macro contexts properly.  sel is a compiler built-in, so, it
> might be funnier than just a normal function call.  This is from a
> trunk compiler from 20151201.
> 
> So, I was wondering if the problem has been fixed, or, if we should
> put the below in now, or, would you prefer to try and do up the
> changes to better track macro expansions?

This is PR 68473.

I committed a workaround for it, similar to yours, as r231919 on
2015-12-22; I've been experimenting with a "deeper" fix for it that
would better respect macro expansions, but that might have to wait until
gcc 7.


> diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
> index 9e51b95..bea8423 100644
> --- a/gcc/diagnostic-show-locus.c
> +++ b/gcc/diagnostic-show-locus.c
> @@ -455,6 +455,9 @@ layout::layout (diagnostic_context * context,
>        if (loc_range->m_show_caret_p)
>         if (loc_range->m_caret.file != m_exploc.file)
>           continue;
> +      /* A range that finishes before it starts is awkward.  */
> +      if (loc_range->m_start.line > loc_range->m_finish.line)
> +       continue;
>  
>        /* Passed all the tests; add the range to m_layout_ranges so
> that
>          it will be printed.  */
> 



Thread overview: 83+ messages
2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
2015-09-25 17:22   ` Jeff Law
2015-09-27  1:29     ` Bernhard Reutner-Fischer
2015-09-22 21:10 ` [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
2015-09-28  8:23   ` Dodji Seketeli
2015-09-22 21:10 ` [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2) David Malcolm
2015-09-25  9:58   ` Dodji Seketeli
2015-09-25 14:53     ` David Malcolm
2015-09-25 16:15       ` Dodji Seketeli
2015-09-22 21:33 ` [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2) David Malcolm
2015-09-25  9:49   ` Dodji Seketeli
2015-09-25 12:34     ` Manuel López-Ibáñez
2015-09-25 16:21       ` Dodji Seketeli
2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
2015-09-25 20:42       ` Manuel López-Ibáñez
2015-09-25 21:14         ` Manuel López-Ibáñez
2015-09-25 22:10           ` Manuel López-Ibáñez
2015-09-26  4:51             ` David Malcolm
2015-09-26  6:18               ` Manuel López-Ibáñez
2015-09-25 22:40           ` David Malcolm
2015-09-26  6:41             ` Manuel López-Ibáñez
2015-09-27 14:19       ` Dodji Seketeli
2015-10-12 15:45         ` [PATCH] v4 of diagnostic_show_locus and rich_location David Malcolm
2015-10-12 16:37           ` Manuel López-Ibáñez
2015-10-13 18:09             ` David Malcolm
2015-12-29 20:55       ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) Mike Stump
2016-01-06 15:37         ` David Malcolm
2015-09-22 22:23 ` [PATCH 4/5] Implement tree expression tracking in C FE (v2) David Malcolm
2015-09-25 14:22   ` Dodji Seketeli
2015-09-25 15:04     ` David Malcolm
2015-09-25 16:36       ` Dodji Seketeli
2015-09-23 13:36 ` [PATCH 0/5] RFC: Overhaul of diagnostics (v2) Michael Matz
2015-09-23 13:43   ` Richard Biener
2015-09-23 13:53     ` Michael Matz
2015-09-23 15:51       ` Jeff Law
2015-09-24  2:39     ` David Malcolm
2015-09-24  9:03       ` Richard Biener
2015-09-25 16:50         ` Jeff Law
2015-10-13 15:33         ` Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)) David Malcolm
2015-10-14  9:00           ` Richard Biener
2015-10-14 12:49             ` Michael Matz
2015-10-16 15:57             ` David Malcolm
2015-10-19 14:59               ` Michael Matz
2015-10-22 15:05                 ` David Malcolm
2015-11-13 16:02             ` David Malcolm
2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
2015-10-30  8:01     ` Jeff Law
2015-11-02 19:14       ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
2015-11-02 19:53         ` David Malcolm
2015-11-02 22:26         ` Jeff Law
2015-11-06  7:12         ` Dodji Seketeli
2015-11-13 16:37         ` libcpp/C FE source range patch committed (r230331) David Malcolm
2015-10-23 20:24   ` [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite David Malcolm
2015-10-23 21:10     ` Jeff Law
2015-10-23 20:24   ` [PATCH 01/10] Improvements to description of source_location in line-map.h David Malcolm
2015-10-23 21:02     ` Jeff Law
2015-10-23 20:25   ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) David Malcolm
2015-10-27 23:12     ` Jeff Law
2015-10-28 17:52       ` David Malcolm
2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
2015-10-30  4:53           ` Jeff Law
2015-10-30 19:42             ` David Malcolm
2015-11-06 19:59             ` David Malcolm
2015-10-28 17:51         ` [PATCH 4a] diagnostic-show-locus.c changes: Deletions David Malcolm
2015-10-28 17:59         ` [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes David Malcolm
2015-10-30  4:49         ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) Jeff Law
2015-10-23 20:26   ` [PATCH 08/10] Wire things up so that libcpp users get token underlines David Malcolm
2015-10-30  6:15     ` Jeff Law
2015-10-23 20:26   ` [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized) David Malcolm
2015-10-27 21:29     ` Jeff Law
2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
2015-10-30  6:07     ` Jeff Law
2015-11-04 20:42     ` Dodji Seketeli
2015-10-23 20:26   ` [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics David Malcolm
2015-10-23 21:07     ` Jeff Law
2015-10-23 20:26   ` [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
2015-10-27 21:32     ` Jeff Law
2015-10-23 20:29   ` [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges David Malcolm
2015-10-27 22:15     ` Jeff Law
2015-10-23 21:25   ` [PATCH 00/10] Overhaul of diagnostics (v5) Jeff Law
2015-10-23 21:25     ` David Malcolm
