public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
	       GCC Patches	 <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs
Date: Thu, 17 Sep 2015 16:54:00 -0000	[thread overview]
Message-ID: <1442508556.25945.26.camel@surprise> (raw)
In-Reply-To: <1442434912.32352.119.camel@surprise>

On Wed, 2015-09-16 at 16:21 -0400, David Malcolm wrote:
> On Tue, 2015-09-15 at 12:48 +0200, Jakub Jelinek wrote:
> > On Tue, Sep 15, 2015 at 12:33:58PM +0200, Richard Biener wrote:
> > > On Tue, Sep 15, 2015 at 12:20 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> > > > On Tue, Sep 15, 2015 at 12:14:22PM +0200, Richard Biener wrote:
> > > >> > diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
> > > >> > index 760467c..c7558a0 100644
> > > >> > --- a/gcc/cp/parser.h
> > > >> > +++ b/gcc/cp/parser.h
> > > >> > @@ -61,6 +61,8 @@ struct GTY (()) cp_token {
> > > >> >    BOOL_BITFIELD purged_p : 1;
> > > >> >    /* The location at which this token was found.  */
> > > >> >    location_t location;
> > > >> > +  /* The source range at which this token was found.  */
> > > >> > +  source_range range;
> > > >>
> > > >> Is it just me or does location now feel somewhat redundant with range?  Can't we
> > > >> compress that somehow?
> > > >
> > > > For a token I'd expect it is redundant, I don't see how it would be useful
> > > > for a single preprocessing token to have more than start and end locations.
> > > > But generally, for expressions, 3 locations make sense.
> > > > If you have
> > > > abc + def
> > > > ~~~~^~~~~
> > > > then having a range is useful.  In any case, I'm surprised that the ranges aren't encoded in
> > > > location_t (the data structures behind it, where we already stick also
> > > > BLOCK pointer).
> > > 
> > > Probably lack of encoding space ... I suppose upping location_t to
> > > 64bits coud solve
> > > some of that (with its own drawback on increasing size of core structures).
> > 
> > What I had in mind was just add
> >   source_location start, end;
> > to location_adhoc_data struct and use !IS_ADHOC_LOC locations to represent
> > just plain locations without block and without range (including the cases
> > where the range has both start and end equal to the locus) and IS_ADHOC_LOC
> > locations for the cases where either we have non-NULL block, or we have
> > some other range, or both.  But I haven't spent any time on that, so just
> > wondering if such an encoding has been considered.
> 
> I've been attempting to implement that.
> 
> Am I right in thinking that the ad-hoc locations never get purged? i.e.
> that once we've registered an ad-hoc location, then is that slot within
> location_adhoc_data_map is forever associated with that (locus, block)
> pair?  [or in the proposed model, the (locus, src_range, block)
> triple?].
> 
> If so, it may make more sense to put the ranges into ad-hoc locations,
> but only *after tokenization*: in this approach, the src_range would be
> a field within the tokens (like in patch 07/22), in the hope that the
> tokens are short-lived  (which AIUI is the case for libcpp and C, though
> not for C++), presumably also killing the "location" field within
> tokens.  We then stuff the range into the location_t when building trees
> (maybe putting a src_range into c_expr to further delay populating
> location_adhoc_data_map).
> 
> That way we avoid bloating the location_adhoc_data_map during lexing,
> whilst preserving the range information, and we can stuff the ranges
> into the 32-bit location_t within tree/gimple etc (albeit paying a cost
> within the location_adhoc_data_map).
> 
> Thoughts?  Hope this sounds sane.
> Dave

FWIW, I have a (very messy) implementation of this working for the C
frontend, which gives us source ranges on expressions without needing to
add any new tree nodes, or add any fields to existing tree structs.

The approach I'm using:

* ranges are stored directly as fields within cpp_token and c_token
(maybe we can ignore cp_token for now)

* ranges are stashed in the C FE, both (a) within the "struct c_expr"
and (b) within the location_t of each tree expression node as a new
field in the adhoc map.

Doing it twice may seem slightly redundant, but I think both are needed:
  (a) doing it within c_expr allows us to support ranges for constants
and VAR_DECL etc during parsing, without needing any kind of new tree
wrapper node
  (b) doing it in the ad-hoc map allows the ranges for expressions to
survive the parse and be usable in diagnostics later.

So this gives us (in the C FE): ranges for everything during parsing,
and ranges for expressions afterwards, with no new tree nodes or new
fields within tree nodes.

I'm working on cleaning it up into a much more minimal set of patches
that I hope are reviewable.

Hopefully this sounds like a good approach

I've also been looking at ways to mitigate bloat of the ad-hoc map, by
using some extra bits of location_t for representing short ranges
directly.

Dave

  reply	other threads:[~2015-09-17 16:49 UTC|newest]

Thread overview: 133+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-10 20:12 [PATCH 00/22] RFC: Overhaul of diagnostics David Malcolm
2015-09-10 20:12 ` [PATCH 01/22] Change of location_get_source_line signature David Malcolm
2015-09-14 19:28   ` Jeff Law
2015-09-15 17:02     ` David Malcolm
2015-09-10 20:13 ` [PATCH 13/22] gcc-rich-location.[ch]: add methods for working with tree ranges David Malcolm
2015-09-10 20:13 ` [PATCH 03/22] Move diagnostic_show_locus and friends out into a new source file David Malcolm
2015-09-14 19:37   ` Jeff Law
2015-09-18 18:31     ` David Malcolm
2015-09-10 20:13 ` [PATCH 09/22] C frontend: store and use token ranges in c_declspecs David Malcolm
2015-09-10 20:13 ` [PATCH 20/22] Use rich locations in c-family/c-format.c David Malcolm
2015-09-10 20:13 ` [PATCH 11/22] Objective C: c/c-parser.c: use token ranges in two places David Malcolm
2015-09-10 20:13 ` [PATCH 06/22] PR/62314: add ability to add fixit-hints David Malcolm
2015-09-10 20:13 ` [PATCH 08/22] C frontend: use token ranges in various diagnostics David Malcolm
2015-09-10 20:13 ` [PATCH 10/22] C++ FE: Use token ranges for " David Malcolm
2015-09-10 20:13 ` [PATCH 02/22] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
2015-09-14 19:35   ` Jeff Law
2015-09-14 22:17     ` Bernhard Reutner-Fischer
2015-09-14 22:45       ` Jeff Law
2015-09-15 17:53         ` dejagnu version update? Mike Stump
2015-09-15 19:23           ` David Malcolm
2015-09-15 20:29             ` Jeff Law
2015-09-15 21:15               ` Bernhard Reutner-Fischer
2017-05-13 10:38                 ` Bernhard Reutner-Fischer
2017-05-13 11:06                   ` Jakub Jelinek
2017-05-13 21:12                     ` Jeff Law
2017-05-14 23:10                       ` NightStrike
2017-05-15  8:14                         ` Richard Biener
2017-05-15 19:24                           ` Mike Stump
2017-05-15 20:52                             ` Andreas Schwab
2017-05-16  9:56                     ` Jonathan Wakely
2017-05-16 12:16                       ` Bernhard Reutner-Fischer
2017-05-16 12:35                         ` Jonathan Wakely
2017-05-16 12:55                           ` Bernhard Reutner-Fischer
2017-05-16 18:41                             ` Matthias Klose
2017-05-16 19:09                           ` Mike Stump
2018-08-04 16:32                             ` Bernhard Reutner-Fischer
2018-08-06 14:33                               ` Jonathan Wakely
2018-08-06 15:26                               ` Mike Stump
2018-08-07 16:34                                 ` Segher Boessenkool
2018-08-08 11:18                                   ` Bernhard Reutner-Fischer
2018-08-08 13:35                                     ` Richard Earnshaw (lists)
2018-08-08 14:37                                     ` Michael Matz
2018-08-08 16:45                                     ` Segher Boessenkool
2021-10-27 23:00                               ` Bernhard Reutner-Fischer
2021-10-28 19:11                                 ` Jeff Law
2021-10-29  0:41                                   ` [PATCH] Bump required minimum DejaGnu version to 1.5.3 Bernhard Reutner-Fischer
2021-10-29  7:32                                     ` Richard Biener
2021-11-04 11:55                                       ` Segher Boessenkool
2021-11-04 12:22                                         ` Martin Liška
2021-11-04 19:09                                           ` Segher Boessenkool
2021-11-05  9:33                                             ` Richard Biener
2021-11-05 11:39                                               ` Jonathan Wakely
2021-11-04 12:41                                         ` Richard Biener
2021-11-04 13:50                                           ` Jonathan Wakely
2015-09-15 19:53           ` dejagnu version update? Bernhard Reutner-Fischer
2015-09-15 20:05             ` Jeff Law
2015-09-15 23:12               ` Mike Stump
2015-09-16  7:41                 ` Andreas Schwab
2015-09-16 16:19                   ` Mike Stump
2015-09-16 16:32                     ` Ramana Radhakrishnan
2015-09-16 16:39                       ` Jeff Law
2015-09-16 17:26                         ` Trevor Saunders
2015-09-16 17:46                         ` David Malcolm
2015-09-16 19:09                           ` Bernhard Reutner-Fischer
2015-09-16 19:51                             ` Mike Stump
2015-09-17  0:07                           ` Segher Boessenkool
2015-09-17 13:57                         ` Richard Earnshaw
2015-09-16 18:04                       ` Mike Stump
2015-09-16 18:58                         ` Bernhard Reutner-Fischer
2015-09-16 19:37                         ` Ramana Radhakrishnan
2015-09-16 13:17             ` Matthias Klose
2015-09-16 15:46               ` Bernhard Reutner-Fischer
2015-09-10 20:28 ` [PATCH 07/22] Implement token range tracking within libcpp and C/C++ FEs David Malcolm
2015-09-11 14:08   ` Michael Matz
2015-09-14 19:41     ` Jeff Law
2015-09-15 10:20   ` Richard Biener
2015-09-15 10:28     ` Jakub Jelinek
2015-09-15 10:48       ` Richard Biener
2015-09-15 11:01         ` Jakub Jelinek
2015-09-16 20:29           ` David Malcolm
2015-09-17 16:54             ` David Malcolm [this message]
2015-09-17 19:15               ` Jeff Law
2015-09-17 20:06                 ` David Malcolm
2015-09-17 19:25         ` Jeff Law
2015-09-15 12:09       ` Manuel López-Ibáñez
2015-09-15 12:18         ` Richard Biener
2015-09-15 12:57           ` Manuel López-Ibáñez
2015-09-17 19:11             ` Jeff Law
2015-09-17 19:13           ` Jeff Law
2015-09-15 13:53       ` David Malcolm
2015-09-10 20:29 ` [PATCH 05/22] Add overloads of inform, warning_at, etc that take a source_range David Malcolm
2015-09-10 20:29 ` [PATCH 04/22] Reimplement diagnostic_show_locus, introducing rich_location classes David Malcolm
2015-09-11 13:44   ` Michael Matz
2015-09-11 14:12   ` Michael Matz
2015-09-11 15:15     ` David Malcolm
2015-09-10 20:30 ` [PATCH 14/22] C: capture tree ranges for various expressions David Malcolm
2015-09-10 20:30 ` [PATCH 12/22] Add source-ranges for trees David Malcolm
2015-09-10 20:30 ` [PATCH 15/22] Add plugin to recursively dump the source-ranges in a tree David Malcolm
2015-09-11  3:19   ` Martin Sebor
2015-09-10 20:31 ` [PATCH 18/22] Track locations within string literals in tree_string David Malcolm
2015-09-10 20:31 ` [PATCH 19/22] gcc-rich-location.[ch]: add debug methods for cpp_string_location David Malcolm
2015-09-10 20:32 ` [PATCH 21/22] Use Levenshtein distance for various misspellings in C frontend David Malcolm
2015-09-10 21:11   ` Andi Kleen
2015-09-11 15:31   ` Manuel López-Ibáñez
2015-09-15 15:25     ` [PATCH WIP] Use Levenshtein distance for various misspellings in C frontend v2 David Malcolm
2015-09-15 16:25       ` Manuel López-Ibáñez
2015-09-16  8:45       ` Richard Biener
2015-09-16 13:33         ` Michael Matz
2015-09-16 14:00           ` Richard Biener
2015-09-16 15:49             ` Manuel López-Ibáñez
2015-09-17  8:46               ` Richard Biener
2015-09-17 19:32         ` Jeff Law
2015-09-17 20:05           ` David Malcolm
2015-09-17 20:52             ` Manuel López-Ibáñez
2015-10-30 12:30           ` [PATCH 0/2] Levenshtein-based suggestions (v3) David Malcolm
2015-10-30 12:30             ` [PATCH 2/2] C FE: suggest corrections for misspelled field names David Malcolm
2015-10-30 12:36             ` [PATCH 1/2] Implement Levenshtein distance David Malcolm
2015-11-02 10:56               ` Mikael Morin
2015-11-02  6:44             ` [PATCH 0/2] Levenshtein-based suggestions (v3) Jeff Law
2015-11-13  2:08               ` David Malcolm
2015-11-13  6:57                 ` Marek Polacek
2015-11-13 12:16                   ` David Malcolm
2015-11-13 15:11                     ` Marek Polacek
2015-11-13 15:44                       ` Bernd Schmidt
2015-11-13 15:53                         ` Marek Polacek
2015-11-13 15:56                           ` Jakub Jelinek
2015-11-13 16:02                             ` Marek Polacek
2015-09-10 20:32 ` [PATCH 16/22] C/C++ frontend: use tree ranges in various diagnostics David Malcolm
2015-09-10 20:32 ` [PATCH 17/22] libcpp: add location tracking within string literals David Malcolm
2015-09-10 20:50 ` [PATCH 22/22] Add fixit hints to spellchecker suggestions David Malcolm
2015-09-14 17:49 ` [PATCH 00/22] RFC: Overhaul of diagnostics Bernd Schmidt
2015-09-14 19:44   ` Jeff Law
2015-09-15  1:11     ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1442508556.25945.26.camel@surprise \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).