public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Jeff Law <law@redhat.com>
Cc: gcc-patches@gcc.gnu.org,
	Richard Biener <richard.guenther@gmail.com>,
	       Dodji Seketeli <dodji@redhat.com>
Subject: Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend)
Date: Mon, 02 Nov 2015 19:14:00 -0000	[thread overview]
Message-ID: <1446491668.2994.84.camel@surprise> (raw)
In-Reply-To: <56330AEF.7080202@redhat.com>

On Fri, 2015-10-30 at 00:15 -0600, Jeff Law wrote:
> On 10/23/2015 02:41 PM, David Malcolm wrote:
> > As in the previous version of this patch
> >   "Implement tree expression tracking in C FE (v2)"
> > the patch now captures ranges for all C expressions during parsing within
> > a new field of c_expr, and for all tree nodes with a location_t, it stores
> > them in ad-hoc locations for later use.
> >
> > Hence compound expressions get ranges; see:
> >    https://dmalcolm.fedorapeople.org/gcc/2015-09-22/diagnostic-test-expressions-1.html
> >
> > and for this example:
> >
> >    int test (int foo)
> >    {
> >      return foo * 100;
> >             ^^^   ^^^
> >    }
> >
> > we have access to the ranges of "foo" and "100" during C parsing via
> > the c_expr, but once we have GENERIC, all we have is a VAR_DECL and an
> > INTEGER_CST (the former's location is in at the top of the
> > function, and the latter has no location).
> >
> > gcc/ChangeLog:
> > 	* Makefile.in (OBJS): Add gcc-rich-location.o.
> > 	* gcc-rich-location.c: New file.
> > 	* gcc-rich-location.h: New file.
> > 	* print-tree.c (print_node): Print any source range information.
> > 	* tree.c (set_source_range): New functions.
> > 	* tree.h (CAN_HAVE_RANGE_P): New.
> > 	(EXPR_LOCATION_RANGE): New.
> > 	(EXPR_HAS_RANGE): New.
> > 	(get_expr_source_range): New inline function.
> > 	(DECL_LOCATION_RANGE): New.
> > 	(set_source_range): New decls.
> > 	(get_decl_source_range): New inline function.
> >
> > gcc/c-family/ChangeLog:
> > 	* c-common.c (c_fully_fold_internal): Capture existing souce_range,
> > 	and store it on the result.
> >
> > gcc/c/ChangeLog:
> > 	* c-parser.c (set_c_expr_source_range): New functions.
> > 	(c_token::get_range): New method.
> > 	(c_token::get_finish): New method.
> > 	(c_parser_expr_no_commas): Call set_c_expr_source_range on the ret
> > 	based on the range from the start of the LHS to the end of the
> > 	RHS.
> > 	(c_parser_conditional_expression): Likewise, based on the range
> > 	from the start of the cond.value to the end of exp2.value.
> > 	(c_parser_binary_expression): Call set_c_expr_source_range on
> > 	the stack values for TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR.
> > 	(c_parser_cast_expression): Call set_c_expr_source_range on ret
> > 	based on the cast_loc through to the end of the expr.
> > 	(c_parser_unary_expression): Likewise, based on the
> > 	op_loc through to the end of op.
> > 	(c_parser_sizeof_expression) Likewise, based on the start of the
> > 	sizeof token through to either the closing paren or the end of
> > 	expr.
> > 	(c_parser_postfix_expression): Likewise, using the token range,
> > 	or from the open paren through to the close paren for
> > 	parenthesized expressions.
> > 	(c_parser_postfix_expression_after_primary): Likewise, for
> > 	various kinds of expression.
> > 	* c-tree.h (struct c_expr): Add field "src_range".
> > 	(c_expr::get_start): New method.
> > 	(c_expr::get_finish): New method.
> > 	(set_c_expr_source_range): New decls.
> > 	* c-typeck.c (parser_build_unary_op): Call set_c_expr_source_range
> > 	on ret for prefix unary ops.
> > 	(parser_build_binary_op): Likewise, running from the start of
> > 	arg1.value through to the end of arg2.value.
> >
> > gcc/testsuite/ChangeLog:
> > 	* gcc.dg/plugin/diagnostic-test-expressions-1.c: New file.
> > 	* gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c:
> > 	New file.
> > 	* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
> > 	diagnostic_plugin_test_tree_expression_range.c and
> > 	diagnostic-test-expressions-1.c.
> 
> >   /* Initialization routine for this file.  */
> >
> > @@ -6085,6 +6112,9 @@ c_parser_expr_no_commas (c_parser *parser, struct c_expr *after,
> >     ret.value = build_modify_expr (op_location, lhs.value, lhs.original_type,
> >   				 code, exp_location, rhs.value,
> >   				 rhs.original_type);
> > +  set_c_expr_source_range (&ret,
> > +			   lhs.get_start (),
> > +			   rhs.get_finish ());
> One line if it fits.
> 
> 
> > @@ -6198,6 +6232,9 @@ c_parser_conditional_expression (c_parser *parser, struct c_expr *after,
> >   			   ? t1
> >   			   : NULL);
> >       }
> > +  set_c_expr_source_range (&ret,
> > +			   start,
> > +			   exp2.get_finish ());
> Here too.
> 
> > @@ -6522,6 +6564,10 @@ c_parser_cast_expression (c_parser *parser, struct c_expr *after)
> >   	expr = convert_lvalue_to_rvalue (expr_loc, expr, true, true);
> >         }
> >         ret.value = c_cast_expr (cast_loc, type_name, expr.value);
> > +      if (ret.value && expr.value)
> > +	set_c_expr_source_range (&ret,
> > +				 cast_loc,
> > +				 expr.get_finish ());
> And here?
> 
> With the nits fixed, this is OK.
> 
> I think that covers this iteration of the rich location work and that 
> you'll continue working with Jason on extending this into the C++ front-end.

Here's a summary of the current status of this work [1]:

Patches 1-4 of the kit: these Jeff has approved, with some pre-approved
nit fixes in 4.  I see these as relatively low risk, and plan to commit
these today/tomorrow.

Patches 5-10: Jeff approved these also (again with some nits). These
feel higher-risk to me, owing to the potential for performance
regressions; I haven't yet answered at least one of Richi's performance
questions (impact on time taken to generate the C++ PCH file); the last
performance testing I did can be seen here:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html
where the right-most column is this kit.

CCing Richi to keep him in the loop for the above.  Richi, is there any
other specific testing you'd want me to do for this?
Or is it OK to commit, and to see what impact it has on your daily
performance testing?  (and to revert if the impact is unacceptable).

Talking about risks: the reduction of the space for ordinary maps by a
factor of 32, by taking up 5 bits for the packed range information
optimization (patch 10):
 https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02539.html
CCing Dodji: Dodji; is this reasonable?

I did some experiments back in July on this; instrumented builds of
openldap and openssl to see how much space we have in location_t:
https://dmalcolm.fedorapeople.org/gcc/2015-07-23/openldap.csv
https://dmalcolm.fedorapeople.org/gcc/2015-07-24/openldap.csv

(these are space-separated:
           SRPM name
           sourcefile
           maximal ordinary location_t
           minimal macro location_t)

openldap build #files: 906
maximal ordinary location_t was:
sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/bconfig.c'
          max_ordinary_location=0x0081bd1b
          (and min_macro_location=0x7ffe5903
minimal macro location_t was:
sourcefile='/builddir/build/BUILD/openldap-2.4.40/openldap-2.4.40/servers/slapd/aclparse.c'
          min_macro_location=0x7ffe57e2
          (with max_ordinary_location=0x00719775)

openssl-1.0.1k-8.fc22.src.rpm.x86_64:
      #files: 1495
max_ordinary_location=0x00be3726
 (openssl-1.0.1k/apps/s_client.c)
 with min_macro_location=0x7ffe7b6b

min_macro_location=0x7ffdf069 
 (openssl-1.0.1k/apps/speed.c)
 with max_ordinary_location=0x00a1abdf

In all of the above cases, we had enough room to do the bit-packing
optimization, but this is just two projects (albeit real-world C code).

Comparing the gap between maximal ordinary map location and minimal
macro map location, and seeing how much we can expand the ordinary map
locations, the openldap build had:
  (0x7ffe57e2 - 0x0081bd1b) / 0x0081bd1b  == factor of 251 i.e.
7 bits of space available

openssl build had:
  (0x7ffdf069 - 0x00be3726) / 0x00be3726  == factor of 171 i.e. 7 bits
of space available

hence allocating 5 bits to packing ranges is (I hope) reasonable.


Jeff: I'm working on expression ranges in the C++ FE; is that a
prerequisite for patches 5-10, or can 5-10 go ahead without the C++
work?  (assuming the other issues above are acceptable).

Hope this all makes sense and sounds sane
Dave

[1] Together the kit gives us:
* patch 4: infrastructure for printing underlines in
diagnostic_show_locus and for multiple ranges
* patches 5-10: the "source_location" (aka location_t) type becomes a
caret plus a range; the tokens coming from libcpp gain ranges, so
everything using libcpp gains at least underlines of tokens; the C
frontend generates sane ranges for expressions as it parses them, better
showing the user how the parser "sees" their code.

Hence we ought to get underlined ranges for many diagnostics in C and C
++ with this (e.g. input_location gives an underline covering the range
of the token starting at the caret).  The "caret" should remain
unchanged from the status quo, so e.g. debugging locations shouldn't be
affected by the addition of ranges.

I'm anticipating that we'd need some followup patches to pick better
ranges for some diagnostics, analogous to the way we convert "warning"
to "warning_at" for where input_location isn't the best location; I'd
expect these followup patches to be relative simple and low-risk.


  reply	other threads:[~2015-11-02 19:14 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-22 21:09 [PATCH 0/5] RFC: Overhaul of diagnostics (v2) David Malcolm
2015-09-22 21:09 ` [PATCH 1/5] Testsuite: add dg-{begin|end}-multiline-output commands David Malcolm
2015-09-25 17:22   ` Jeff Law
2015-09-27  1:29     ` Bernhard Reutner-Fischer
2015-09-22 21:10 ` [PATCH 3/5] Implement token range tracking within libcpp and the C FE (v2) David Malcolm
2015-09-25  9:58   ` Dodji Seketeli
2015-09-25 14:53     ` David Malcolm
2015-09-25 16:15       ` Dodji Seketeli
2015-09-22 21:10 ` [PATCH 5/5] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
2015-09-28  8:23   ` Dodji Seketeli
2015-09-22 21:33 ` [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2) David Malcolm
2015-09-25  9:49   ` Dodji Seketeli
2015-09-25 12:34     ` Manuel López-Ibáñez
2015-09-25 16:21       ` Dodji Seketeli
2015-09-25 20:39     ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) David Malcolm
2015-09-25 20:42       ` Manuel López-Ibáñez
2015-09-25 21:14         ` Manuel López-Ibáñez
2015-09-25 22:10           ` Manuel López-Ibáñez
2015-09-26  4:51             ` David Malcolm
2015-09-26  6:18               ` Manuel López-Ibáñez
2015-09-25 22:40           ` David Malcolm
2015-09-26  6:41             ` Manuel López-Ibáñez
2015-09-27 14:19       ` Dodji Seketeli
2015-10-12 15:45         ` [PATCH] v4 of diagnostic_show_locus and rich_location David Malcolm
2015-10-12 16:37           ` Manuel López-Ibáñez
2015-10-13 18:09             ` David Malcolm
2015-12-29 20:55       ` [PATCH] v3 of diagnostic_show_locus and rich_location (was Re: [PATCH 2/5] Reimplement diagnostic_show_locus, introducing rich_location classes (v2)) Mike Stump
2016-01-06 15:37         ` David Malcolm
2015-09-22 22:23 ` [PATCH 4/5] Implement tree expression tracking in C FE (v2) David Malcolm
2015-09-25 14:22   ` Dodji Seketeli
2015-09-25 15:04     ` David Malcolm
2015-09-25 16:36       ` Dodji Seketeli
2015-09-23 13:36 ` [PATCH 0/5] RFC: Overhaul of diagnostics (v2) Michael Matz
2015-09-23 13:43   ` Richard Biener
2015-09-23 13:53     ` Michael Matz
2015-09-23 15:51       ` Jeff Law
2015-09-24  2:39     ` David Malcolm
2015-09-24  9:03       ` Richard Biener
2015-09-25 16:50         ` Jeff Law
2015-10-13 15:33         ` Benchmarks of v2 (was Re: [PATCH 0/5] RFC: Overhaul of diagnostics (v2)) David Malcolm
2015-10-14  9:00           ` Richard Biener
2015-10-14 12:49             ` Michael Matz
2015-10-16 15:57             ` David Malcolm
2015-10-19 14:59               ` Michael Matz
2015-10-22 15:05                 ` David Malcolm
2015-11-13 16:02             ` David Malcolm
2015-10-23 20:25 ` [PATCH 00/10] Overhaul of diagnostics (v5) David Malcolm
2015-10-23 20:24   ` [PATCH 01/10] Improvements to description of source_location in line-map.h David Malcolm
2015-10-23 21:02     ` Jeff Law
2015-10-23 20:24   ` [PATCH 06/10] Track expression ranges in C frontend David Malcolm
2015-10-30  8:01     ` Jeff Law
2015-11-02 19:14       ` David Malcolm [this message]
2015-11-02 19:53         ` Status of rich location work (was Re: [PATCH 06/10] Track expression ranges in C frontend) David Malcolm
2015-11-02 22:26         ` Jeff Law
2015-11-06  7:12         ` Dodji Seketeli
2015-11-13 16:37         ` libcpp/C FE source range patch committed (r230331) David Malcolm
2015-10-23 20:24   ` [PATCH 03/10] libstdc++v3: Explicitly disable carets and colorization within testsuite David Malcolm
2015-10-23 21:10     ` Jeff Law
2015-10-23 20:25   ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) David Malcolm
2015-10-27 23:12     ` Jeff Law
2015-10-28 17:52       ` David Malcolm
2015-10-28 17:51         ` [PATCH 4b] diagnostic-show-locus.c changes: Insertions David Malcolm
2015-10-30  4:53           ` Jeff Law
2015-10-30 19:42             ` David Malcolm
2015-11-06 19:59             ` David Malcolm
2015-10-28 17:51         ` [PATCH 4a] diagnostic-show-locus.c changes: Deletions David Malcolm
2015-10-28 17:59         ` [PATCH 4c] Other changes: everything apart from diagnostic-show-locus.c changes David Malcolm
2015-10-30  4:49         ` [PATCH 04/10] Reimplement diagnostic_show_locus, introducing rich_location classes (v5) Jeff Law
2015-10-23 20:26   ` [PATCH 07/10] Add plugin to recursively dump the source-ranges in a tree (v2) David Malcolm
2015-10-27 21:32     ` Jeff Law
2015-10-23 20:26   ` [PATCH 05/10] Add ranges to libcpp tokens (via ad-hoc data, unoptimized) David Malcolm
2015-10-27 21:29     ` Jeff Law
2015-10-23 20:26   ` [PATCH 02/10] Add stats on adhoc table to dump_line_table_statistics David Malcolm
2015-10-23 21:07     ` Jeff Law
2015-10-23 20:26   ` [PATCH 10/10] Compress short ranges into source_location David Malcolm
2015-10-30  6:07     ` Jeff Law
2015-11-04 20:42     ` Dodji Seketeli
2015-10-23 20:26   ` [PATCH 08/10] Wire things up so that libcpp users get token underlines David Malcolm
2015-10-30  6:15     ` Jeff Law
2015-10-23 20:29   ` [PATCH 09/10] Delay some resolution of ad-hoc locations, preserving ranges David Malcolm
2015-10-27 22:15     ` Jeff Law
2015-10-23 21:25   ` [PATCH 00/10] Overhaul of diagnostics (v5) Jeff Law
2015-10-23 21:25     ` David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1446491668.2994.84.camel@surprise \
    --to=dmalcolm@redhat.com \
    --cc=dodji@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).