public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Help with rich_location and GIMPLE stmts
@ 2017-05-15 13:36 Martin Liška
  2017-05-16 19:14 ` David Malcolm
  0 siblings, 1 reply; 5+ messages in thread
From: Martin Liška @ 2017-05-15 13:36 UTC (permalink / raw)
  To: GCC Development; +Cc: David Malcolm

Hi.

I sent this email to David some time ago, but it should be probably answered
on gcc mailing list.

I have idea one to improve gcov tool and I'm interested in more precise locations for gimple
statements. For gcov purpose, we dump location in ipa-profile pass, which is an early IPA
pass and this data is used by gcov tool to map statements (blocks) to lines of code.

I did a small experiment on the place we emit the location data:
		  inform (gimple_location (stmt), "output_location");

and it shows for:
$ cat m2.c
unsigned int
UuT (void)
{ unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }

int
main (int argc, char **argv)
{
  UuT ();
  return 0;
}

$ gcc --coverage m2.c
m2.c: In function ‘main’:
m2.c:8:3: note: output_location
   UuT ();
   ^~~~~~
# .MEM_2 = VDEF <.MEM_1(D)>
UuT ();
m2.c:9:10: note: output_location
   return 0;
          ^
_3 = 0;
m2.c: In function ‘UuT’:
m2.c:3:16: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                ^~~~~~~~
true_var_3 = 1;
m2.c:3:43: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                           ^~~~~~~~~
false_var_4 = 0;
m2.c:3:71: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                       ^~~
ret_5 = 0;
m2.c:3:83: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                                   ^
if (true_var_3 != 0)
m2.c:3:114: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                                                                  ^
if (false_var_4 != 0)
m2.c:3:145: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                                                                                             ~~~~^~~~~
ret_7 = 111;
m2.c:3:182: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                                                                                                                                  ~~~~^~~~~
ret_6 = 999;
m2.c:3:215: note: output_location
 { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */ return ret; }
                                                                                                                                                                                                                       ^~~
_8 = ret_2;
m2.c:3:215: note: output_location
# VUSE <.MEM_9(D)>
return _8;

Which is not optimal, for some assignments I see just LHS (false_var_4 = 0), for return statements only a returned value is displayed. For conditions, only condition beginning is showed.
Is this known behavior or do I miss something?

Thanks,
Martin

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help with rich_location and GIMPLE stmts
  2017-05-15 13:36 Help with rich_location and GIMPLE stmts Martin Liška
@ 2017-05-16 19:14 ` David Malcolm
  2017-05-18 11:22   ` Martin Liška
  0 siblings, 1 reply; 5+ messages in thread
From: David Malcolm @ 2017-05-16 19:14 UTC (permalink / raw)
  To: Martin Liška, GCC Development

On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
> Hi.
> 
> I sent this email to David some time ago, but it should be probably
> answered
> on gcc mailing list.

> I have idea one to improve gcov tool and I'm interested in more
> precise locations for gimple
> statements. For gcov purpose, we dump location in ipa-profile pass,
> which is an early IPA
> pass and this data is used by gcov tool to map statements (blocks) to
> lines of code.
> 
> I did a small experiment on the place we emit the location data:
> 		  inform (gimple_location (stmt), "output_location");
> 
> and it shows for:
> $ cat m2.c
> unsigned int
> UuT (void)
> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
> return ret; }
> 
> int
> main (int argc, char **argv)
> {
>   UuT ();
>   return 0;
> }
> 
> $ gcc --coverage m2.c
> m2.c: In function ‘main’:
> m2.c:8:3: note: output_location
>    UuT ();
>    ^~~~~~
> # .MEM_2 = VDEF <.MEM_1(D)>
> UuT ();
> m2.c:9:10: note: output_location
>    return 0;
>           ^
> _3 = 0;
> m2.c: In function ‘UuT’:
> m2.c:3:16: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                 ^~~~~~~~
> true_var_3 = 1;
> m2.c:3:43: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                            ^~~~~~~~~
> false_var_4 = 0;
> m2.c:3:71: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>   ^~~
> ret_5 = 0;
> m2.c:3:83: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>               ^
> if (true_var_3 != 0)
> m2.c:3:114: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>                                              ^
> if (false_var_4 != 0)
> m2.c:3:145: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>                                                                      
>    ~~~~^~~~~
> ret_7 = 111;
> m2.c:3:182: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>                                                                      
>                                         ~~~~^~~~~
> ret_6 = 999;
> m2.c:3:215: note: output_location
>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> count(#####) */ return ret; }
>                                                                      
>                                                                      
>                                                                      
>         ^~~
> _8 = ret_2;
> m2.c:3:215: note: output_location
> # VUSE <.MEM_9(D)>
> return _8;
> 
> Which is not optimal, for some assignments I see just LHS 
> (false_var_4 = 0),

My first though was: are there assignments for which this isn't the
case?  The only one I see is the:
  ret = 999;
  ~~~~^~~~~

Are the locations for these assignments coming through from the
frontend?

I believe that in the C frontend these are assignment-expression, and
hence handled by c_parser_expr_no_commas; in particular the use of
op_location and the call:

  set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());

ought to be setting up the caret of the assignment to be on the
operator token, and for the start/finish to range from the start of the
lhs to the end of the rhs i.e. what we see for:

  ret = 999;
  ~~~~^~~~~


> for return statements only a returned value is displayed. 

Is this running on SSA form?  If so, I wonder if you're running into
something like this:

  retval_N = PHI <lots of values>;
  return retval_N;

where it's using the location of that "return retval_N;" for all of the
return statements in the function, rather than the individual source
locations.

> For conditions, only condition beginning is showed.
> Is this known behavior or do I miss
> something?

c_parser_if_statement has:

  loc = c_parser_peek_token (parser)->location;

which is that of the open-paren.  Maybe we should build a location
covering the range of the "if ( expression )" part of the if-statement?

Note that the C++ frontend may do different things than the C frontend
for any and all of these.  When I added range-based locations to them,
I attempted to preserve the caret locations to avoid affecting stepping
behavior in the debugger (since the caret location is used when
stripping away range information); we could revisit some of the
decisions.

BTW, in the subject line, you mention a "rich_location", but I think
you mean a source_location (aka location_t): this is a caret location
along with range information; a rich_location is one or more of them,
potentially with fix-it hints.  See "Example B" and "Example C" in line
-map.h.  Admittedly the terminology there is a little muddled (sorry).


Dave

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help with rich_location and GIMPLE stmts
  2017-05-16 19:14 ` David Malcolm
@ 2017-05-18 11:22   ` Martin Liška
  2017-05-19 12:14     ` Marek Polacek
  0 siblings, 1 reply; 5+ messages in thread
From: Martin Liška @ 2017-05-18 11:22 UTC (permalink / raw)
  To: David Malcolm, GCC Development; +Cc: Marek Polacek

On 05/16/2017 09:14 PM, David Malcolm wrote:
> On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
>> Hi.
>>
>> I sent this email to David some time ago, but it should be probably
>> answered
>> on gcc mailing list.
> 
>> I have idea one to improve gcov tool and I'm interested in more
>> precise locations for gimple
>> statements. For gcov purpose, we dump location in ipa-profile pass,
>> which is an early IPA
>> pass and this data is used by gcov tool to map statements (blocks) to
>> lines of code.
>>
>> I did a small experiment on the place we emit the location data:
>> 		  inform (gimple_location (stmt), "output_location");
>>
>> and it shows for:
>> $ cat m2.c
>> unsigned int
>> UuT (void)
>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
>> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
>> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
>> return ret; }
>>
>> int
>> main (int argc, char **argv)
>> {
>>   UuT ();
>>   return 0;
>> }
>>
>> $ gcc --coverage m2.c
>> m2.c: In function ‘main’:
>> m2.c:8:3: note: output_location
>>    UuT ();
>>    ^~~~~~
>> # .MEM_2 = VDEF <.MEM_1(D)>
>> UuT ();
>> m2.c:9:10: note: output_location
>>    return 0;
>>           ^
>> _3 = 0;
>> m2.c: In function ‘UuT’:
>> m2.c:3:16: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                 ^~~~~~~~
>> true_var_3 = 1;
>> m2.c:3:43: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                            ^~~~~~~~~
>> false_var_4 = 0;
>> m2.c:3:71: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>   ^~~
>> ret_5 = 0;
>> m2.c:3:83: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>               ^
>> if (true_var_3 != 0)
>> m2.c:3:114: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>                                              ^
>> if (false_var_4 != 0)
>> m2.c:3:145: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>                                                                      
>>    ~~~~^~~~~
>> ret_7 = 111;
>> m2.c:3:182: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>                                                                      
>>                                         ~~~~^~~~~
>> ret_6 = 999;
>> m2.c:3:215: note: output_location
>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>> count(#####) */ return ret; }
>>                                                                      
>>                                                                      
>>                                                                      
>>         ^~~
>> _8 = ret_2;
>> m2.c:3:215: note: output_location
>> # VUSE <.MEM_9(D)>
>> return _8;
>>
>> Which is not optimal, for some assignments I see just LHS 
>> (false_var_4 = 0),
> 
> My first though was: are there assignments for which this isn't the
> case?  The only one I see is the:
>   ret = 999;
>   ~~~~^~~~~
> 
> Are the locations for these assignments coming through from the
> frontend?

Hi.

Actually not all, the default assignments are created in gimplifier and 
location is assigned from DECL_EXPR:

(gdb) p debug_tree(*expr_p)
 <decl_expr 0x7ffff6988c80
    type <void_type 0x7ffff6878f18 void VOID
        align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18
        pointer_to_this <pointer_type 0x7ffff68800a8>>
    side-effects
    arg 0 <var_decl 0x7ffff7f9ae10 true_var
        type <integer_type 0x7ffff6878690 unsigned int public unsigned SI
            size <integer_cst 0x7ffff6860f18 constant 32>
            unit size <integer_cst 0x7ffff6860f30 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst 0x7ffff6860f00 4294967295>
            pointer_to_this <pointer_type 0x7ffff6885dc8>>
        used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
        align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff698b258 1>
        chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type 0x7ffff6878690 unsigned int>
            used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
            align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>>
    /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23>

That explains why only LHS of these assignments is selected.

> 
> I believe that in the C frontend these are assignment-expression, and
> hence handled by c_parser_expr_no_commas; in particular the use of
> op_location and the call:
> 
>   set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
> 
> ought to be setting up the caret of the assignment to be on the
> operator token, and for the start/finish to range from the start of the
> lhs to the end of the rhs i.e. what we see for:
> 
>   ret = 999;
>   ~~~~^~~~~

Yep, MODIFY_EXPRs created in FE go this way and it's fine.

> 
> 
>> for return statements only a returned value is displayed. 
> 
> Is this running on SSA form?  If so, I wonder if you're running into
> something like this:
> 
>   retval_N = PHI <lots of values>;
>   return retval_N;
> 
> where it's using the location of that "return retval_N;" for all of the
> return statements in the function, rather than the individual source
> locations.

Yep, but we properly assign each assignment to a SSA name that's going to
be merged in exit BB by PHI node:

_8 = ret_2;
/tmp/m2.c:7:8: note: output_location
 return ret; }
        ^~~

Here the location comes from c_finish_return function where location
comes from a value that's returned.

> 
>> For conditions, only condition beginning is showed.
>> Is this known behavior or do I miss
>> something?
> 
> c_parser_if_statement has:
> 
>   loc = c_parser_peek_token (parser)->location;
> 
> which is that of the open-paren.  Maybe we should build a location
> covering the range of the "if ( expression )" part of the if-statement?

Adding Marek as C FE maintainer to reply the question.

> 
> Note that the C++ frontend may do different things than the C frontend
> for any and all of these.  When I added range-based locations to them,
> I attempted to preserve the caret locations to avoid affecting stepping
> behavior in the debugger (since the caret location is used when
> stripping away range information); we could revisit some of the
> decisions.

Looks both behave the same.

> 
> BTW, in the subject line, you mention a "rich_location", but I think
> you mean a source_location (aka location_t): this is a caret location
> along with range information; a rich_location is one or more of them,
> potentially with fix-it hints.  See "Example B" and "Example C" in line
> -map.h.  Admittedly the terminology there is a little muddled (sorry).

Yep, I was confused by fact that I use inform function to display location_t locations,
and it really builds rich_location:

void
inform (location_t location, const char *gmsgid, ...)
{
  va_list ap;
  va_start (ap, gmsgid);
  rich_location richloc (line_table, location);
  diagnostic_impl (&richloc, -1, gmsgid, &ap, DK_NOTE);
  va_end (ap);
}

Thanks for help,
Martin

> 
> 
> Dave
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help with rich_location and GIMPLE stmts
  2017-05-18 11:22   ` Martin Liška
@ 2017-05-19 12:14     ` Marek Polacek
  2017-05-19 14:30       ` Martin Liška
  0 siblings, 1 reply; 5+ messages in thread
From: Marek Polacek @ 2017-05-19 12:14 UTC (permalink / raw)
  To: Martin Liška; +Cc: David Malcolm, GCC Development

On Thu, May 18, 2017 at 01:22:02PM +0200, Martin Liška wrote:
> On 05/16/2017 09:14 PM, David Malcolm wrote:
> > On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
> >> Hi.
> >>
> >> I sent this email to David some time ago, but it should be probably
> >> answered
> >> on gcc mailing list.
> > 
> >> I have idea one to improve gcov tool and I'm interested in more
> >> precise locations for gimple
> >> statements. For gcov purpose, we dump location in ipa-profile pass,
> >> which is an early IPA
> >> pass and this data is used by gcov tool to map statements (blocks) to
> >> lines of code.
> >>
> >> I did a small experiment on the place we emit the location data:
> >> 		  inform (gimple_location (stmt), "output_location");
> >>
> >> and it shows for:
> >> $ cat m2.c
> >> unsigned int
> >> UuT (void)
> >> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
> >> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
> >> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
> >> return ret; }
> >>
> >> int
> >> main (int argc, char **argv)
> >> {
> >>   UuT ();
> >>   return 0;
> >> }
> >>
> >> $ gcc --coverage m2.c
> >> m2.c: In function ‘main’:
> >> m2.c:8:3: note: output_location
> >>    UuT ();
> >>    ^~~~~~
> >> # .MEM_2 = VDEF <.MEM_1(D)>
> >> UuT ();
> >> m2.c:9:10: note: output_location
> >>    return 0;
> >>           ^
> >> _3 = 0;
> >> m2.c: In function ‘UuT’:
> >> m2.c:3:16: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                 ^~~~~~~~
> >> true_var_3 = 1;
> >> m2.c:3:43: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                            ^~~~~~~~~
> >> false_var_4 = 0;
> >> m2.c:3:71: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>   ^~~
> >> ret_5 = 0;
> >> m2.c:3:83: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>               ^
> >> if (true_var_3 != 0)
> >> m2.c:3:114: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                              ^
> >> if (false_var_4 != 0)
> >> m2.c:3:145: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>    ~~~~^~~~~
> >> ret_7 = 111;
> >> m2.c:3:182: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>                                         ~~~~^~~~~
> >> ret_6 = 999;
> >> m2.c:3:215: note: output_location
> >>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
> >> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
> >> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
> >> count(#####) */ return ret; }
> >>                                                                      
> >>                                                                      
> >>                                                                      
> >>         ^~~
> >> _8 = ret_2;
> >> m2.c:3:215: note: output_location
> >> # VUSE <.MEM_9(D)>
> >> return _8;
> >>
> >> Which is not optimal, for some assignments I see just LHS 
> >> (false_var_4 = 0),

Note that

  unsigned int false_var = 0;

is not an assignment-expression, it's an initialization.  Only the
'0' here is parsed as an assignment-expression, but in this case
set_c_expr_source_range isn't called.

> > 
> > My first though was: are there assignments for which this isn't the
> > case?  The only one I see is the:
> >   ret = 999;
> >   ~~~~^~~~~
> > 
> > Are the locations for these assignments coming through from the
> > frontend?
> 
> Hi.
> 
> Actually not all, the default assignments are created in gimplifier and 
> location is assigned from DECL_EXPR:
> 
> (gdb) p debug_tree(*expr_p)
>  <decl_expr 0x7ffff6988c80
>     type <void_type 0x7ffff6878f18 void VOID
>         align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18
>         pointer_to_this <pointer_type 0x7ffff68800a8>>
>     side-effects
>     arg 0 <var_decl 0x7ffff7f9ae10 true_var
>         type <integer_type 0x7ffff6878690 unsigned int public unsigned SI
>             size <integer_cst 0x7ffff6860f18 constant 32>
>             unit size <integer_cst 0x7ffff6860f30 constant 4>
>             align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst 0x7ffff6860f00 4294967295>
>             pointer_to_this <pointer_type 0x7ffff6885dc8>>
>         used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>         align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff698b258 1>
>         chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type 0x7ffff6878690 unsigned int>
>             used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>             align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>>
>     /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23>
> 
> That explains why only LHS of these assignments is selected.
> 
> > 
> > I believe that in the C frontend these are assignment-expression, and
> > hence handled by c_parser_expr_no_commas; in particular the use of
> > op_location and the call:
> > 
> >   set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
> > 
> > ought to be setting up the caret of the assignment to be on the
> > operator token, and for the start/finish to range from the start of the
> > lhs to the end of the rhs i.e. what we see for:
> > 
> >   ret = 999;
> >   ~~~~^~~~~
> 
> Yep, MODIFY_EXPRs created in FE go this way and it's fine.
> 
> > 
> > 
> >> for return statements only a returned value is displayed. 
> > 
> > Is this running on SSA form?  If so, I wonder if you're running into
> > something like this:
> > 
> >   retval_N = PHI <lots of values>;
> >   return retval_N;
> > 
> > where it's using the location of that "return retval_N;" for all of the
> > return statements in the function, rather than the individual source
> > locations.
> 
> Yep, but we properly assign each assignment to a SSA name that's going to
> be merged in exit BB by PHI node:
> 
> _8 = ret_2;
> /tmp/m2.c:7:8: note: output_location
>  return ret; }
>         ^~~
> 
> Here the location comes from c_finish_return function where location
> comes from a value that's returned.
> 
> > 
> >> For conditions, only condition beginning is showed.
> >> Is this known behavior or do I miss
> >> something?
> > 
> > c_parser_if_statement has:
> > 
> >   loc = c_parser_peek_token (parser)->location;
> > 
> > which is that of the open-paren.  Maybe we should build a location
> > covering the range of the "if ( expression )" part of the if-statement?
> 
> Adding Marek as C FE maintainer to reply the question.

I suppose we could do better and I'd probably highlight just the expression
part of "if ( expression )".  But not sure how many use cases this range
location would have.

	Marek

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Help with rich_location and GIMPLE stmts
  2017-05-19 12:14     ` Marek Polacek
@ 2017-05-19 14:30       ` Martin Liška
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Liška @ 2017-05-19 14:30 UTC (permalink / raw)
  To: Marek Polacek; +Cc: David Malcolm, GCC Development

On 05/19/2017 02:14 PM, Marek Polacek wrote:
> On Thu, May 18, 2017 at 01:22:02PM +0200, Martin Liška wrote:
>> On 05/16/2017 09:14 PM, David Malcolm wrote:
>>> On Mon, 2017-05-15 at 15:36 +0200, Martin Liška wrote:
>>>> Hi.
>>>>
>>>> I sent this email to David some time ago, but it should be probably
>>>> answered
>>>> on gcc mailing list.
>>>
>>>> I have idea one to improve gcov tool and I'm interested in more
>>>> precise locations for gimple
>>>> statements. For gcov purpose, we dump location in ipa-profile pass,
>>>> which is an early IPA
>>>> pass and this data is used by gcov tool to map statements (blocks) to
>>>> lines of code.
>>>>
>>>> I did a small experiment on the place we emit the location data:
>>>> 		  inform (gimple_location (stmt), "output_location");
>>>>
>>>> and it shows for:
>>>> $ cat m2.c
>>>> unsigned int
>>>> UuT (void)
>>>> { unsigned int true_var = 1; unsigned int false_var = 0; unsigned int
>>>> ret = 0; if (true_var) /* count(1) */ { if (false_var) /* count(1) */
>>>> ret = 111; /* count(#####) */ } else ret = 999; /* count(#####) */
>>>> return ret; }
>>>>
>>>> int
>>>> main (int argc, char **argv)
>>>> {
>>>>   UuT ();
>>>>   return 0;
>>>> }
>>>>
>>>> $ gcc --coverage m2.c
>>>> m2.c: In function ‘main’:
>>>> m2.c:8:3: note: output_location
>>>>    UuT ();
>>>>    ^~~~~~
>>>> # .MEM_2 = VDEF <.MEM_1(D)>
>>>> UuT ();
>>>> m2.c:9:10: note: output_location
>>>>    return 0;
>>>>           ^
>>>> _3 = 0;
>>>> m2.c: In function ‘UuT’:
>>>> m2.c:3:16: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                 ^~~~~~~~
>>>> true_var_3 = 1;
>>>> m2.c:3:43: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                            ^~~~~~~~~
>>>> false_var_4 = 0;
>>>> m2.c:3:71: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>   ^~~
>>>> ret_5 = 0;
>>>> m2.c:3:83: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>               ^
>>>> if (true_var_3 != 0)
>>>> m2.c:3:114: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>                                              ^
>>>> if (false_var_4 != 0)
>>>> m2.c:3:145: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>                                                                      
>>>>    ~~~~^~~~~
>>>> ret_7 = 111;
>>>> m2.c:3:182: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>                                                                      
>>>>                                         ~~~~^~~~~
>>>> ret_6 = 999;
>>>> m2.c:3:215: note: output_location
>>>>  { unsigned int true_var = 1; unsigned int false_var = 0; unsigned
>>>> int ret = 0; if (true_var) /* count(1) */ { if (false_var) /*
>>>> count(1) */ ret = 111; /* count(#####) */ } else ret = 999; /*
>>>> count(#####) */ return ret; }
>>>>                                                                      
>>>>                                                                      
>>>>                                                                      
>>>>         ^~~
>>>> _8 = ret_2;
>>>> m2.c:3:215: note: output_location
>>>> # VUSE <.MEM_9(D)>
>>>> return _8;
>>>>
>>>> Which is not optimal, for some assignments I see just LHS 
>>>> (false_var_4 = 0),
> 
> Note that
> 
>   unsigned int false_var = 0;
> 
> is not an assignment-expression, it's an initialization.  Only the
> '0' here is parsed as an assignment-expression, but in this case
> set_c_expr_source_range isn't called.

Hello.

Yes, I noticed that it's not called.

> 
>>>
>>> My first though was: are there assignments for which this isn't the
>>> case?  The only one I see is the:
>>>   ret = 999;
>>>   ~~~~^~~~~
>>>
>>> Are the locations for these assignments coming through from the
>>> frontend?
>>
>> Hi.
>>
>> Actually not all, the default assignments are created in gimplifier and 
>> location is assigned from DECL_EXPR:
>>
>> (gdb) p debug_tree(*expr_p)
>>  <decl_expr 0x7ffff6988c80
>>     type <void_type 0x7ffff6878f18 void VOID
>>         align 8 symtab 0 alias set -1 canonical type 0x7ffff6878f18
>>         pointer_to_this <pointer_type 0x7ffff68800a8>>
>>     side-effects
>>     arg 0 <var_decl 0x7ffff7f9ae10 true_var
>>         type <integer_type 0x7ffff6878690 unsigned int public unsigned SI
>>             size <integer_cst 0x7ffff6860f18 constant 32>
>>             unit size <integer_cst 0x7ffff6860f30 constant 4>
>>             align 32 symtab 0 alias set -1 canonical type 0x7ffff6878690 precision 32 min <integer_cst 0x7ffff6860f48 0> max <integer_cst 0x7ffff6860f00 4294967295>
>>             pointer_to_this <pointer_type 0x7ffff6885dc8>>
>>         used unsigned SI file /tmp/m2.c line 4 col 16 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>>         align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff698b258 1>
>>         chain <var_decl 0x7ffff7f9aea0 false_var type <integer_type 0x7ffff6878690 unsigned int>
>>             used unsigned SI file /tmp/m2.c line 4 col 43 size <integer_cst 0x7ffff6860f18 32> unit size <integer_cst 0x7ffff6860f30 4>
>>             align 32 context <function_decl 0x7ffff697ce00 UuT> initial <integer_cst 0x7ffff6860f48 0> chain <var_decl 0x7ffff7f9af30 ret>>>
>>     /tmp/m2.c:4:16 start: /tmp/m2.c:4:16 finish: /tmp/m2.c:4:23>
>>
>> That explains why only LHS of these assignments is selected.
>>
>>>
>>> I believe that in the C frontend these are assignment-expression, and
>>> hence handled by c_parser_expr_no_commas; in particular the use of
>>> op_location and the call:
>>>
>>>   set_c_expr_source_range (&ret, lhs.get_start (), rhs.get_finish ());
>>>
>>> ought to be setting up the caret of the assignment to be on the
>>> operator token, and for the start/finish to range from the start of the
>>> lhs to the end of the rhs i.e. what we see for:
>>>
>>>   ret = 999;
>>>   ~~~~^~~~~
>>
>> Yep, MODIFY_EXPRs created in FE go this way and it's fine.
>>
>>>
>>>
>>>> for return statements only a returned value is displayed. 
>>>
>>> Is this running on SSA form?  If so, I wonder if you're running into
>>> something like this:
>>>
>>>   retval_N = PHI <lots of values>;
>>>   return retval_N;
>>>
>>> where it's using the location of that "return retval_N;" for all of the
>>> return statements in the function, rather than the individual source
>>> locations.
>>
>> Yep, but we properly assign each assignment to a SSA name that's going to
>> be merged in exit BB by PHI node:
>>
>> _8 = ret_2;
>> /tmp/m2.c:7:8: note: output_location
>>  return ret; }
>>         ^~~
>>
>> Here the location comes from c_finish_return function where location
>> comes from a value that's returned.
>>
>>>
>>>> For conditions, only condition beginning is showed.
>>>> Is this known behavior or do I miss
>>>> something?
>>>
>>> c_parser_if_statement has:
>>>
>>>   loc = c_parser_peek_token (parser)->location;
>>>
>>> which is that of the open-paren.  Maybe we should build a location
>>> covering the range of the "if ( expression )" part of the if-statement?
>>
>> Adding Marek as C FE maintainer to reply the question.
> 
> I suppose we could do better and I'd probably highlight just the expression
> part of "if ( expression )".  But not sure how many use cases this range
> location would have.

Works for me. I guess it can take some time to improve locations of GIMPLE expressions.
Anyhow, I can start enhancing gcov tool even with current locations. Having that, we can
provide users following kind of information:

    1|int b, c, d, e;
      
    2|
      
    3|int main()
      
    4|{
      ^1
    5|   int a = b < 1 ? (c < 3 ? d : c) : e;
                         ^1       ^1  ^0   ^0
    6|   return a;
      
    7|}

Where '^0' means the block (statements) are not executed.

Martin

> 
> 	Marek
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-05-19 14:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15 13:36 Help with rich_location and GIMPLE stmts Martin Liška
2017-05-16 19:14 ` David Malcolm
2017-05-18 11:22   ` Martin Liška
2017-05-19 12:14     ` Marek Polacek
2017-05-19 14:30       ` Martin Liška

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).