[PR24444] speed up locview resolution wiht relaxable frags

public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed

* [PR24444] speed up locview resolution wiht relaxable frags
@ 2019-04-13  8:57 Alexandre Oliva
  2019-04-14 13:22 ` Alan Modra
  0 siblings, 1 reply; 10+ messages in thread
From: Alexandre Oliva @ 2019-04-13  8:57 UTC (permalink / raw)
  To: binutils

Targets such as xtensa incur a much higher overhead to resolve
location view numbers than e.g. x86, because the expressions used to
compute view numbers cannot be resolved soon enough.

Each view number is computed by incrementing the previous view, if
they are both at the same address, or by resetting it to zero
otherwise.  If PV is the previous view number, PL is its location, and
NL is the location of the next view, its number is computed by
evaluating NV = !(NL > PL) * (PV + 1).

set_or_check_view uses resolve_expression to decide whether portions
of this expression can be simplified to constants.  The (NL > PL)
subexpression is one that can often be resolved to a constant,
breaking chains of view number computations at instructions of nonzero
length, but not after alignment that might be unnecessary.

Alas, when nearly every frag ends with a relaxable instruction,
frag_offset_fixed_p will correctly fail to determine a known offset
between two unresolved addresses in neighboring frags, so the
unresolved symbolic operation will be constructed and used in the
computation of most view numbers.  This results in very deep
expressions.

As view numbers get referenced in location view lists, each operand in
the list goes through symbol_clone_if_forward_ref, which recurses on
every subexpression.  If each view number were to be referenced, this
would exhibit O(n^2) behavior, where n is the depth of the view number
expressions, i.e., the length of view number sequences without an
early resolution that cuts the expression short.

I see two ways to go about this:

- make forward_ref a tri-state, so that symbol_clone_if_forward_ref
can avoid revisiting the same nodes over and over when they have
already been visited and the conclusion was that their expression does
not contain forward refs.

- enable address compares used by view numbering to be resolved even
when exact offsets are not known, using new logic to determine when
the location either remained the same or changed for sure, even with
the possibility of relaxation.

This patch implements the latter, assuming fr_fix to be the minimum
size the frag will end up with, even for machine-dependent frags, but
not rs_org: for those, or for negative fr_fix, it conservatively
refrains from resolving the expression.  This enables most view number
expressions to be resolved with a small, reasonable depth.

Now, is the assumption on fr_fix a reasonable one to make across all
supported targets?

Tested on x86_64-linux-gnu, native and cross to xtensa-elf.  No new
testcase; the issue was just performance, not output correctness.  Ok to
install?

for  gas/ChangeLog

	PR gas/24444
	* frags.c (frag_gtoffset_p): New.
	* frags.h (frag_gtoffset_p): Declare it.
	* expr.c (resolve_expression): Use it.
---
 gas/expr.c  |    5 +++-
 gas/frags.c |   74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gas/frags.h |    2 ++
 3 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/gas/expr.c b/gas/expr.c
index ee85bda1cc60..3efde88cc0a2 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -2179,7 +2179,10 @@ resolve_expression (expressionS *expressionP)
 		|| op == O_lt || op == O_le || op == O_ge || op == O_gt)
 	       && seg_left == seg_right
 	       && (finalize_syms
-		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off))
+		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off)
+		   || (op == O_gt
+		       && frag_gtoffset_p (left, frag_left,
+					   right, frag_right, &frag_off)))
 	       && (seg_left != reg_section || left == right)
 	       && (seg_left != undefined_section || add_symbol == op_symbol)))
 	{
diff --git a/gas/frags.c b/gas/frags.c
index 90096ff696d7..e2ff79fbcc04 100644
--- a/gas/frags.c
+++ b/gas/frags.c
@@ -460,3 +460,77 @@ frag_offset_fixed_p (const fragS *frag1, const fragS *frag2, offsetT *offset)

   return FALSE;
 }
+
+/* Return TRUE if we can determine whether FRAG1 OFF1 appears after
+   (strict >, not >=) FRAG2 OFF2, assuming it is not before.  Set
+   *OFFSET so that resolve_expression will resolve an O_gt operation
+   between them to false (0) if they are guaranteed to be at the same
+   location, or to true (-1) if they are guaranteed to be at different
+   locations.  Return FALSE conservatively, e.g. if neither result can
+   be guaranteed (yet).
+
+   They are known to be in the same segment, and not the same frag
+   (this is a fallback for frag_offset_fixed_p, that always takes care
+   of this case), and it is expected (from the uses this is designed
+   to simplify, namely location view increments) that frag1 is
+   reachable from frag2 following the fr_next links, rather than the
+   other way round.  */
+
+bfd_boolean
+frag_gtoffset_p (valueT off1, const fragS *frag1,
+		 valueT off2, const fragS *frag2, offsetT *offset)
+{
+  if (frag1 == frag2 || off2 > (valueT)frag2->fr_fix)
+    return FALSE;
+
+  /* If any frag from frag2, inclusive, to frag1, exclusive, has a
+     nonzero fixed length, the addresses won't be the same as long as
+     we reach frag1.  */
+  bfd_boolean maybe_same_addr = off1 == 0 && off2 == (valueT)frag2->fr_fix;
+  bfd_boolean found_varlength = FALSE;
+  const fragS *frag = frag2;
+  for (;;)
+    {
+      switch (frag->fr_type)
+	{
+	case rs_dummy:
+	  break;
+
+	case rs_fill:
+	case rs_align:
+	case rs_align_code:
+	case rs_align_test:
+	case rs_space:
+	case rs_space_nop:
+	case rs_fill_nop:
+	case rs_leb128:
+	case rs_cfa:
+	case rs_dwarf2dbg:
+	case rs_machine_dependent:
+	  found_varlength = TRUE;
+	  break;
+
+	default:
+     /* case rs_broken_word: */
+	case rs_org:
+	  return FALSE;
+	}
+
+      frag = frag->fr_next;
+      if (frag == frag1 || !frag)
+	break;
+      if (maybe_same_addr && frag->fr_fix > 0)
+	maybe_same_addr = FALSE;
+      if (frag->fr_fix < 0)
+	return FALSE;
+    }
+
+  /* If both are TRUE, we don't know whether or not it will resolve to
+     the same address.  */
+  if (maybe_same_addr && found_varlength)
+    return FALSE;
+
+  if (frag)
+    *offset = (off1 - off2 - !maybe_same_addr) * OCTETS_PER_BYTE;
+  return !!frag;
+}
diff --git a/gas/frags.h b/gas/frags.h
index 6ac5d4426c79..9eb066229085 100644
--- a/gas/frags.h
+++ b/gas/frags.h
@@ -154,6 +154,8 @@ char *frag_var (relax_stateT type,
 		char *opcode);

 bfd_boolean frag_offset_fixed_p (const fragS *, const fragS *, offsetT *);
+bfd_boolean frag_gtoffset_p (valueT, const fragS *, valueT, const fragS *,
+			     offsetT *);

 int get_frag_count (void);
 void clear_frag_count (void);

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free!         FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-13  8:57 [PR24444] speed up locview resolution wiht relaxable frags Alexandre Oliva
@ 2019-04-14 13:22 ` Alan Modra
  2019-04-16 13:24   ` Alexandre Oliva
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Modra @ 2019-04-14 13:22 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: binutils

On Sat, Apr 13, 2019 at 05:55:34AM -0300, Alexandre Oliva wrote:
> 
> Targets such as xtensa incur a much higher overhead to resolve
> location view numbers than e.g. x86, because the expressions used to
> compute view numbers cannot be resolved soon enough.
> 
> Each view number is computed by incrementing the previous view, if
> they are both at the same address, or by resetting it to zero
> otherwise.  If PV is the previous view number, PL is its location, and
> NL is the location of the next view, its number is computed by
> evaluating NV = !(NL > PL) * (PV + 1).
> 
> set_or_check_view uses resolve_expression to decide whether portions
> of this expression can be simplified to constants.  The (NL > PL)
> subexpression is one that can often be resolved to a constant,
> breaking chains of view number computations at instructions of nonzero
> length, but not after alignment that might be unnecessary.
> 
> Alas, when nearly every frag ends with a relaxable instruction,
> frag_offset_fixed_p will correctly fail to determine a known offset
> between two unresolved addresses in neighboring frags, so the
> unresolved symbolic operation will be constructed and used in the
> computation of most view numbers.  This results in very deep
> expressions.
> 
> As view numbers get referenced in location view lists, each operand in
> the list goes through symbol_clone_if_forward_ref, which recurses on
> every subexpression.  If each view number were to be referenced, this
> would exhibit O(n^2) behavior, where n is the depth of the view number
> expressions, i.e., the length of view number sequences without an
> early resolution that cuts the expression short.
> 
> I see two ways to go about this:
> 
> - make forward_ref a tri-state, so that symbol_clone_if_forward_ref
> can avoid revisiting the same nodes over and over when they have
> already been visited and the conclusion was that their expression does
> not contain forward refs.

This doesn't work.  I threw together a quick patch when I saw this bug
report and the result was a reduction in gas run-time from 89 to 80
minutes on the testcase in the PR.  The problem is simply the depth of
the expressions..

> - enable address compares used by view numbering to be resolved even
> when exact offsets are not known, using new logic to determine when
> the location either remained the same or changed for sure, even with
> the possibility of relaxation.

Whereas this approach does in fact result in shorter expressions,
reducing run-time to 3 seconds.

> This patch implements the latter, assuming fr_fix to be the minimum
> size the frag will end up with, even for machine-dependent frags, but
> not rs_org: for those, or for negative fr_fix, it conservatively
> refrains from resolving the expression.  This enables most view number
> expressions to be resolved with a small, reasonable depth.
> 
> Now, is the assumption on fr_fix a reasonable one to make across all
> supported targets?

Yes, I believe so.  In fact, fr_fix is really unsigned.  Also, it is
an error for rs_org frags to go backwards.

> Tested on x86_64-linux-gnu, native and cross to xtensa-elf.  No new
> testcase; the issue was just performance, not output correctness.  Ok to
> install?

Um, the testcase object file after this patch differs from the
original.  First readelf -wi difference shown below.

@@ -109941,7 +109941,7 @@
  <2><3e4f6>: Abbrev Number: 8 (DW_TAG_inlined_subroutine)
     <3e4f7>   DW_AT_abstract_origin: <0x2e531>
     <3e4fb>   DW_AT_entry_pc    : 0x2a
-    <3e4ff>   DW_AT_GNU_entry_view: 9
+    <3e4ff>   DW_AT_GNU_entry_view: 2
     <3e501>   DW_AT_low_pc      : 0x2a
     <3e505>   DW_AT_high_pc     : 0x0
     <3e509>   DW_AT_call_file   : 3


> 
> 
> for  gas/ChangeLog
> 
> 	PR gas/24444
> 	* frags.c (frag_gtoffset_p): New.
> 	* frags.h (frag_gtoffset_p): Declare it.
> 	* expr.c (resolve_expression): Use it.
> ---
>  gas/expr.c  |    5 +++-
>  gas/frags.c |   74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  gas/frags.h |    2 ++
>  3 files changed, 80 insertions(+), 1 deletion(-)
> 
> diff --git a/gas/expr.c b/gas/expr.c
> index ee85bda1cc60..3efde88cc0a2 100644
> --- a/gas/expr.c
> +++ b/gas/expr.c
> @@ -2179,7 +2179,10 @@ resolve_expression (expressionS *expressionP)
>  		|| op == O_lt || op == O_le || op == O_ge || op == O_gt)
>  	       && seg_left == seg_right
>  	       && (finalize_syms
> -		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off))
> +		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off)
> +		   || (op == O_gt
> +		       && frag_gtoffset_p (left, frag_left,
> +					   right, frag_right, &frag_off)))
>  	       && (seg_left != reg_section || left == right)
>  	       && (seg_left != undefined_section || add_symbol == op_symbol)))
>  	{
> diff --git a/gas/frags.c b/gas/frags.c
> index 90096ff696d7..e2ff79fbcc04 100644
> --- a/gas/frags.c
> +++ b/gas/frags.c
> @@ -460,3 +460,77 @@ frag_offset_fixed_p (const fragS *frag1, const fragS *frag2, offsetT *offset)
>  
>    return FALSE;
>  }
> +
> +/* Return TRUE if we can determine whether FRAG1 OFF1 appears after
> +   (strict >, not >=) FRAG2 OFF2, assuming it is not before.  Set
> +   *OFFSET so that resolve_expression will resolve an O_gt operation
> +   between them to false (0) if they are guaranteed to be at the same
> +   location, or to true (-1) if they are guaranteed to be at different
> +   locations.  Return FALSE conservatively, e.g. if neither result can
> +   be guaranteed (yet).
> +
> +   They are known to be in the same segment, and not the same frag
> +   (this is a fallback for frag_offset_fixed_p, that always takes care
> +   of this case), and it is expected (from the uses this is designed
> +   to simplify, namely location view increments) that frag1 is
> +   reachable from frag2 following the fr_next links, rather than the
> +   other way round.  */

For the sake of anyone reading this code, please swap the param names,
with frag1/off1 being the first in the frag chain.

> +
> +bfd_boolean
> +frag_gtoffset_p (valueT off1, const fragS *frag1,
> +		 valueT off2, const fragS *frag2, offsetT *offset)
> +{
> +  if (frag1 == frag2 || off2 > (valueT)frag2->fr_fix)
> +    return FALSE;
> +
> +  /* If any frag from frag2, inclusive, to frag1, exclusive, has a
> +     nonzero fixed length, the addresses won't be the same as long as
> +     we reach frag1.  */
> +  bfd_boolean maybe_same_addr = off1 == 0 && off2 == (valueT)frag2->fr_fix;

Ins't it true that if the symbol offsets are not at exactly the above
values, then you already know the result and there's no need to
traverse the frags?  This is assuming you can't .org backwards, which
is the case.

> +  bfd_boolean found_varlength = FALSE;
> +  const fragS *frag = frag2;
> +  for (;;)
> +    {
> +      switch (frag->fr_type)
> +	{
> +	case rs_dummy:
> +	  break;
> +
> +	case rs_fill:
> +	case rs_align:
> +	case rs_align_code:
> +	case rs_align_test:
> +	case rs_space:
> +	case rs_space_nop:
> +	case rs_fill_nop:
> +	case rs_leb128:
> +	case rs_cfa:
> +	case rs_dwarf2dbg:
> +	case rs_machine_dependent:
> +	  found_varlength = TRUE;
> +	  break;
> +
> +	default:
> +     /* case rs_broken_word: */
> +	case rs_org:
> +	  return FALSE;
> +	}
> +
> +      frag = frag->fr_next;
> +      if (frag == frag1 || !frag)
> +	break;
> +      if (maybe_same_addr && frag->fr_fix > 0)
> +	maybe_same_addr = FALSE;
> +      if (frag->fr_fix < 0)
> +	return FALSE;
> +    }
> +
> +  /* If both are TRUE, we don't know whether or not it will resolve to
> +     the same address.  */
> +  if (maybe_same_addr && found_varlength)
> +    return FALSE;
> +
> +  if (frag)
> +    *offset = (off1 - off2 - !maybe_same_addr) * OCTETS_PER_BYTE;
> +  return !!frag;
> +}
> diff --git a/gas/frags.h b/gas/frags.h
> index 6ac5d4426c79..9eb066229085 100644
> --- a/gas/frags.h
> +++ b/gas/frags.h
> @@ -154,6 +154,8 @@ char *frag_var (relax_stateT type,
>  		char *opcode);
>  
>  bfd_boolean frag_offset_fixed_p (const fragS *, const fragS *, offsetT *);
> +bfd_boolean frag_gtoffset_p (valueT, const fragS *, valueT, const fragS *,
> +			     offsetT *);
>  
>  int get_frag_count (void);
>  void clear_frag_count (void);

The following makes the changes I'm suggesting, and simplifies a few
more things.  The object file for the testcase is the same as with
your patch.


/* Return TRUE if we can determine whether FRAG2 OFF2 appears after
   (strict >, not >=) FRAG1 OFF1, assuming it is not before.  Set
   *OFFSET so that resolve_expression will resolve an O_gt operation
   between them to false (0) if they are guaranteed to be at the same
   location, or to true (-1) if they are guaranteed to be at different
   locations.  Return FALSE conservatively, e.g. if neither result can
   be guaranteed (yet).

   They are known to be in the same segment, and not the same frag
   (this is a fallback for frag_offset_fixed_p, that always takes care
   of this case), and it is expected (from the uses this is designed
   to simplify, namely location view increments) that frag2 is
   reachable from frag1 following the fr_next links, rather than the
   other way round.  */

bfd_boolean
frag_gtoffset_p (valueT off2, const fragS *frag2,
		 valueT off1, const fragS *frag1, offsetT *offset)
{
  /* Insanity check.  */
  if (frag2 == frag1 || off1 > (valueT) frag1->fr_fix)
    return FALSE;

  /* If the first symbol offset is at the end of the first frag and
     the second symbol offset at the beginning of the second frag then
     it is possible they are at the same address.  Go looking for a
     non-zero fr_fix in any frag between these frags.  If found then
     we can say the O_gt result will be true.  If no such frag is
     found we assume that frag1 or any of the following frags might
     have a variable tail and thus the answer is unknown.  This isn't
     strictly true; some frags don't have a variable tail, but it
     doesn't seem worth optimizing for those cases.
     We could omit the loop on off1 != frag1->fr_fix || off2 != 0  and
     exit the loop on finding a non-zero fr_fix, but loop until we hit
     frag2 to check our assumptions.  */
  const fragS *frag = frag1;
  offsetT delta = off2 - off1;
  for (;;)
    {
      delta += frag->fr_fix;
      frag = frag->fr_next;
      if (frag == frag2)
	{
	  if (delta == 0)
	    return FALSE;
	  break;
	}
      /* Insanity check.  */
      if (frag == NULL)
	return FALSE;
    }

  *offset = (off2 - off1 - 1) * OCTETS_PER_BYTE;
  return TRUE;
}

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-14 13:22 ` Alan Modra
@ 2019-04-16 13:24   ` Alexandre Oliva
  2019-04-17  1:32     ` Alan Modra
  2019-04-24 23:18     ` Alan Modra
  0 siblings, 2 replies; 10+ messages in thread
From: Alexandre Oliva @ 2019-04-16 13:24 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

On Apr 14, 2019, Alan Modra <amodra@gmail.com> wrote:

> Yes, I believe so.  In fact, fr_fix is really unsigned.  Also, it is
> an error for rs_org frags to go backwards.

Nice, that makes things simpler.

>> Tested on x86_64-linux-gnu, native and cross to xtensa-elf.  No new
>> testcase; the issue was just performance, not output correctness.  Ok to
>> install?

> Um, the testcase object file after this patch differs from the
> original.  First readelf -wi difference shown below.

Which testcase was that?  I didn't catch that one.

>> +  /* If any frag from frag2, inclusive, to frag1, exclusive, has a
>> +     nonzero fixed length, the addresses won't be the same as long as
>> +     we reach frag1.  */
>> +  bfd_boolean maybe_same_addr = off1 == 0 && off2 == (valueT)frag2->fr_fix;

> Ins't it true that if the symbol offsets are not at exactly the above
> values, then you already know the result and there's no need to
> traverse the frags?  This is assuming you can't .org backwards, which
> is the case.

Not entirely, no.  Although the comments state we assume a certain
ordering of frags, the assumption is only valid for O_gt operations
arising from the location view machinery.  It doesn't hurt if we resolve
other O_gt operations if we can find the result, but those could have
frags in the opposite order.

Now, O_gt is not exactly the operation the location view machinery
needs; O_ne (or, after negation, O_eq) would be more like it.  IIRC the
reason I went for O_gt was just that it was resolved early in a lot more
cases than O_eq.  Now, with this kind of fallback, we could actually
introduce another operation kind that (i) resolves to zero if frags are
not in the same (sub)section, and (ii) can safely make the optimization
you suggested for frags in the same (sub)section, with the extra benefit
that we know we won't ever search all the way to the end of the frag
linked list without finding the other frag: we could assert-check that,
and stop the search at the first nonzero-fr_fix.

Would a new op with this semantics (O_incview?, vs reset view) be
preferred?

> The following makes the changes I'm suggesting, and simplifies a few
> more things.

Thanks, I'll wait for a response to the question above before taking
further action on this, but I'll likely integrate your change (credited
in the ChangeLog) in the patch regardless.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free!         FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-16 13:24   ` Alexandre Oliva
@ 2019-04-17  1:32     ` Alan Modra
  2019-04-24 23:18     ` Alan Modra
  1 sibling, 0 replies; 10+ messages in thread
From: Alan Modra @ 2019-04-17  1:32 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: binutils

On Tue, Apr 16, 2019 at 10:24:04AM -0300, Alexandre Oliva wrote:
> On Apr 14, 2019, Alan Modra <amodra@gmail.com> wrote:
> 
> > Yes, I believe so.  In fact, fr_fix is really unsigned.  Also, it is
> > an error for rs_org frags to go backwards.
> 
> Nice, that makes things simpler.
> 
> >> Tested on x86_64-linux-gnu, native and cross to xtensa-elf.  No new
> >> testcase; the issue was just performance, not output correctness.  Ok to
> >> install?
> 
> > Um, the testcase object file after this patch differs from the
> > original.  First readelf -wi difference shown below.
> 
> Which testcase was that?  I didn't catch that one.

The https://sourceware.org/bugzilla/show_bug.cgi?id=24444 reproducer
attachment.

I see differences in DW_AT_GNU_entry_view, in all cases substituting
3 for 10, or 2 for 9 (lower number after your patch).  I don't know
enough to say whether this is a reasonable result.

> 
> >> +  /* If any frag from frag2, inclusive, to frag1, exclusive, has a
> >> +     nonzero fixed length, the addresses won't be the same as long as
> >> +     we reach frag1.  */
> >> +  bfd_boolean maybe_same_addr = off1 == 0 && off2 == (valueT)frag2->fr_fix;
> 
> > Ins't it true that if the symbol offsets are not at exactly the above
> > values, then you already know the result and there's no need to
> > traverse the frags?  This is assuming you can't .org backwards, which
> > is the case.
> 
> Not entirely, no.  Although the comments state we assume a certain
> ordering of frags, the assumption is only valid for O_gt operations
> arising from the location view machinery.  It doesn't hurt if we resolve
> other O_gt operations if we can find the result, but those could have
> frags in the opposite order.

Thanks for reminding me.  I wrote that comment before writing the
simplification of your patch, and there you'll note I did chase down
the second frag..

> Now, O_gt is not exactly the operation the location view machinery
> needs; O_ne (or, after negation, O_eq) would be more like it.  IIRC the
> reason I went for O_gt was just that it was resolved early in a lot more
> cases than O_eq.  Now, with this kind of fallback, we could actually
> introduce another operation kind that (i) resolves to zero if frags are
> not in the same (sub)section, and (ii) can safely make the optimization
> you suggested for frags in the same (sub)section, with the extra benefit
> that we know we won't ever search all the way to the end of the frag
> linked list without finding the other frag: we could assert-check that,
> and stop the search at the first nonzero-fr_fix.
> 
> Would a new op with this semantics (O_incview?, vs reset view) be
> preferred?

No, if O_gt works let's go with that.

> > The following makes the changes I'm suggesting, and simplifies a few
> > more things.
> 
> Thanks, I'll wait for a response to the question above before taking
> further action on this, but I'll likely integrate your change (credited
> in the ChangeLog) in the patch regardless.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-16 13:24   ` Alexandre Oliva
  2019-04-17  1:32     ` Alan Modra
@ 2019-04-24 23:18     ` Alan Modra
  2019-04-30  5:49       ` Alexandre Oliva
  1 sibling, 1 reply; 10+ messages in thread
From: Alan Modra @ 2019-04-24 23:18 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: binutils

I pushed the following for you.
----
Speed up locview resolution with relaxable frags

Targets such as xtensa incur a much higher overhead to resolve
location view numbers than e.g. x86, because the expressions used to
compute view numbers cannot be resolved soon enough.

Each view number is computed by incrementing the previous view, if
they are both at the same address, or by resetting it to zero
otherwise.  If PV is the previous view number, PL is its location, and
NL is the location of the next view, its number is computed by
evaluating NV = !(NL > PL) * (PV + 1).

set_or_check_view uses resolve_expression to decide whether portions
of this expression can be simplified to constants.  The (NL > PL)
subexpression is one that can often be resolved to a constant,
breaking chains of view number computations at instructions of nonzero
length, but not after alignment that might be unnecessary.

Alas, when nearly every frag ends with a relaxable instruction,
frag_offset_fixed_p will correctly fail to determine a known offset
between two unresolved addresses in neighboring frags, so the
unresolved symbolic operation will be constructed and used in the
computation of most view numbers.  This results in very deep
expressions.

As view numbers get referenced in location view lists, each operand in
the list goes through symbol_clone_if_forward_ref, which recurses on
every subexpression.  If each view number were to be referenced, this
would exhibit O(n^2) behavior, where n is the depth of the view number
expressions, i.e., the length of view number sequences without an
early resolution that cuts the expression short.

This patch enables address compares used by view numbering to be
resolved even when exact offsets are not known, using new logic to
determine when the location either remained the same or changed for
sure, even with the possibility of relaxation.  This enables most view
number expressions to be resolved with a small, reasonable depth.

	PR gas/24444
	* frags.c (frag_gtoffset_p): New.
	* frags.h (frag_gtoffset_p): Declare it.
	* expr.c (resolve_expression): Use it.

diff --git a/gas/ChangeLog b/gas/ChangeLog
index eeabe0ccae..8558ecd7b4 100644
--- a/gas/ChangeLog
+++ b/gas/ChangeLog
@@ -1,3 +1,11 @@
+2019-04-25  Alexandre Oliva  <aoliva@redhat.com>
+	    Alan Modra  <amodra@gmail.com>
+
+	PR gas/24444
+	* frags.c (frag_gtoffset_p): New.
+	* frags.h (frag_gtoffset_p): Declare it.
+	* expr.c (resolve_expression): Use it.
+
 2019-04-24  Alan Modra  <amodra@gmail.com>
 
 	PR 24444
diff --git a/gas/expr.c b/gas/expr.c
index ee85bda1cc..3efde88cc0 100644
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -2179,7 +2179,10 @@ resolve_expression (expressionS *expressionP)
 		|| op == O_lt || op == O_le || op == O_ge || op == O_gt)
 	       && seg_left == seg_right
 	       && (finalize_syms
-		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off))
+		   || frag_offset_fixed_p (frag_left, frag_right, &frag_off)
+		   || (op == O_gt
+		       && frag_gtoffset_p (left, frag_left,
+					   right, frag_right, &frag_off)))
 	       && (seg_left != reg_section || left == right)
 	       && (seg_left != undefined_section || add_symbol == op_symbol)))
 	{
diff --git a/gas/frags.c b/gas/frags.c
index a9fc003fbd..2f21b9db20 100644
--- a/gas/frags.c
+++ b/gas/frags.c
@@ -462,3 +462,58 @@ frag_offset_fixed_p (const fragS *frag1, const fragS *frag2, offsetT *offset)
 
   return FALSE;
 }
+
+/* Return TRUE if we can determine whether FRAG2 OFF2 appears after
+   (strict >, not >=) FRAG1 OFF1, assuming it is not before.  Set
+   *OFFSET so that resolve_expression will resolve an O_gt operation
+   between them to false (0) if they are guaranteed to be at the same
+   location, or to true (-1) if they are guaranteed to be at different
+   locations.  Return FALSE conservatively, e.g. if neither result can
+   be guaranteed (yet).
+
+   They are known to be in the same segment, and not the same frag
+   (this is a fallback for frag_offset_fixed_p, that always takes care
+   of this case), and it is expected (from the uses this is designed
+   to simplify, namely location view increments) that frag2 is
+   reachable from frag1 following the fr_next links, rather than the
+   other way round.  */
+
+bfd_boolean
+frag_gtoffset_p (valueT off2, const fragS *frag2,
+		 valueT off1, const fragS *frag1, offsetT *offset)
+{
+  /* Insanity check.  */
+  if (frag2 == frag1 || off1 > frag1->fr_fix)
+    return FALSE;
+
+  /* If the first symbol offset is at the end of the first frag and
+     the second symbol offset at the beginning of the second frag then
+     it is possible they are at the same address.  Go looking for a
+     non-zero fr_fix in any frag between these frags.  If found then
+     we can say the O_gt result will be true.  If no such frag is
+     found we assume that frag1 or any of the following frags might
+     have a variable tail and thus the answer is unknown.  This isn't
+     strictly true; some frags don't have a variable tail, but it
+     doesn't seem worth optimizing for those cases.  */
+  const fragS *frag = frag1;
+  offsetT delta = off2 - off1;
+  for (;;)
+    {
+      delta += frag->fr_fix;
+      frag = frag->fr_next;
+      if (frag == frag2)
+	{
+	  if (delta == 0)
+	    return FALSE;
+	  break;
+	}
+      /* If we run off the end of the frag chain then we have a case
+	 where frag2 is not after frag1, ie. an O_gt expression not
+	 created for .loc view.  */
+      if (frag == NULL)
+	return FALSE;
+    }
+
+  *offset = (off2 - off1 - delta) * OCTETS_PER_BYTE;
+  return TRUE;
+}
diff --git a/gas/frags.h b/gas/frags.h
index 741ae2ba22..6d84aaab3f 100644
--- a/gas/frags.h
+++ b/gas/frags.h
@@ -154,6 +154,8 @@ char *frag_var (relax_stateT type,
 		char *opcode);
 
 bfd_boolean frag_offset_fixed_p (const fragS *, const fragS *, offsetT *);
+bfd_boolean frag_gtoffset_p (valueT, const fragS *, valueT, const fragS *,
+			     offsetT *);
 
 int get_frag_count (void);
 void clear_frag_count (void);

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-24 23:18     ` Alan Modra
@ 2019-04-30  5:49       ` Alexandre Oliva
  2019-05-03  1:14         ` Alan Modra
  2019-05-04  3:22         ` Alexandre Oliva
  0 siblings, 2 replies; 10+ messages in thread
From: Alexandre Oliva @ 2019-04-30  5:49 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

On Apr 24, 2019, Alan Modra <amodra@gmail.com> wrote:

> I pushed the following for you.

Thanks for taking care of this while I convalesced.

I'm afraid we may still have a problem I hadn't realized before.

Consider two text subsections, say 0 and 1.

We switch from 0 to 1 and output a loc with a view.

Later we switch back, and also output a loc with a view.

Once addresses are resolved, one of these will have necessarily gone
from a higher offset to a lower offset within the text section.  In rare
cases, the subsection boundary may actually have locviews in both
subsections, at the same address.

O_gt would give us an answer that does not match the expectation that we
should reset view when PC changes.

However, it may be worse than that.  Loc view expressions are chained
back to the previous loc view present in the asm input, even if it's in
a different section.

This means O_gt on loc addresses will give us the wrong result when we
switch between subsections from higher to lower address, and it will
also likely give us the wrong result at the boundary.

Consider:

.text 0
.loc 1 1 view -0
nop
.L2: .loc 1 2 view .LV2
.text 1
.L3: .loc 1 3 view .LV3
nop
.L4: .loc 1 4 view .LV4
.text 0
.L5: .loc 1 5 view .LV5

We'll compute:

.LV3 = (!(.L3 > .L2)) * (.Lv2 + 1)
.Lv5 = (!(.L5 > .L4)) * (.LV4 + 1)

while we should really compute:

.Lv3 = (.L3 == .L5) * (.LV5 + 1)
.LV5 = (.L5 == .L2) * (.LV2 + 1)

because the reordered text section, based on which we output the line
number program, is:

.text
.loc 1 1 view -0
nop
.L2: .loc 1 2 view .LV2 # 0
.L5: .loc 1 5 view .LV5 # 1
# --- transition from .text 0 to .text 1
.L3: .loc 1 3 view .LV3 # 2
nop
.L4: .loc 1 4 view .LV4 # 0

AFAICT fixing this will require introducing artificial undefined symbols
to be used as previous view label and previous view number for each
subsection, and then, when ordering subsections and sections, defining
the initial previous symbols of each subsection as equal to the final
previous symbols of the previous subsection in the same section.

Maybe with this arrangement we can keep on using O_gt, though reducing
the depth of the loc view expressions feels appealing in the context of
this PR.

Any better ideas?

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free!         FSF Latin America board member
GNU Toolchain Engineer                Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-30  5:49       ` Alexandre Oliva
@ 2019-05-03  1:14         ` Alan Modra
  2019-05-04  0:56           ` Alexandre Oliva
  2019-05-04  3:22         ` Alexandre Oliva
  1 sibling, 1 reply; 10+ messages in thread
From: Alan Modra @ 2019-05-03  1:14 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: binutils

On Tue, Apr 30, 2019 at 02:49:38AM -0300, Alexandre Oliva wrote:
> AFAICT fixing this will require introducing artificial undefined symbols
> to be used as previous view label and previous view number for each
> subsection, and then, when ordering subsections and sections, defining
> the initial previous symbols of each subsection as equal to the final
> previous symbols of the previous subsection in the same section.

Don't you already keep struct line_entry on per-subseg lists?  Frags
too are kept on per-subseg lists until chain_frchains_together.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-05-03  1:14         ` Alan Modra
@ 2019-05-04  0:56           ` Alexandre Oliva
  0 siblings, 0 replies; 10+ messages in thread
From: Alexandre Oliva @ 2019-05-04  0:56 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

On May  2, 2019, Alan Modra <amodra@gmail.com> wrote:

> On Tue, Apr 30, 2019 at 02:49:38AM -0300, Alexandre Oliva wrote:
>> AFAICT fixing this will require introducing artificial undefined symbols
>> to be used as previous view label and previous view number for each
>> subsection, and then, when ordering subsections and sections, defining
>> the initial previous symbols of each subsection as equal to the final
>> previous symbols of the previous subsection in the same section.

> Don't you already keep struct line_entry on per-subseg lists?

Yeah, sorry.  I had indeed started out worried about that, but I failed
to remove some traces of that suspicion from my message even after I
narrowed the problem down to the boundaries.

This testcase demonstrates the problem: the generated line number
program has nonzero views at subsection boundaries, but the view numbers
computed and assigned by the view expressions do not match those implied
by the line number program, because we assume a view reset when first
entering a subsection.

This would only be appropriate (and even then somewhat problematic) if
we generated a separate line number program, or forced a view reset in
the line number program at subsection boundaries.

/* Test view numbering continuity at subsection borders.

   Copyright (C) 2017-2019 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

	.file "dwarf2-19.c"
	.text 0
	.balign 4
	.globl _start
_start:
	.file 1 "dwarf2-19.c"
	.loc 1 1 view 0

	.section .rodata
	.uleb128 .L1
	.uleb128 .L3
	.uleb128 .L4
	.uleb128 .L2

	.text 1
	.loc 1 2 view .L1 	/* same address as view 0 above -> view 1 */

	.text 2
	.loc 1 3 view .L2	/* same address as .L4 below -> view 2 */

	.text 1
	.dc.l 0
	.loc 1 4 view .L3	/* bumped address from .L1's, view 0 */
	.loc 1 5 view .L4	/* same address, view 1 */

#as:
#readelf: -x.rodata -wL
#name: DWARF2 19
# The am33 avr cr16 crx ft32 mn10 msp430 nds32 pru rl78 and xtensa targets do not evaluate the subtraction of symbols at assembly time.
# The riscv targets do not support the subtraction of symbols.
# The tile targets require 8-byte instructions, but the test only simulates 4-byte instructions.
#notarget: am3*-* avr-* cr16-* crx-* ft32*-* mn10*-* msp430-* nds32*-* pru-* riscv*-* rl78-* tile*-* xtensa-*

Hex dump of section '\.rodata':
  0x00000000 01000102 *.*

Contents of the \.debug_line section:

CU: dwarf2-19\.c:
File name  *Line number  *Starting address  *View +Stmt
dwarf2-19\.c  *1  *0 +x
dwarf2-19\.c  *2  *0  *1 +x
dwarf2-19\.c  *4  *0x4 +x
dwarf2-19\.c  *5  *0x4  *1 +x
dwarf2-19\.c  *3  *0x4  *2 +x

-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free!                 FSF Latin America board member
GNU Toolchain Engineer                        Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-04-30  5:49       ` Alexandre Oliva
  2019-05-03  1:14         ` Alan Modra
@ 2019-05-04  3:22         ` Alexandre Oliva
  2019-05-04  5:44           ` Alan Modra
  1 sibling, 1 reply; 10+ messages in thread
From: Alexandre Oliva @ 2019-05-04  3:22 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

On Apr 30, 2019, Alexandre Oliva <aoliva@redhat.com> wrote:

> .L5: .loc 1 5 view .LV5 # 1
> # --- transition from .text 0 to .text 1
> .L3: .loc 1 3 view .LV3 # 2

> AFAICT fixing this will require introducing artificial undefined symbols
> to be used as previous view label and previous view number for each
> subsection, and then, when ordering subsections and sections, defining
> the initial previous symbols of each subsection as equal to the final
> previous symbols of the previous subsection in the same section.

I ended up delaying the view assignment for heads of subsegments.
Here's the patch.  Tested on x86_64-linux-gnu native.  Ok to install?


[LVu] base subseg head view on prev subseg's tail

Location views at borders between subsegments/subsections in the same
segment/section are computed as if each new subsegment/subsection
started with a forced view reset to zero, but the line number program
does not introduce resets that are not explicitly requested, so if a
subsegment ends at the same address another starts, the line number
program will have a continuity of views at the border address, whereas
the initial view number label in the latter subsegment will be
miscomputed as zero.

This patch delays the assignment of view expressions at subsegment
heads to the time of chaining the frags of subsegments into a single
segment, so that they are set based on the view at the end of the
previous subsegment in the same segment.

The line number program created for the test program had an
unnecessary DW_LNS_advance_pc at the end.  This patch also arranges
for us not to emit it.


for  gas/ChangeLog

	* dwarf2dbg.c (set_or_check_view): Skip heads when assigning
	views of prior locs.
	(dwarf2_gen_line_info_1): Skip heads.
	(size_inc_line_addr, emit_inc_line_addr): Drop
	DW_LNS_advance_pc for zero addr delta.
	(dwarf2_finish): Assign views for heads of segments.
	* testsuite/gas/elf/dwarf2-19.d: New.
	* testsuite/gas/elf/dwarf2-19.s: New.
	* testsuite/gas/elf/elf.exp: Test it.
---
 gas/dwarf2dbg.c                   |   37 +++++++++++++++++++++++++++++----
 gas/testsuite/gas/elf/dwarf2-19.d |   21 +++++++++++++++++++
 gas/testsuite/gas/elf/dwarf2-19.s |   41 +++++++++++++++++++++++++++++++++++++
 gas/testsuite/gas/elf/elf.exp     |    1 +
 4 files changed, 95 insertions(+), 5 deletions(-)
 create mode 100644 gas/testsuite/gas/elf/dwarf2-19.d
 create mode 100644 gas/testsuite/gas/elf/dwarf2-19.s

diff --git a/gas/dwarf2dbg.c b/gas/dwarf2dbg.c
index 2d316ddcb9ca..b77751d8af3d 100644
--- a/gas/dwarf2dbg.c
+++ b/gas/dwarf2dbg.c
@@ -442,7 +442,16 @@ set_or_check_view (struct line_entry *e, struct line_entry *p,
       gas_assert (r == p);
       /* Set or check views until we find a defined or absent view.  */
       do
-	set_or_check_view (r, r->next, NULL);
+	{
+	  /* Do not define the head of a (sub?)segment view while
+	     handling others.  It would be defined too early, without
+	     regard to the last view of other subsegments.
+	     set_or_check_view will be called for every head segment
+	     that needs it.  */
+	  if (r == h)
+	    break;
+	  set_or_check_view (r, r->next, NULL);
+	}
       while (r->next && r->next->loc.view && !S_IS_DEFINED (r->next->loc.view)
 	     && (r = r->next));
 
@@ -454,6 +463,11 @@ set_or_check_view (struct line_entry *e, struct line_entry *p,
 	 simplify the view expressions, until we do so to P.  */
       do
 	{
+	  /* The head view of a subsegment may remain undefined while
+	     handling other elements, before it is linked to the last
+	     view of the previous subsegment.  */
+	  if (r == h)
+	    continue;
 	  gas_assert (S_IS_DEFINED (r->loc.view));
 	  resolve_expression (symbol_get_value_expression (r->loc.view));
 	}
@@ -480,9 +494,11 @@ dwarf2_gen_line_info_1 (symbolS *label, struct dwarf2_line_info *loc)
 
   lss = get_line_subseg (now_seg, now_subseg, TRUE);
 
-  if (loc->view)
+  /* Subseg heads are chained to previous subsegs in
+     dwarf2_finish.  */
+  if (loc->view && lss->head)
     set_or_check_view (e,
-		       !lss->head ? NULL : (struct line_entry *)lss->ptail,
+		       (struct line_entry *)lss->ptail,
 		       lss->head);
 
   *lss->ptail = e;
@@ -1176,7 +1192,7 @@ size_inc_line_addr (int line_delta, addressT addr_delta)
     {
       if (addr_delta == MAX_SPECIAL_ADDR_DELTA)
 	len = 1;
-      else
+      else if (addr_delta)
 	len = 1 + sizeof_leb128 (addr_delta, 0);
       return len + 3;
     }
@@ -1240,7 +1256,7 @@ emit_inc_line_addr (int line_delta, addressT addr_delta, char *p, int len)
     {
       if (addr_delta == MAX_SPECIAL_ADDR_DELTA)
 	*p++ = DW_LNS_const_add_pc;
-      else
+      else if (addr_delta)
 	{
 	  *p++ = DW_LNS_advance_pc;
 	  p += output_leb128 (p, addr_delta, 0);
@@ -2218,8 +2234,19 @@ dwarf2_finish (void)
       struct line_subseg *lss = s->head;
       struct line_entry **ptail = lss->ptail;
 
+      /* Reset the initial view of the first subsection of the
+	 section.  */
+      if (lss->head && lss->head->loc.view)
+	set_or_check_view (lss->head, NULL, NULL);
+
       while ((lss = lss->next) != NULL)
 	{
+	  /* Link the first view of subsequent subsections to the
+	     previous view.  */
+	  if (lss->head && lss->head->loc.view)
+	    set_or_check_view (lss->head,
+			       !s->head ? NULL : (struct line_entry *)ptail,
+			       s->head ? s->head->head : NULL);
 	  *ptail = lss->head;
 	  ptail = lss->ptail;
 	}
diff --git a/gas/testsuite/gas/elf/dwarf2-19.d b/gas/testsuite/gas/elf/dwarf2-19.d
new file mode 100644
index 000000000000..ebb3bf984b1c
--- /dev/null
+++ b/gas/testsuite/gas/elf/dwarf2-19.d
@@ -0,0 +1,21 @@
+#as:
+#readelf: -x.rodata -wL
+#name: DWARF2 19
+# The am33 avr cr16 crx ft32 mn10 msp430 nds32 pru rl78 and xtensa targets do not evaluate the subtraction of symbols at assembly time.
+# The riscv targets do not support the subtraction of symbols.
+# The tile targets require 8-byte instructions, but the test only simulates 4-byte instructions.
+#notarget: am3*-* avr-* cr16-* crx-* ft32*-* mn10*-* msp430-* nds32*-* pru-* riscv*-* rl78-* tile*-* xtensa-*
+
+Hex dump of section '\.rodata':
+  0x00000000 01000102 *.*
+
+Contents of the \.debug_line section:
+
+CU: dwarf2-19\.c:
+File name  *Line number  *Starting address  *View +Stmt
+dwarf2-19\.c  *1  *0 +x
+dwarf2-19\.c  *2  *0  *1 +x
+dwarf2-19\.c  *4  *0x4 +x
+dwarf2-19\.c  *5  *0x4  *1 +x
+dwarf2-19\.c  *3  *0x4  *2 +x
+dwarf2-19\.c  *3  *0x4  *3 +x
diff --git a/gas/testsuite/gas/elf/dwarf2-19.s b/gas/testsuite/gas/elf/dwarf2-19.s
new file mode 100644
index 000000000000..dd87be8fc9c1
--- /dev/null
+++ b/gas/testsuite/gas/elf/dwarf2-19.s
@@ -0,0 +1,41 @@
+/* Test view numbering continuity at subsection borders.
+
+   Copyright (C) 2017-2019 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+	.file "dwarf2-19.c"
+	.text 0
+	.balign 4
+	.globl _start
+_start:
+	.file 1 "dwarf2-19.c"
+	.loc 1 1 view 0
+
+	.section .rodata
+	.uleb128 .L1
+	.uleb128 .L3
+	.uleb128 .L4
+	.uleb128 .L2
+
+	.text 1
+	.loc 1 2 view .L1 	/* same address as view 0 above -> view 1 */
+
+	.text 2
+	.loc 1 3 view .L2	/* same address as .L4 below -> view 2 */
+
+	.text 1
+	.dc.l 0
+	.loc 1 4 view .L3	/* bumped address from .L1's, view 0 */
+	.loc 1 5 view .L4	/* same address, view 1 */
diff --git a/gas/testsuite/gas/elf/elf.exp b/gas/testsuite/gas/elf/elf.exp
index d616d5de7662..01d8b9dcb5f2 100644
--- a/gas/testsuite/gas/elf/elf.exp
+++ b/gas/testsuite/gas/elf/elf.exp
@@ -251,6 +251,7 @@ if { [is_elf_format] } then {
     run_dump_test "dwarf2-16"
     run_dump_test "dwarf2-17"
     run_dump_test "dwarf2-18"
+    run_dump_test "dwarf2-19"
     run_dump_test "bss"
     run_dump_test "bad-bss"
     run_dump_test "bad-section-flag"


-- 
Alexandre Oliva, freedom fighter  he/him   https://FSFLA.org/blogs/lxo
Be the change, be Free!                 FSF Latin America board member
GNU Toolchain Engineer                        Free Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás - Che GNUevara

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PR24444] speed up locview resolution wiht relaxable frags
  2019-05-04  3:22         ` Alexandre Oliva
@ 2019-05-04  5:44           ` Alan Modra
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Modra @ 2019-05-04  5:44 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: binutils

On Sat, May 04, 2019 at 12:22:22AM -0300, Alexandre Oliva wrote:
> for  gas/ChangeLog
> 
> 	* dwarf2dbg.c (set_or_check_view): Skip heads when assigning
> 	views of prior locs.
> 	(dwarf2_gen_line_info_1): Skip heads.
> 	(size_inc_line_addr, emit_inc_line_addr): Drop
> 	DW_LNS_advance_pc for zero addr delta.
> 	(dwarf2_finish): Assign views for heads of segments.
> 	* testsuite/gas/elf/dwarf2-19.d: New.
> 	* testsuite/gas/elf/dwarf2-19.s: New.
> 	* testsuite/gas/elf/elf.exp: Test it.

Looks good.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-05-04  5:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-13  8:57 [PR24444] speed up locview resolution wiht relaxable frags Alexandre Oliva
2019-04-14 13:22 ` Alan Modra
2019-04-16 13:24   ` Alexandre Oliva
2019-04-17  1:32     ` Alan Modra
2019-04-24 23:18     ` Alan Modra
2019-04-30  5:49       ` Alexandre Oliva
2019-05-03  1:14         ` Alan Modra
2019-05-04  0:56           ` Alexandre Oliva
2019-05-04  3:22         ` Alexandre Oliva
2019-05-04  5:44           ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).