public inbox for gcc-patches@gcc.gnu.org
* [PR106746] drop cselib addr lookup in debug insn mem
@ 2023-01-14 11:26 Alexandre Oliva
  2023-01-16  7:29 ` Richard Biener
  2023-01-27 23:18 ` [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463] Jakub Jelinek
  0 siblings, 2 replies; 5+ messages in thread
From: Alexandre Oliva @ 2023-01-14 11:26 UTC
  To: gcc-patches


The testcase used to get scheduled differently depending on the
presence of debug insns with MEMs.  It's not clear to me why those
MEMs affected scheduling, but the cselib pre-canonicalization of the
MEM address is not used at all when analyzing debug insns, so the
memory allocation and lookup are pure waste.  Somehow, avoiding that
waste fixes the problem, or makes it go latent.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

	PR debug/106746
	* sched-deps.cc (sched_analyze_2): Skip cselib address lookup
	within debug insns.

for  gcc/testsuite/ChangeLog

	PR debug/106746
	* gcc.target/i386/pr106746.c: New.
---
 gcc/sched-deps.cc                        |   36 +++++++++++++++---------------
 gcc/testsuite/gcc.target/i386/pr106746.c |   29 ++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106746.c

diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc
index f9371b81fb41e..a9214f674329a 100644
--- a/gcc/sched-deps.cc
+++ b/gcc/sched-deps.cc
@@ -2605,26 +2605,26 @@ sched_analyze_2 (class deps_desc *deps, rtx x, rtx_insn *insn)
 
     case MEM:
       {
-	/* Reading memory.  */
-	rtx_insn_list *u;
-	rtx_insn_list *pending;
-	rtx_expr_list *pending_mem;
-	rtx t = x;
-
-	if (sched_deps_info->use_cselib)
-	  {
-	    machine_mode address_mode = get_address_mode (t);
-
-	    t = shallow_copy_rtx (t);
-	    cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1,
-				     GET_MODE (t), insn);
-	    XEXP (t, 0)
-	      = cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t),
-						  insn);
-	  }
-
 	if (!DEBUG_INSN_P (insn))
 	  {
+	    /* Reading memory.  */
+	    rtx_insn_list *u;
+	    rtx_insn_list *pending;
+	    rtx_expr_list *pending_mem;
+	    rtx t = x;
+
+	    if (sched_deps_info->use_cselib)
+	      {
+		machine_mode address_mode = get_address_mode (t);
+
+		t = shallow_copy_rtx (t);
+		cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1,
+					 GET_MODE (t), insn);
+		XEXP (t, 0)
+		  = cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t),
+						      insn);
+	      }
+
 	    t = canon_rtx (t);
 	    pending = deps->pending_read_insns;
 	    pending_mem = deps->pending_read_mems;
diff --git a/gcc/testsuite/gcc.target/i386/pr106746.c b/gcc/testsuite/gcc.target/i386/pr106746.c
new file mode 100644
index 0000000000000..14f7dab71d691
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106746.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsched2-use-superblocks -fcompare-debug -Wno-psabi" } */
+
+typedef char __attribute__((__vector_size__ (64))) U;
+typedef short __attribute__((__vector_size__ (64))) V;
+typedef int __attribute__((__vector_size__ (64))) W;
+
+char c;
+U a;
+U *r;
+W foo0_v512u32_0;
+
+void
+foo (W)
+{
+  U u;
+  V v;
+  W w = __builtin_shuffle (foo0_v512u32_0, foo0_v512u32_0);
+  u =
+    __builtin_shufflevector (a, u, 3, 0, 4, 9, 9, 6, 7, 8, 5,
+			     0, 6, 1, 8, 1, 2, 8, 6,
+			     1, 8, 4, 9, 3, 8, 4, 6, 0, 9, 0, 1, 8, 2, 3, 3,
+			     0, 4, 9, 9, 6, 7, 8, 5,
+			     0, 6, 1, 8, 1, 2, 8, 6,
+			     1, 8, 4, 9, 3, 8, 4, 6, 0, 9, 0, 1, 8, 2, 3);
+  v *= c;
+  w &= c;
+  *r = (U) v + (U) w;
+}


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>


* Re: [PR106746] drop cselib addr lookup in debug insn mem
  2023-01-14 11:26 [PR106746] drop cselib addr lookup in debug insn mem Alexandre Oliva
@ 2023-01-16  7:29 ` Richard Biener
  2023-01-27 23:18 ` [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463] Jakub Jelinek
  1 sibling, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-01-16  7:29 UTC
  To: Alexandre Oliva; +Cc: gcc-patches

On Sat, Jan 14, 2023 at 12:26 PM Alexandre Oliva via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> The testcase used to get scheduled differently depending on the
> presence of debug insns with MEMs.  It's not clear to me why those
> MEMs affected scheduling, but the cselib pre-canonicalization of the
> MEM address is not used at all when analyzing debug insns, so the
> memory allocation and lookup are pure waste.  Somehow, avoiding that
> waste fixes the problem, or makes it go latent.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.



* [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463]
  2023-01-14 11:26 [PR106746] drop cselib addr lookup in debug insn mem Alexandre Oliva
  2023-01-16  7:29 ` Richard Biener
@ 2023-01-27 23:18 ` Jakub Jelinek
  2023-02-02 11:16   ` Alexandre Oliva
  1 sibling, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2023-01-27 23:18 UTC
  To: Richard Biener, Jeff Law, Alexandre Oliva; +Cc: gcc-patches

Hi!

On Sat, Jan 14, 2023 at 08:26:00AM -0300, Alexandre Oliva via Gcc-patches wrote:
> The testcase used to get scheduled differently depending on the
> presence of debug insns with MEMs.  It's not clear to me why those
> MEMs affected scheduling, but the cselib pre-canonicalization of the
> MEM address is not used at all when analyzing debug insns, so the
> memory allocation and lookup are pure waste.  Somehow, avoiding that
> waste fixes the problem, or makes it go latent.

Unfortunately, this patch breaks the testcase added below (PR108463).
The code in sched_analyze_2 did 2 things:
1) cselib_lookup_from_insn
2) shallow_copy_rtx + cselib_subst_to_values_from_insn
Now, 1) is a precondition of 2): we can only subst the VALUEs if we
have actually looked the address up.  But as can be seen on that testcase,
we rely on at least 1) being done, because we subst the values
later on even in DEBUG_INSNs and actually use those when needed.
cselib_subst_to_values_from_insn mostly just replaces stuff in the
returned rtx, except for:
      /* This used to happen for autoincrements, but we deal with them
         properly now.  Remove the if stmt for the next release.  */
      if (! e)
        {
          /* Assign a value that doesn't match any other.  */
          e = new_cselib_val (next_uid, GET_MODE (x), x);
        }
which has been like that since 2011.  I hope it is never reachable, and in
stage1 we should replace it with a gcc_assert or just remove it (then it
will segfault on the following
      return e->val_rtx;
).

So, as done in the patch below, I reinstated 1) but not 2) for
DEBUG_INSNs.  This fixed the new testcase, but broke the PR106746
testcases again.
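
As a toy illustration of the ordering constraint between 1) and 2) described above, the following sketch models a value table in which substitution is only valid for addresses that were looked up first.  It is not GCC code: `ToyCselib`, `lookup`, `subst_to_value` and `analyze_debug_mem` are hypothetical names, and addresses are plain strings rather than rtxes.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for cselib's value table: address expression -> VALUE id.
// All names here are hypothetical; nothing below is the real cselib API.
struct ToyCselib {
  std::map<std::string, int> values;
  int next_uid = 1;

  // Step 1: look the address up, creating a VALUE when `create` is set.
  // Returns 0 when the address is unknown and `create` is false.
  int lookup(const std::string &addr, bool create) {
    auto it = values.find(addr);
    if (it != values.end())
      return it->second;
    if (!create)
      return 0;
    int id = next_uid++;
    values.emplace(addr, id);
    return id;
  }

  // Step 2: replace the address by its VALUE.  Only valid after step 1.
  int subst_to_value(const std::string &addr) const {
    auto it = values.find(addr);
    assert(it != values.end() && "address must be looked up before substitution");
    return it->second;
  }
};

// Mirrors the shape of the follow-up change for a MEM inside a debug insn:
// perform step 1 only, so that a later substitution still finds the VALUE.
void analyze_debug_mem(ToyCselib &c, const std::string &addr) {
  c.lookup(addr, /*create=*/true);
}
```

In this model, dropping the `lookup` call for debug insns makes any later `subst_to_value` on the same address trip the assertion, which corresponds to the breakage described above.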

I've spent a day debugging that and found the problem is that, as documented
in a large comment in cselib.cc above the n_useless_values variable
definition, we spend quite a bit of effort on making sure that VALUEs created
on DEBUG_INSNs don't affect the cselib decisions for non-DEBUG_INSNs such as
pruning of useless values etc., but if a VALUE created that way is then
looked up/needed from non-DEBUG_INSNs, we promote it to non-debug.

The reason for -fcompare-debug failure is that there is one large DEBUG_INSN
with 16 MEMs in it mostly with addresses that so far didn't appear in the IL
otherwise.  Later on, we see an instruction storing into a MEM destination
and invalidate that MEM.  Unfortunately, there is a bug caused by the
introduction of SP_DERIVED_VALUE_P where alias.cc isn't able to disambiguate
MEMs with sp + optional offset in address vs. MEMs with address being a
VALUE having SP_DERIVED_VALUE_P + constant (or the SP_DERIVED_VALUE_P
itself), which ought to be possible when REG_VALUES (REGNO
(stack_pointer_rtx)) has SP_DERIVED_VALUE_P + constant location.  Not sure
if I should try to fix that in stage4 or defer for stage1.
Anyway, the cselib_invalidate_mem call because of this invalidates basically
all MEMs with the exception of 5 which have MEM_EXPRs that guarantee
non-aliasing with the sp based store.
Unfortunately, n_useless_values, which in my understanding should always be
the same between -g and -g0 compilations, diverges: it has 3 more useless
values with -g.

Now, these were initially VALUEs created for DEBUG_INSN lookups.  As I said,
cselib.cc has code to promote such VALUEs (well, their location elements) to
non-debug if they are looked up from non-DEBUG_INSNs.  The problem is that
when looking up some completely unrelated MEM from a non-DEBUG_INSN we run
into a hash collision and so call cselib_hasher::equal to check if the
unrelated MEM is equal to the one from the DEBUG_INSN-only element.  The
equal static member function calls rtx_equal_for_cselib_1 and if that
returns true, promotes the location to non-DEBUG, otherwise returns false.
So far so good.  But rtx_equal_for_cselib_1 internally performs various
other cselib lookups, all done with the non-DEBUG_INSN cselib_current_insn,
so they all promote to non-debug.  And that is wrong, because if it were a
-g0 compilation, such a hashtable entry wouldn't be there at all (or would
be but wouldn't contain that locs element), so with -g0 we wouldn't call
that rtx_equal_for_cselib_1 at all.  So, I think we need to pretend
that such a lookup, which only happens with -g and not -g0, actually comes
from some DEBUG_INSN (note, the lookups rtx_equal_for_cselib_1 does
are always with create = 0).
The cselib.cc part of the patch does that.
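
The promotion side effect described above can be sketched with a small standalone model.  This is only an illustration under simplifying assumptions, not the cselib implementation: `ToyTable`, `touch`, `compare`, `equal_unpatched` and `equal_patched` are hypothetical names, a boolean stands in for cselib_current_insn, and the nested lookups done by rtx_equal_for_cselib_1 are passed in as a list of indices.

```cpp
#include <cassert>
#include <vector>

// One location element; debug_only means it was created from a DEBUG_INSN
// and has not been promoted yet.  Hypothetical model, not cselib's types.
struct ToyLoc {
  bool debug_only;
};

struct ToyTable {
  std::vector<ToyLoc> locs;
  bool current_insn_is_debug = false;  // stands in for cselib_current_insn

  explicit ToyTable(int n) : locs(n, ToyLoc{true}) {}

  // A lookup made on behalf of a non-debug insn promotes the loc it touches.
  void touch(int i) {
    if (!current_insn_is_debug)
      locs[i].debug_only = false;
  }

  // Stand-in for rtx_equal_for_cselib_1: performs the nested lookups listed
  // in `nested`, then reports the given comparison result.
  bool compare(const std::vector<int> &nested, bool result) {
    for (int n : nested)
      touch(n);
    return result;
  }

  // Before the fix: nested lookups run as the non-debug insn, so they
  // promote locs even when the comparison fails.
  bool equal_unpatched(int i, const std::vector<int> &nested, bool result) {
    if (!compare(nested, result))
      return false;
    locs[i].debug_only = false;  // promote_debug_loc
    return true;
  }

  // After the fix: while comparing against a debug-only loc, pretend the
  // current insn is a debug insn, and promote only on a successful match.
  bool equal_patched(int i, const std::vector<int> &nested, bool result) {
    if (locs[i].debug_only && !current_insn_is_debug) {
      bool saved = current_insn_is_debug;
      current_insn_is_debug = true;
      bool match = compare(nested, result);
      current_insn_is_debug = saved;
      if (match)
        locs[i].debug_only = false;  // promote_debug_loc
      return match;
    }
    return compare(nested, result);
  }

  int count_debug_only() const {
    int n = 0;
    for (const ToyLoc &l : locs)
      n += l.debug_only ? 1 : 0;
    return n;
  }
};
```

With three debug-only locs and a failed comparison against loc 0 that touches locs 1 and 2, the unpatched variant leaves one debug-only loc while the patched variant leaves all three untouched, which is the state a -g0 compilation would be equivalent to.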

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

BTW, I'm not really sure how:
          if (num_mems < param_max_cselib_memory_locations
              && ! canon_anti_dependence (x, false, mem_rtx,
                                          GET_MODE (mem_rtx), mem_addr))
            {
              has_mem = true;
              num_mems++;
              p = &(*p)->next;
              continue;
            }
the num_mems cap can actually work correctly for -fcompare-debug.
I'd think we would need to differentiate between num_debug_mems and
num_mems depending on whether setting_insn is a non-NULL DEBUG_INSN or not.
That was one of my suspicions on this testcase, but the number of MEMs
was small enough for the param in either case (especially because of
the above mentioned missed non-aliasings).  But as implemented, I think
if we have tons of non-aliased MEMs from DEBUG_INSN setting_insns,
we could unchain lots more non-DEBUG MEMs with -g than with -g0.
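
The concern about the cap can be made concrete with a toy count.  The sketch below is one possible reading of the num_debug_mems idea, not a proposed patch: `ToyMem`, `kept_nondebug_shared` and `kept_nondebug_split` are hypothetical names, every MEM is assumed to be non-aliased, and `cap` stands in for param_max_cselib_memory_locations.

```cpp
#include <vector>

// A MEM location whose setting_insn either is or is not a DEBUG_INSN.
// Hypothetical model; every MEM is assumed not to alias the store.
struct ToyMem {
  bool from_debug_insn;
};

// Shared counter, as in the quoted loop: MEMs from debug insns consume the
// same budget as the others.  Returns how many non-debug MEMs are kept.
int kept_nondebug_shared(const std::vector<ToyMem> &mems, int cap) {
  int num_mems = 0;
  int kept = 0;
  for (const ToyMem &m : mems)
    if (num_mems < cap) {
      ++num_mems;
      if (!m.from_debug_insn)
        ++kept;
    }
  return kept;
}

// Split counters: debug MEMs are budgeted separately, so the number of
// non-debug MEMs kept does not depend on whether debug insns are present.
int kept_nondebug_split(const std::vector<ToyMem> &mems, int cap) {
  int num_mems = 0;
  int num_debug_mems = 0;
  int kept = 0;
  for (const ToyMem &m : mems) {
    int &counter = m.from_debug_insn ? num_debug_mems : num_mems;
    if (counter < cap) {
      ++counter;
      if (!m.from_debug_insn)
        ++kept;
    }
  }
  return kept;
}
```

With a cap of 2, a list of two debug MEMs followed by two non-debug MEMs keeps no non-debug MEM under the shared counter, while the same list without the debug MEMs keeps both; the split variant keeps both in either case.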

2023-01-27  Jakub Jelinek  <jakub@redhat.com>

	PR debug/106746
	PR rtl-optimization/108463
	* cselib.cc (cselib_current_insn): Move declaration earlier.
	(cselib_hasher::equal): For debug only locs, temporarily override
	cselib_current_insn to their l->setting_insn for the
	rtx_equal_for_cselib_1 call, so that unsuccessful comparisons don't
	promote some debug locs.
	* sched-deps.cc (sched_analyze_2) <case MEM>: For MEMs in DEBUG_INSNs
	when using cselib call cselib_lookup_from_insn on the address but
	don't substitute it.

	* gcc.dg/pr108463.c: New test.

--- gcc/cselib.cc.jj	2023-01-27 19:49:45.925097052 +0100
+++ gcc/cselib.cc	2023-01-27 19:59:11.222824051 +0100
@@ -80,6 +80,10 @@ struct expand_value_data
 
 static rtx cselib_expand_value_rtx_1 (rtx, struct expand_value_data *, int);
 
+/* This is a global so we don't have to pass this through every function.
+   It is used in new_elt_loc_list to set SETTING_INSN.  */
+static rtx_insn *cselib_current_insn;
+
 /* There are three ways in which cselib can look up an rtx:
    - for a REG, the reg_values table (which is indexed by regno) is used
    - for a MEM, we recursively look up its address and then follow the
@@ -143,11 +147,25 @@ cselib_hasher::equal (const cselib_val *
   /* We don't guarantee that distinct rtx's have different hash values,
      so we need to do a comparison.  */
   for (l = v->locs; l; l = l->next)
-    if (rtx_equal_for_cselib_1 (l->loc, x, memmode, 0))
+    if (l->setting_insn && DEBUG_INSN_P (l->setting_insn)
+	&& (!cselib_current_insn || !DEBUG_INSN_P (cselib_current_insn)))
       {
-	promote_debug_loc (l);
-	return true;
+	rtx_insn *save_cselib_current_insn = cselib_current_insn;
+	/* If l is so far a debug only loc, without debug stmts it
+	   would never be compared to x at all, so temporarily pretend
+	   current instruction is that DEBUG_INSN so that we don't
+	   promote other debug locs even for unsuccessful comparison.  */
+	cselib_current_insn = l->setting_insn;
+	bool match = rtx_equal_for_cselib_1 (l->loc, x, memmode, 0);
+	cselib_current_insn = save_cselib_current_insn;
+	if (match)
+	  {
+	    promote_debug_loc (l);
+	    return true;
+	  }
       }
+    else if (rtx_equal_for_cselib_1 (l->loc, x, memmode, 0))
+      return true;
 
   return false;
 }
@@ -158,10 +176,6 @@ static hash_table<cselib_hasher> *cselib
 /* A table to hold preserved values.  */
 static hash_table<cselib_hasher> *cselib_preserved_hash_table;
 
-/* This is a global so we don't have to pass this through every function.
-   It is used in new_elt_loc_list to set SETTING_INSN.  */
-static rtx_insn *cselib_current_insn;
-
 /* The unique id that the next create value will take.  */
 static unsigned int next_uid;
 
--- gcc/sched-deps.cc.jj	2023-01-19 09:58:50.971227752 +0100
+++ gcc/sched-deps.cc	2023-01-27 19:49:58.736909549 +0100
@@ -2605,7 +2605,14 @@ sched_analyze_2 (class deps_desc *deps,
 
     case MEM:
       {
-	if (!DEBUG_INSN_P (insn))
+	if (DEBUG_INSN_P (insn) && sched_deps_info->use_cselib)
+	  {
+	    machine_mode address_mode = get_address_mode (x);
+
+	    cselib_lookup_from_insn (XEXP (x, 0), address_mode, 1,
+				     GET_MODE (x), insn);
+	  }
+	else if (!DEBUG_INSN_P (insn))
 	  {
 	    /* Reading memory.  */
 	    rtx_insn_list *u;
--- gcc/testsuite/gcc.dg/pr108463.c.jj	2023-01-27 20:02:22.420025922 +0100
+++ gcc/testsuite/gcc.dg/pr108463.c	2023-01-27 20:01:58.051382532 +0100
@@ -0,0 +1,13 @@
+/* PR rtl-optimization/108463 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsched2-use-superblocks -fcompare-debug -Wno-psabi" } */
+
+typedef int __attribute__((__vector_size__ (32))) V;
+int a;
+
+void
+foo (V v)
+{
+  a--;
+  v = (V) v;
+}


	Jakub



* Re: [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463]
  2023-01-27 23:18 ` [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463] Jakub Jelinek
@ 2023-02-02 11:16   ` Alexandre Oliva
  2023-02-02 12:02     ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Alexandre Oliva @ 2023-02-02 11:16 UTC
  To: Jakub Jelinek; +Cc: Richard Biener, Jeff Law, gcc-patches

On Jan 27, 2023, Jakub Jelinek <jakub@redhat.com> wrote:

> Now, 1) is precondition of 2), we can only subst the VALUEs if we
> have actually looked the address up, but as can be seen on that testcase,
> we are relying on at least the 1) to be done because we subst the values
> later on even on DEBUG_INSNs and actually use those when needed.

Ugh.  That definitely rings a bell, now that you mention it.  I wish I
had recalled that when I saw the "obvious" opportunity for optimization
:-/

> So, I (as done in the patch below) reinstalled the 1) and not 2) for
> DEBUG_INSNs.

Thanks!

> I've spent a day debugging that and found the problem is that as documented
> in a large comment in cselib.cc above n_useless_values variable definition,
> we spend quite a bit of effort on making sure that VALUEs created on
> DEBUG_INSNs don't affect the cselib decisions for non-DEBUG_INSNs such as
> pruning of useless values etc., but if a VALUE created that way is then
> looked up/needed from non-DEBUG_INSNs, we promote it to non-debug.

*nod*

> The reason for -fcompare-debug failure is that there is one large DEBUG_INSN
> with 16 MEMs in it mostly with addresses that so far didn't appear in the IL
> otherwise.  Later on, we see an instruction storing into MEM destination
> and invalidate that MEM.

Aha!

> Unfortunately, n_useless_values which in my understanding should be always
> the same between -g and -g0 compilations diverges, has 3 more useless values
> for -g.

Yeah, that's not good.

> Now, these were initially VALUEs created for DEBUG_INSN lookups.  As I said,
> cselib.cc has code to promote such VALUEs (well, their location elements) to
> non-debug if they are looked up from non-DEBUG_INSNs.  The problem is that
> when looking some completely unrelated MEM from a non-DEBUG_INSN we run into
> a hash collision and so call cselib_hasher::equal to check if the unrelated
> MEM is equal to the one from DEBUG_INSN only element.  The equal static
> member function calls rtx_equal_for_cselib_1 and if that returns true,
> promotes the location to non-DEBUG, otherwise returns false.  So far so
> good.  But rtx_equal_for_cselib_1 internally performs various other cselib
> lookups, all done with the non-DEBUG_INSN cselib_current_insn, so they
> all promote to non-debug.

Good catch!

> So, I think we need to pretend
> that such lookup which only happens with -g and not -g0 actually comes
> from some DEBUG_INSN (note, the lookups rtx_equal_for_cselib_1 does
> are always with create = 0).
> The cselib.cc part of the patch does that.

Agreed, that makes sense to me, thanks!

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

FWIW, I'd approve it if I had the authority to do so :-)


> I'd think we would need to differentiate between num_debug_mems and
> num_mems depending on if setting_insn is non-NULL DEBUG_INSN or not.

*nod*, I concur.

Thanks!

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>


* Re: [PATCH] sched-deps, cselib: Fix up some -fcompare-debug issues and regressions [PR108463]
  2023-02-02 11:16   ` Alexandre Oliva
@ 2023-02-02 12:02     ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-02-02 12:02 UTC
  To: Alexandre Oliva; +Cc: Jakub Jelinek, Jeff Law, gcc-patches

On Thu, 2 Feb 2023, Alexandre Oliva wrote:

> On Jan 27, 2023, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> > Now, 1) is precondition of 2), we can only subst the VALUEs if we
> > have actually looked the address up, but as can be seen on that testcase,
> > we are relying on at least the 1) to be done because we subst the values
> > later on even on DEBUG_INSNs and actually use those when needed.
> 
> Ugh.  That definitely rings a bell, now that you mention it.  I wish I
> had recalled that when I saw the "obvious" opportunity for optimization
> :-/
> 
> > So, I (as done in the patch below) reinstalled the 1) and not 2) for
> > DEBUG_INSNs.
> 
> Thanks!
> 
> > I've spent a day debugging that and found the problem is that as documented
> > in a large comment in cselib.cc above n_useless_values variable definition,
> > we spend quite a bit of effort on making sure that VALUEs created on
> > DEBUG_INSNs don't affect the cselib decisions for non-DEBUG_INSNs such as
> > pruning of useless values etc., but if a VALUE created that way is then
> > looked up/needed from non-DEBUG_INSNs, we promote it to non-debug.
> 
> *nod*
> 
> > The reason for -fcompare-debug failure is that there is one large DEBUG_INSN
> > with 16 MEMs in it mostly with addresses that so far didn't appear in the IL
> > otherwise.  Later on, we see an instruction storing into MEM destination
> > and invalidate that MEM.
> 
> Aha!
> 
> > Unfortunately, n_useless_values which in my understanding should be always
> > the same between -g and -g0 compilations diverges, has 3 more useless values
> > for -g.
> 
> Yeah, that's not good.
> 
> > Now, these were initially VALUEs created for DEBUG_INSN lookups.  As I said,
> > cselib.cc has code to promote such VALUEs (well, their location elements) to
> > non-debug if they are looked up from non-DEBUG_INSNs.  The problem is that
> > when looking some completely unrelated MEM from a non-DEBUG_INSN we run into
> > a hash collision and so call cselib_hasher::equal to check if the unrelated
> > MEM is equal to the one from DEBUG_INSN only element.  The equal static
> > member function calls rtx_equal_for_cselib_1 and if that returns true,
> > promotes the location to non-DEBUG, otherwise returns false.  So far so
> > good.  But rtx_equal_for_cselib_1 internally performs various other cselib
> > lookups, all done with the non-DEBUG_INSN cselib_current_insn, so they
> > all promote to non-debug.
> 
> Good catch!
> 
> > So, I think we need to pretend
> > that such lookup which only happens with -g and not -g0 actually comes
> > from some DEBUG_INSN (note, the lookups rtx_equal_for_cselib_1 does
> > are always with create = 0).
> > The cselib.cc part of the patch does that.
> 
> Agreed, that makes sense to me, thanks!
> 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> FWIW, I'd approve it if I had the authority to do so :-)

OK.

Thanks,
Richard.
