public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/108696] New: querying relations is slow
@ 2023-02-07 13:00 rguenth at gcc dot gnu.org
  2023-02-08 17:16 ` [Bug tree-optimization/108696] " amacleod at redhat dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-07 13:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

            Bug ID: 108696
           Summary: querying relations is slow
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

With the compiler.i testcase from PR26854 I notice we do quite a lot of bitmap
intersections via dom_oracle::query_relation for the case (which hits 99% of
the time) when a SSA name doesn't have any relation but equiv_set () returns a
newly allocated bitmap with just the self-equivalence recorded.

The self-equivalence case isn't used to prune the work done by query_relation
it seems.  Maybe find_relation_dom does that via

  // IF either name does not occur in a relation anywhere, there isnt one.
  if (!bitmap_bit_p (m_relation_set, v1) || !bitmap_bit_p (m_relation_set, v2))
    return VREL_VARYING;

but we still get all the way to the query_relation overload which eventually
does the intersection with m_relation_set which is a lot more expensive than
this.

A naiive

diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc
index f5b1e67e420..f58d5050bdb 100644
--- a/gcc/value-relation.cc
+++ b/gcc/value-relation.cc
@@ -1373,6 +1373,10 @@ dom_oracle::query_relation (basic_block bb, tree ssa1,
tree ssa2)
   unsigned v2 = SSA_NAME_VERSION (ssa2);
   if (v1 == v2)
     return VREL_EQ;
+ 
+  // IF either name does not occur in a relation anywhere, there isnt one.
+  if (!bitmap_bit_p (m_relation_set, v1) || !bitmap_bit_p (m_relation_set,
v2))
+    return VREL_VARYING;

   // Check for equivalence first.  They must be in each equivalency set.
   const_bitmap equiv1 = equiv_set (ssa1, bb);

cuts

 dominator optimization             :  17.35 ( 15%) 

down to

 dominator optimization             :  16.17 ( 14%)

note almost all of the DOM time is spent in ranger for this testcase.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/108696] querying relations is slow
  2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
@ 2023-02-08 17:16 ` amacleod at redhat dot com
  2023-02-08 17:50 ` amacleod at redhat dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: amacleod at redhat dot com @ 2023-02-08 17:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

--- Comment #1 from Andrew Macleod <amacleod at redhat dot com> ---
I think that change will end up missing equivalences and transitives.  The
m_relation_set bitmap is only used to track non-equivalence relations
registered with the oracle. Equivalences are handled separately in the
equivalence oracle.

The speedup you are seeing is likely because it is bailing on checking for
equivalences and transitive relations through equivalences.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/108696] querying relations is slow
  2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
  2023-02-08 17:16 ` [Bug tree-optimization/108696] " amacleod at redhat dot com
@ 2023-02-08 17:50 ` amacleod at redhat dot com
  2023-02-08 18:01 ` rguenther at suse dot de
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: amacleod at redhat dot com @ 2023-02-08 17:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

--- Comment #2 from Andrew Macleod <amacleod at redhat dot com> ---
Created attachment 54437
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54437&action=edit
possible patch

This patch should successfully short circuit unnecessary checks. untested in
compiler.i

Where did you get a 17% time in DOM?  when I run compiler.i I get
dominator optimization             :  38.28 (  2%) 
where most of the time is in
machine dep reorg                  :1447.64 ( 60%) 

I'll check this patch for correctness and to see if it generally makes any time
improvements that are measurable elsewhere.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/108696] querying relations is slow
  2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
  2023-02-08 17:16 ` [Bug tree-optimization/108696] " amacleod at redhat dot com
  2023-02-08 17:50 ` amacleod at redhat dot com
@ 2023-02-08 18:01 ` rguenther at suse dot de
  2023-02-10  8:38 ` rguenth at gcc dot gnu.org
  2023-02-10 14:53 ` amacleod at redhat dot com
  4 siblings, 0 replies; 6+ messages in thread
From: rguenther at suse dot de @ 2023-02-08 18:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
> Am 08.02.2023 um 18:50 schrieb amacleod at redhat dot com <gcc-bugzilla@gcc.gnu.org>:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696
> 
> --- Comment #2 from Andrew Macleod <amacleod at redhat dot com> ---
> Created attachment 54437
>  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54437&action=edit
> possible patch
> 
> This patch should successfully short circuit unnecessary checks. untested in
> compiler.i
> 
> Where did you get a 17% time in DOM?

When using -O1 as recommended for this kind of testcase.

>  when I run compiler.i I get
> dominator optimization             :  38.28 (  2%) 
> where most of the time is in
> machine dep reorg                  :1447.64 ( 60%) 

Ah. Another thing to investigate…

> 
> I'll check this patch for correctness and to see if it generally makes any time
> improvements that are measurable elsewhere.
> 
> -- 
> You are receiving this mail because:
> You reported the bug.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/108696] querying relations is slow
  2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-02-08 18:01 ` rguenther at suse dot de
@ 2023-02-10  8:38 ` rguenth at gcc dot gnu.org
  2023-02-10 14:53 ` amacleod at redhat dot com
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-10  8:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #2)
> Created attachment 54437 [details]
> possible patch
> 
> This patch should successfully short circuit unnecessary checks. untested in
> compiler.i
> 
> Where did you get a 17% time in DOM?  when I run compiler.i I get
> dominator optimization             :  38.28 (  2%) 
> where most of the time is in
> machine dep reorg                  :1447.64 ( 60%) 
> 
> I'll check this patch for correctness and to see if it generally makes any
> time improvements that are measurable elsewhere.

It helps the callgrind profile but doesn't show any effect on the -ftime-report
or a perf profile.  For the testcase the bitmap operations in ranger are
definitely visible in a perf profile (with call traces), and with -O1 it's
mostly DOM and jump-threading that perform any ranger ops when diagnostics are
disabled.

That said, not allocating the self-relation bitmaps at query time is
definitely good (not 100% sure if the patch achieves that).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug tree-optimization/108696] querying relations is slow
  2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-02-10  8:38 ` rguenth at gcc dot gnu.org
@ 2023-02-10 14:53 ` amacleod at redhat dot com
  4 siblings, 0 replies; 6+ messages in thread
From: amacleod at redhat dot com @ 2023-02-10 14:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108696

--- Comment #5 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Richard Biener from comment #4)

> 
> That said, not allocating the self-relation bitmaps at query time is
> definitely good (not 100% sure if the patch achieves that).

 If it can determine ahead of time that it isn't needed, then the self maps do
not get allocated. If it actually has to do a query, then they are allocated.

Im still looking around at this... It was simpler to always do bitmap vs bitmap
comparison rather than the full set of combinations.  Figured I would revisit
it if it was a performance concern.  I will experiment with what happens if we
expand the API, at least internally, to do ssa vs ssa, bitmap vs bitmap, and
ssa vs bitmap checks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-02-10 14:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-07 13:00 [Bug tree-optimization/108696] New: querying relations is slow rguenth at gcc dot gnu.org
2023-02-08 17:16 ` [Bug tree-optimization/108696] " amacleod at redhat dot com
2023-02-08 17:50 ` amacleod at redhat dot com
2023-02-08 18:01 ` rguenther at suse dot de
2023-02-10  8:38 ` rguenth at gcc dot gnu.org
2023-02-10 14:53 ` amacleod at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).