public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree
@ 2020-12-23 10:42 mscfd at gmx dot net
  2020-12-28 10:29 ` [Bug fortran/98426] " mscfd at gmx dot net
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: mscfd at gmx dot net @ 2020-12-23 10:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426

            Bug ID: 98426
           Summary: find_symbol in module.c traverses O(N) part of a
                    search tree
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mscfd at gmx dot net
  Target Milestone: ---

Created attachment 49837
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49837&action=edit
patch

Compilation times of top level modules in a large project I am working on are 
abysmal (up to 30 seconds for modules even with little code in it). Those
modules are using other modules whose mod files are up to 500kB. As far as I
can  see the mod files contain all [including private] symbols from any module
in their use-chain.

I used -pg compiler option to compile gcc itself and ran f951 and gprof with
some modules. gprof showed that about 80% or more of the compilation time is
spent in routine "find_symbol" in module.c.

>From the documentation I can see that argument "gfc_symtree *st" of
"find_symbol" should be a balanced binary search tree, ordered by
gfc_symtree->name. It looks like the tree can have different nodes with the
same name, so traversing the tree to check all nodes of the given name might
require to descend both the left and right subtree.

However, "find_symbol" traverses far more nodes than necessary. (Assuming that
all symbol names are unique, it traverses about half of the tree on average,
making the lookup operation O(N) instead of O(log N).)

The attached patch descends only if it can expect a match in the subtree (and
no match was found so far). This brings down the execution time of find_symbol
to almost nothing, speeding up the compilation time by a factor of about 5-10
for my top level modules (makes a difference if a module compiles in 3 instead
of 30 seconds)!


The patch regtests, and also compiles the large project with its big symbol
trees flawlessly and much faster.

Are there any testcases which check that find_symbol works as expected? If I
change find_symbol to always return NULL, then
make -k check-fortran gives
# of expected passes            55911
# of unexpected failures        1
# of expected failures          232
# of unsupported tests          81
with just one unexpected failure I could not locate.

In particular if I compile check use_only_1.f90 by hand (added by PR33541,
which added find_symbol), everything seems to work without a proper return
value from find_symbol.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-04-27  2:34 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
2020-12-28 10:29 ` [Bug fortran/98426] " mscfd at gmx dot net
2020-12-28 10:36 ` mscfd at gmx dot net
2020-12-29 11:15 ` mscfd at gmx dot net
2021-10-29 18:36 ` aldot at gcc dot gnu.org
2024-04-23 20:40 ` matthew.thompson at nasa dot gov
2024-04-23 23:26 ` jvdelisle at gcc dot gnu.org
2024-04-24  2:59 ` jvdelisle at gcc dot gnu.org
2024-04-24  3:05 ` jvdelisle at gcc dot gnu.org
2024-04-25 11:20 ` matthew.thompson at nasa dot gov
2024-04-25 16:38 ` jvdelisle at gcc dot gnu.org
2024-04-26  3:12 ` jvdelisle at gcc dot gnu.org
2024-04-26 12:43 ` matthew.thompson at nasa dot gov
2024-04-26 15:53 ` matthew.thompson at nasa dot gov
2024-04-26 15:54 ` matthew.thompson at nasa dot gov
2024-04-27  2:34 ` jvdelisle at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).