* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
@ 2020-12-28 10:29 ` mscfd at gmx dot net
2020-12-28 10:36 ` mscfd at gmx dot net
` (13 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: mscfd at gmx dot net @ 2020-12-28 10:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #1 from martin <mscfd at gmx dot net> ---
Created attachment 49846
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49846&action=edit
corrected patch
Comparison with c was wrong.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
2020-12-28 10:29 ` [Bug fortran/98426] " mscfd at gmx dot net
@ 2020-12-28 10:36 ` mscfd at gmx dot net
2020-12-29 11:15 ` mscfd at gmx dot net
` (12 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: mscfd at gmx dot net @ 2020-12-28 10:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #2 from martin <mscfd at gmx dot net> ---
I further tried to find out what the call to find_symbol (this is the call
which consumed the compilation time) is achieving in read_modules(). Even with
the accidentially wrong patch everything just seems to work (and I have lots of
generic interfaces and submodules etc., for which there are checks in
read_modules following the call to find_symbol).
With some printf's I at least checked the the corrected patch definitely finds
the symbols it is looking for. But I cannot come up with some testcase showing
that find_symbol returns the expected result in read_modules().
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
2020-12-28 10:29 ` [Bug fortran/98426] " mscfd at gmx dot net
2020-12-28 10:36 ` mscfd at gmx dot net
@ 2020-12-29 11:15 ` mscfd at gmx dot net
2021-10-29 18:36 ` aldot at gcc dot gnu.org
` (11 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: mscfd at gmx dot net @ 2020-12-29 11:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #3 from martin <mscfd at gmx dot net> ---
Sorry for the noise, but as this gives big reductions in compilation time it is
quite important to me (and probably other big module based projects).
I just realised that I mixed up gfc_symtree->name and gfc_symtree->n.sym->name.
The reason why find_symbol traverses the whole tree in the worst case is that
it is ordered by gfc_symtree->name but it looks for gfc_symtree->n.sym->name.
So far I got the impression that either gfc_symtree->name is a unique name
("@xxx") or equal to gfc_symtree->n.sym->name (if n.sym!=NULL). In this case,
the following patch should do the trick and traverse only log(N) of the tree:
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 4c6ff22d5c1..8dc59e25d46 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -4659,10 +4659,20 @@ find_symbol (gfc_symtree *st, const char *name,
return st;
}
+ c = strcmp (name, st->name);
+ if (c < 0)
+ retval = find_symbol (st->left, name, module, generic);
+ else if (c > 0)
+ retval = find_symbol (st->right, name, module, generic);
+ else
+ retval = NULL;
+
+/* original traverse
retval = find_symbol (st->left, name, module, generic);
if (retval == NULL)
retval = find_symbol (st->right, name, module, generic);
+*/
return retval;
}
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (2 preceding siblings ...)
2020-12-29 11:15 ` mscfd at gmx dot net
@ 2021-10-29 18:36 ` aldot at gcc dot gnu.org
2024-04-23 20:40 ` matthew.thompson at nasa dot gov
` (10 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: aldot at gcc dot gnu.org @ 2021-10-29 18:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
Bernhard Reutner-Fischer <aldot at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |aldot at gcc dot gnu.org
--- Comment #4 from Bernhard Reutner-Fischer <aldot at gcc dot gnu.org> ---
I'm thinking about switching all these symbol lookups/symtree traversals (i.e.
the whole fortran frontend) to pointer comparison which should _greatly_
speedup any symbol lookup, fyi.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (3 preceding siblings ...)
2021-10-29 18:36 ` aldot at gcc dot gnu.org
@ 2024-04-23 20:40 ` matthew.thompson at nasa dot gov
2024-04-23 23:26 ` jvdelisle at gcc dot gnu.org
` (9 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: matthew.thompson at nasa dot gov @ 2024-04-23 20:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
Matt Thompson <matthew.thompson at nasa dot gov> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |matthew.thompson at nasa dot gov
--- Comment #5 from Matt Thompson <matthew.thompson at nasa dot gov> ---
I just wanted to come on here and say this is still an issue in GCC 13.2 (on
macOS). I did a profile of my code today and saw that this 56-line file of just
uses:
https://github.com/GEOS-ESM/MAPL/blob/main/base/Base.F90
is our second longest file to compile, see:
https://github.com/GEOS-ESM/MAPL/wiki/Profiling-MAPL-Builds#checking-the-build-times-on-cli
taking longer than a 5600 line (!) file of actual exciting Fortran:
https://github.com/GEOS-ESM/MAPL/blob/main/gridcomps/History/MAPL_HistoryGridComp.F90
Is there anyway I can help move this bug to "CONFIRMED"?
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (4 preceding siblings ...)
2024-04-23 20:40 ` matthew.thompson at nasa dot gov
@ 2024-04-23 23:26 ` jvdelisle at gcc dot gnu.org
2024-04-24 2:59 ` jvdelisle at gcc dot gnu.org
` (8 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-23 23:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jvdelisle at gcc dot gnu.org
--- Comment #6 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
HI Matt, I can try your patch and if all tests clean I will request for it to
be approved and I can push it. We have all been rather busy and short handed.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (5 preceding siblings ...)
2024-04-23 23:26 ` jvdelisle at gcc dot gnu.org
@ 2024-04-24 2:59 ` jvdelisle at gcc dot gnu.org
2024-04-24 3:05 ` jvdelisle at gcc dot gnu.org
` (7 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-24 2:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2024-04-24
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
--- Comment #7 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
With the proposed patch, the following test case fails.
! { dg-do compile }
! { dg-options "-fsecond-underscore" }
! PR fortran/95689 - ICE in check_sym_interfaces, at fortran/interface.c:2015
module m2345678901234567890123456789012345678901234567890123456789_123
type t2345678901234567890123456789012345678901234567890123456789_123
end type
interface
module subroutine
s2345678901234567890123456789012345678901234567890123456789_123 &
(x2345678901234567890123456789012345678901234567890123456789_123)
end
end interface
end
submodule(m2345678901234567890123456789012345678901234567890123456789_123) &
t2345678901234567890123456789012345678901234567890123456789_123
end
$ gfc -c -fsecond-underscore pr95689.f90
pr95689.f90:14:74:
14 |
submodule(m2345678901234567890123456789012345678901234567890123456789_123) &
|
1
Error: Name ‘t2345678901234567890123456789012345678901234567890123456789_123’
at (1) is an ambiguous reference to
‘m2345678901234567890123456789012345678901234567890123456789_123.t2345678901234567890123456789012345678901234567890123456789_123’
from current program unit
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (6 preceding siblings ...)
2024-04-24 2:59 ` jvdelisle at gcc dot gnu.org
@ 2024-04-24 3:05 ` jvdelisle at gcc dot gnu.org
2024-04-25 11:20 ` matthew.thompson at nasa dot gov
` (6 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-24 3:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #8 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
Martin or Matt,
Can you test the following variation to see if you get better results.
return st;
}
retval = NULL;
if (c <= 0)
retval = find_symbol (st->left, name, module, generic);
if (c > 0 && retval == NULL)
retval = find_symbol (st->right, name, module, generic);
if (c > 0 && retval == NULL)
retval = find_symbol (st->left, name, module, generic);
if (c <= 0 && retval == NULL)
retval = find_symbol (st->right, name, module, generic);
return retval;
This does pass regression testing but I do not think it guarantees better
results. Apparently the value of c does not guarantee a find going left or
right.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (7 preceding siblings ...)
2024-04-24 3:05 ` jvdelisle at gcc dot gnu.org
@ 2024-04-25 11:20 ` matthew.thompson at nasa dot gov
2024-04-25 16:38 ` jvdelisle at gcc dot gnu.org
` (5 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: matthew.thompson at nasa dot gov @ 2024-04-25 11:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #9 from Matt Thompson <matthew.thompson at nasa dot gov> ---
Jerry,
I tried your patch, but it didn't seem to help my reproducer.
Stock GCC13:
Number of Modules | Build Time
----------------- | ----------
10 | 0.336674
20 | 2.34525
30 | 7.37042
40 | 17.2896
50 | 33.9653
GCC13 with Jerry Patch:
Number of Modules | Build Time
----------------- | ----------
10 | 0.378347
20 | 2.51914
30 | 8.10597
40 | 18.982
50 | 37.3188
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (8 preceding siblings ...)
2024-04-25 11:20 ` matthew.thompson at nasa dot gov
@ 2024-04-25 16:38 ` jvdelisle at gcc dot gnu.org
2024-04-26 3:12 ` jvdelisle at gcc dot gnu.org
` (4 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-25 16:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
Jerry DeLisle <jvdelisle at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jvdelisle at gcc dot gnu.org
--- Comment #10 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
Well the experiment did not work. I think we need to do something a bit more
complex. I will asign to myself so it does fall into a crack and get back here
as we go. I will consult with others on ideas.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (9 preceding siblings ...)
2024-04-25 16:38 ` jvdelisle at gcc dot gnu.org
@ 2024-04-26 3:12 ` jvdelisle at gcc dot gnu.org
2024-04-26 12:43 ` matthew.thompson at nasa dot gov
` (3 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-26 3:12 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #11 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
I am able to run your reproducer and I can see the increasing times as the
number of modules goes up. I am curious if you could randomize the subroutine
names? These appear fairly repetitive and I wonder if this biases the test.
My results:
Build time for base.F90 in Modules_100: 0.446443 seconds
Number of Modules | Build Time
----------------- | ----------
10 | 0.0155371
20 | 0.024554
30 | 0.041164
40 | 0.0620602
50 | 0.092014
60 | 0.135193
70 | 0.184979
80 | 0.255272
90 | 0.335244
100 | 0.446443
real 0m27.194s
user 2m27.698s
sys 0m21.231s
The build time presented appears to not be wall time so I wonder what this is.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (10 preceding siblings ...)
2024-04-26 3:12 ` jvdelisle at gcc dot gnu.org
@ 2024-04-26 12:43 ` matthew.thompson at nasa dot gov
2024-04-26 15:53 ` matthew.thompson at nasa dot gov
` (2 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: matthew.thompson at nasa dot gov @ 2024-04-26 12:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #12 from Matt Thompson <matthew.thompson at nasa dot gov> ---
Jerry,
Actually, I took a look at my reproducer and it's not quite what I was wanting
(I made a mistake in the Jinja templates). I'm going to work on it now to fix
this up. And I'll look at adding the random name option as well. Back soon...
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (11 preceding siblings ...)
2024-04-26 12:43 ` matthew.thompson at nasa dot gov
@ 2024-04-26 15:53 ` matthew.thompson at nasa dot gov
2024-04-26 15:54 ` matthew.thompson at nasa dot gov
2024-04-27 2:34 ` jvdelisle at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: matthew.thompson at nasa dot gov @ 2024-04-26 15:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #13 from Matt Thompson <matthew.thompson at nasa dot gov> ---
Okay I have a new reproducer that I'll attach here. It uses the random names.
I see the same behavior:
IFX 2024.1:
Number of Modules | Build Time
----------------- | ----------
10 | 0.100479
20 | 0.15462
30 | 0.209833
40 | 0.259712
50 | 0.314469
GCC 13.2:
Number of Modules | Build Time
----------------- | ----------
10 | 1.55647
20 | 6.41986
30 | 12.7215
40 | 25.4188
50 | 36.3083
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (12 preceding siblings ...)
2024-04-26 15:53 ` matthew.thompson at nasa dot gov
@ 2024-04-26 15:54 ` matthew.thompson at nasa dot gov
2024-04-27 2:34 ` jvdelisle at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: matthew.thompson at nasa dot gov @ 2024-04-26 15:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #14 from Matt Thompson <matthew.thompson at nasa dot gov> ---
Never mind. I'll send attachment to Jerry offline. It's too big for here.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug fortran/98426] find_symbol in module.c traverses O(N) part of a search tree
2020-12-23 10:42 [Bug fortran/98426] New: find_symbol in module.c traverses O(N) part of a search tree mscfd at gmx dot net
` (13 preceding siblings ...)
2024-04-26 15:54 ` matthew.thompson at nasa dot gov
@ 2024-04-27 2:34 ` jvdelisle at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: jvdelisle at gcc dot gnu.org @ 2024-04-27 2:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98426
--- Comment #15 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
(In reply to Matt Thompson from comment #14)
> Never mind. I'll send attachment to Jerry offline. It's too big for here.
Got it. It works quite well for our purposes.
^ permalink raw reply [flat|nested] 16+ messages in thread