public inbox for gcc-bugs@sourceware.org
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
@ 2010-10-02 8:10 ` tkoenig at gcc dot gnu.org
2023-10-22 16:43 ` kargl at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2010-10-02 8:10 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #5 from Thomas Koenig <tkoenig at gcc dot gnu.org> 2010-10-02 08:10:51 UTC ---
Related to PR 45777.
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
2010-10-02 8:10 ` [Bug fortran/30409] [fortran] missed optimization with pure function arguments tkoenig at gcc dot gnu.org
@ 2023-10-22 16:43 ` kargl at gcc dot gnu.org
2023-10-22 19:18 ` anlauf at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-22 16:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #7 from kargl at gcc dot gnu.org ---
The attached testcase uses xmin and xmax uninitialized.
After setting xmin = 0 and xmax = 1 and adding z(1) to
the print statements to prevent the inner loop from
being optimized away, I see the following:
% gfcx -o z -O0 a.f90 && ./z
time 1: 1.78299993E-02 7249751.00
time 2: 6.37416887 7249751.00
% gfcx -o z -O1 a.f90 && ./z
time 1: 1.37590002E-02 7249751.00
time 2: 6.36764479 7249751.00
% gfcx -o z -O2 a.f90 && ./z
time 1: 1.23690004E-02 7249751.00
time 2: 1.85729897 7249751.00
% gfcx -o z -O3 a.f90 && ./z
time 1: 2.43199989E-03 7249751.00
time 2: 1.85660207 7249751.00
% gfcx -o z -Ofast a.f90 && ./z
time 1: 3.63499997E-03 7249751.50
time 2: 0.621210992 7249751.50
so the timing improves with optimization. -fdump-tree-original still
shows the generation of a temporary variable for the actual argument
1/y in the second set of nested loops. -fdump-tree-optimized is
fairly difficult for me to decipher, but it appears that the 1/y
is not hoisted out of the inner loop.
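A minimal sketch of the pattern described above, written from the description rather than from the attachment. The names hoist_mod, scale_sum, y, z, z2 and tmp are hypothetical, and it is an assumption here that y is an array, so that 1/y needs an array temporary. The first loop passes 1/y as an actual argument to a pure function; the second is the hand-hoisted form the optimization would ideally produce.

```fortran
! Hypothetical reduction; not the code from the attached testcase.
module hoist_mod
  implicit none
contains
  pure function scale_sum(x, w) result(s)
    real, intent(in) :: x, w(:)
    real :: s
    s = x * sum(w)
  end function scale_sum
end module hoist_mod

program hoist_demo
  use hoist_mod
  implicit none
  integer, parameter :: n = 1000
  real :: y(4), z(n), z2(n), tmp(4)
  integer :: i

  y = [1.0, 2.0, 4.0, 8.0]

  ! Form 1: the array expression 1/y is an actual argument, so a
  ! temporary may be created for it on every iteration.
  do i = 1, n
     z(i) = scale_sum(real(i), 1.0 / y)
  end do

  ! Form 2: hand-hoisted; the loop-invariant temporary is built once.
  tmp = 1.0 / y
  do i = 1, n
     z2(i) = scale_sum(real(i), tmp)
  end do

  ! Both forms should agree up to rounding.
  if (any(abs(z - z2) > 1.0e-5 * abs(z2))) error stop "results differ"
  print *, z(1), z2(1)
end program hoist_demo
```

Compiling Form 1 with -fdump-tree-original is one way to check whether the temporary is still generated inside the loop.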
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
2010-10-02 8:10 ` [Bug fortran/30409] [fortran] missed optimization with pure function arguments tkoenig at gcc dot gnu.org
2023-10-22 16:43 ` kargl at gcc dot gnu.org
@ 2023-10-22 19:18 ` anlauf at gcc dot gnu.org
2023-10-23 5:50 ` tkoenig at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: anlauf at gcc dot gnu.org @ 2023-10-22 19:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #8 from anlauf at gcc dot gnu.org ---
The suggested optimization needs to take into account that the evaluation
of the temporary expression might trap, or that allocatable variables are
not allocated, etc.
The trap etc. would not occur if the trip count of the loop is zero for the
non-hoisted variant, so we need to make sure not to generate failing code
for the hoisted one.
Similarly, for conditional code in the loop body, like
if (cond) then
expression1 (..., 1/y)
else
expression2 (..., 1/z)
end if
where cond protects from traps even for finite trip counts, these conditions
may also need to be identified, and an appropriate block generated.
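A minimal sketch of what such a guarded block could look like at source level, assuming scalar y and z and using the test y > 0.0 as a stand-in for cond. The flags have_ty and have_tz and the temporaries ty and tz are hypothetical names. Each invariant is evaluated at most once, and only on the branch that actually needs it, so 1/z is never computed when the else branch is never taken.

```fortran
program cond_hoist
  implicit none
  integer, parameter :: n = 5
  real :: y, z, ty, tz, a(n)
  logical :: have_ty, have_tz
  integer :: i

  y = 2.0
  z = 0.0            ! 1/z must never be evaluated in this run
  have_ty = .false.
  have_tz = .false.
  ty = 0.0
  tz = 0.0

  do i = 1, n
     if (y > 0.0) then             ! stand-in for "cond"
        if (.not. have_ty) then    ! evaluate 1/y lazily, once
           ty = 1.0 / y
           have_ty = .true.
        end if
        a(i) = real(i) * ty
     else
        if (.not. have_tz) then    ! evaluate 1/z lazily, once
           tz = 1.0 / z
           have_tz = .true.
        end if
        a(i) = real(i) * tz
     end if
  end do

  if (have_tz) error stop "1/z was evaluated"
  if (any(abs(a - [0.5, 1.0, 1.5, 2.0, 2.5]) > 1.0e-6)) error stop "unexpected values"
  print *, a
end program cond_hoist
```

An unconditional hoist of both 1/y and 1/z ahead of the loop would evaluate 1/z here even though the original loop never does.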
Some HPC compilers have directives (MOVE/NOMOVE) to annotate the respective
loops, and corresponding compiler options that are enabled only at aggressive
optimization levels for real-world code.
I wonder how much (or little) really needs to be done here, or if the task
can be split in a suitable way between FE and ME.
The tree-dump shows a __builtin_malloc/__builtin_free for the temporary
*within* the i-loop. Would it be possible to move this *management* just
one loop level up, if the size of the temporary is known to be constant?
(Which is the case here). I mean attach it to the outer scope?
Maybe the middle end then better "sees" what can reasonably be done?
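A source-level analogue of that question, as a sketch only: the explicit allocate/deallocate below stands in for the __builtin_malloc/__builtin_free pair seen in the dump, and the names tmp, acc_inner and acc_outer are hypothetical. Only the management of the temporary moves out of the loop; the evaluation of 1/y stays inside, so nothing is evaluated that the original loop would not evaluate.

```fortran
program alloc_hoist
  implicit none
  integer, parameter :: n = 100, m = 8
  real :: y(m), acc_inner, acc_outer
  real, allocatable :: tmp(:)
  integer :: i

  call random_number(y)
  y = y + 1.0          ! keep y away from zero so 1/y cannot trap

  ! Allocation managed inside the loop.
  acc_inner = 0.0
  do i = 1, n
     allocate(tmp(m))
     tmp = 1.0 / y
     acc_inner = acc_inner + sum(tmp) * real(i)
     deallocate(tmp)
  end do

  ! Allocation managed one level up; valid because the size m is
  ! loop-invariant. The computation itself is unchanged.
  acc_outer = 0.0
  allocate(tmp(m))
  do i = 1, n
     tmp = 1.0 / y
     acc_outer = acc_outer + sum(tmp) * real(i)
  end do
  deallocate(tmp)

  if (abs(acc_inner - acc_outer) > 1.0e-5 * abs(acc_outer)) error stop "mismatch"
  print *, acc_inner, acc_outer
end program alloc_hoist
```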
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2023-10-22 19:18 ` anlauf at gcc dot gnu.org
@ 2023-10-23 5:50 ` tkoenig at gcc dot gnu.org
2023-10-23 19:23 ` kargl at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 11+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2023-10-23 5:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
Thomas Koenig <tkoenig at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Depends on| |21046
--- Comment #9 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to anlauf from comment #8)
> I wonder how much (or little) really needs to be done here, or if the task
> can be split in a suitable way between FE and ME.
>
> The tree-dump shows a __builtin_malloc/__builtin_free for the temporary
> *within* the i-loop. Would it be possible to move this *management* just
> one loop level up, if the size of the temporary is known to be constant?
> (Which is the case here). I mean attach it to the outer scope?
> Maybe the middle end then better "sees" what can reasonably be done?
A lot of it can probably be done in the middle end.
For memory allocation, this would be PR21046 (first variant), which
would be highly useful already.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21046
[Bug 21046] move memory allocation out of a loop
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2023-10-23 5:50 ` tkoenig at gcc dot gnu.org
@ 2023-10-23 19:23 ` kargl at gcc dot gnu.org
2023-10-23 19:28 ` anlauf at gcc dot gnu.org
2023-10-23 21:45 ` kargl at gcc dot gnu.org
6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-23 19:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #10 from kargl at gcc dot gnu.org ---
(In reply to anlauf from comment #8)
> The suggested optimization needs to take into account that the evaluation
> of the temporary expression might trap, or that allocatable variables are
> not allocated, etc.
>
> The trap etc. would not occur if the trip count of the loop is zero for the
> non-hoisted variant, so we need to make sure not to generate failing code
> for the hoisted one.
>
> Similarly, for conditional code in the loop body, like
>
> if (cond) then
> expression1 (..., 1/y)
> else
> expression2 (..., 1/z)
> end if
>
> where cond protects from traps even for finite trip counts, these conditions
> may also need to be identified, and an appropriate block generated.
I'm not sure what you are worried about here. If one has
do i = 1, n
... = expression1(..., 1/y)
end do
then this is equivalent to
do i = 1, n
tmp = 1 / y
... = expression1(..., tmp)
end do
which is equivalent to
tmp = 1 / y
do i = 1, n
... = expression1(..., tmp)
end do
I suppose I could do something exceedingly stupid such as
function expression1(..., xxx)
common /foo/y
y = 0
...
end
but this would then lead to invalid Fortran when i = 2 in the
above initial loops as (1/y) is invalid Fortran if y = 0.
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2023-10-23 19:23 ` kargl at gcc dot gnu.org
@ 2023-10-23 19:28 ` anlauf at gcc dot gnu.org
2023-10-23 21:45 ` kargl at gcc dot gnu.org
6 siblings, 0 replies; 11+ messages in thread
From: anlauf at gcc dot gnu.org @ 2023-10-23 19:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #11 from anlauf at gcc dot gnu.org ---
(In reply to kargl from comment #10)
> (In reply to anlauf from comment #8)
> I'm not sure what you are worried about here. If one has
>
> do i = 1, n
> ... = expression1(..., 1/y)
> end do
>
> then this is equivalent to
>
> do i = 1, n
> tmp = 1 / y
> ... = expression1(..., tmp)
> end do
OK so far.
> which is equivalent to
>
> tmp = 1 / y
> do i = 1, n
> ... = expression1(..., tmp)
> end do
No. Strictly speaking, it is only equivalent to:
if (n > 0) tmp = 1 / y
do i = 1, n
... = expression1(..., tmp)
end do
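The guarded form can be exercised with a small runnable sketch, assuming a scalar y and a hypothetical pure function f in place of expression1. The module name guarded_hoist and the subroutine run are likewise made up for illustration. With n = 0 and y = 0, the division is never evaluated, matching the behaviour of the non-hoisted loop.

```fortran
module guarded_hoist
  implicit none
contains
  pure function f(x, t) result(r)
    real, intent(in) :: x, t
    real :: r
    r = x * t
  end function f

  subroutine run(n, y, out)
    integer, intent(in) :: n
    real, intent(in) :: y
    real, intent(inout) :: out(:)
    real :: tmp
    integer :: i
    ! Guarded hoist: 1/y is evaluated only if the loop body would run.
    if (n > 0) tmp = 1.0 / y
    do i = 1, n
       out(i) = f(real(i), tmp)
    end do
  end subroutine run
end module guarded_hoist

program guarded_demo
  use guarded_hoist
  implicit none
  real :: out(3)

  out = -1.0
  call run(0, 0.0, out)      ! zero trips: 1/y is skipped although y == 0
  if (any(out /= -1.0)) error stop "out modified on zero-trip loop"

  call run(3, 2.0, out)
  if (any(abs(out - [0.5, 1.0, 1.5]) > 1.0e-6)) error stop "unexpected values"
  print *, out
end program guarded_demo
```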
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
[not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2023-10-23 19:28 ` anlauf at gcc dot gnu.org
@ 2023-10-23 21:45 ` kargl at gcc dot gnu.org
6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-23 21:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
--- Comment #12 from kargl at gcc dot gnu.org ---
(In reply to anlauf from comment #11)
> (In reply to kargl from comment #10)
> > (In reply to anlauf from comment #8)
>
> > which is equivalent to
> >
> > tmp = 1 / y
> > do i = 1, n
> > ... = expression1(..., tmp)
> > end do
>
> No. Strictly speaking, it is only equivalent to:
>
> if (n > 0) tmp = 1 / y
> do i = 1, n
> ... = expression1(..., tmp)
> end do
Ah, yes, I missed the possibility that the loop may not loop at all.
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
` (2 preceding siblings ...)
2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
@ 2007-01-09 16:08 ` kargl at gcc dot gnu dot org
3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-09 16:08 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from kargl at gcc dot gnu dot org 2007-01-09 16:08 -------
Note, above the first FORALL statement one needs to add
the following 2 lines of code
xmin = 0.
xmax = 1.
As a side note, both Pathscale and Intel in the c.l.f thread have
acknowledged that their compilers also miss this optimization.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
2007-01-08 21:32 ` [Bug fortran/30409] " kargl at gcc dot gnu dot org
2007-01-08 21:36 ` kargl at gcc dot gnu dot org
@ 2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
2007-01-09 16:08 ` kargl at gcc dot gnu dot org
3 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-01-09 11:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from rguenth at gcc dot gnu dot org 2007-01-09 11:11 -------
In the middle-end this is somewhat related to PR26387. Of course, this is a
place where frontend optimization is probably easier to do.
Confirmed.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
BugsThisDependsOn| |26387
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Keywords| |missed-optimization
Last reconfirmed|0000-00-00 00:00:00 |2007-01-09 11:11:16
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
2007-01-08 21:32 ` [Bug fortran/30409] " kargl at gcc dot gnu dot org
@ 2007-01-08 21:36 ` kargl at gcc dot gnu dot org
2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
2007-01-09 16:08 ` kargl at gcc dot gnu dot org
3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-08 21:36 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from kargl at gcc dot gnu dot org 2007-01-08 21:36 -------
Sorry about the long URL, but the code comes from this comp.lang.fortran
thread.
http://groups-beta.google.com/group/comp.lang.fortran/browse_thread/thread/9f9bf1c116dc4b69/712366ef4318e84d#712366ef4318e84d
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
@ 2007-01-08 21:32 ` kargl at gcc dot gnu dot org
2007-01-08 21:36 ` kargl at gcc dot gnu dot org
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-08 21:32 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from kargl at gcc dot gnu dot org 2007-01-08 21:32 -------
Created an attachment (id=12871)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12871&action=view)
missed optimization
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409