public inbox for gcc-bugs@sourceware.org
* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
@ 2010-10-02  8:10 ` tkoenig at gcc dot gnu.org
  2023-10-22 16:43 ` kargl at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2010-10-02  8:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #5 from Thomas Koenig <tkoenig at gcc dot gnu.org> 2010-10-02 08:10:51 UTC ---
Related to PR 45777.



* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
  2010-10-02  8:10 ` [Bug fortran/30409] [fortran] missed optimization with pure function arguments tkoenig at gcc dot gnu.org
@ 2023-10-22 16:43 ` kargl at gcc dot gnu.org
  2023-10-22 19:18 ` anlauf at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-22 16:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #7 from kargl at gcc dot gnu.org ---
The attached testcase uses xmin and xmax uninitialized.
After setting xmin = 0 and xmax = 1 and adding z(1) to
the print statements to prevent the inner loop from
being optimized away, I see the following:

% gfcx -o z -O0 a.f90 && ./z
 time 1:    1.78299993E-02   7249751.00    
 time 2:    6.37416887       7249751.00    
% gfcx -o z -O1 a.f90 && ./z
 time 1:    1.37590002E-02   7249751.00    
 time 2:    6.36764479       7249751.00    
% gfcx -o z -O2 a.f90 && ./z
 time 1:    1.23690004E-02   7249751.00    
 time 2:    1.85729897       7249751.00    
% gfcx -o z -O3 a.f90 && ./z
 time 1:    2.43199989E-03   7249751.00    
 time 2:    1.85660207       7249751.00    
% gfcx -o z -Ofast a.f90 && ./z
 time 1:    3.63499997E-03   7249751.50    
 time 2:   0.621210992       7249751.50    

so the timing improves with optimization.  -fdump-tree-original still
shows the generation of a temporary variable for the actual argument
1/y in the second set of nested loops.  -fdump-tree-optimized is 
fairly difficult for me to decipher, but it appears that the 1/y
is not hoisted out of the inner loop.
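
As an illustration of the pattern being discussed (not the attached
testcase itself), here is a minimal sketch of the two variants. The
function name weighted_sum, the array sizes, and the values of y are
all assumptions made up for this example; the only point is that the
loop-invariant actual argument 1/y can be evaluated once by hand.

```fortran
module hoist_demo
  implicit none
contains
  ! Hypothetical pure function standing in for the one in the testcase.
  pure function weighted_sum(x, w) result(s)
    real, intent(in) :: x
    real, intent(in) :: w(:)
    real :: s
    s = x * sum(w)
  end function weighted_sum
end module hoist_demo

program hoist_by_hand
  use hoist_demo
  implicit none
  integer, parameter :: n = 5, m = 3
  real :: y(m), tmp(m), z1(n), z2(n)
  integer :: i

  y = [1.0, 2.0, 4.0]

  ! As a user would write it: 1/y is the actual argument, so a
  ! temporary for it is needed each time the call is evaluated.
  do i = 1, n
     z1(i) = weighted_sum(real(i), 1/y)
  end do

  ! Hoisted by hand: the loop-invariant temporary is computed once.
  ! n > 0 here, so evaluating 1/y before the loop is safe.
  tmp = 1/y
  do i = 1, n
     z2(i) = weighted_sum(real(i), tmp)
  end do

  if (any(z1 /= z2)) error stop "variants disagree"
  print *, "variants agree"
end program hoist_by_hand
```

Timing the two loops separately at different -O levels would be one
way to see whether the compiler performs the second form on its own.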


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
  2010-10-02  8:10 ` [Bug fortran/30409] [fortran] missed optimization with pure function arguments tkoenig at gcc dot gnu.org
  2023-10-22 16:43 ` kargl at gcc dot gnu.org
@ 2023-10-22 19:18 ` anlauf at gcc dot gnu.org
  2023-10-23  5:50 ` tkoenig at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: anlauf at gcc dot gnu.org @ 2023-10-22 19:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #8 from anlauf at gcc dot gnu.org ---
The suggested optimization needs to take into account that the evaluation
of the temporary expression might trap, or that allocatable variables are
not allocated, etc.

The trap etc. would not occur if the trip count of the loop is zero for the
non-hoisted variant, so we need to make sure not to generate failing code
for the hoisted one.

Similarly, for conditional code in the loop body, like

  if (cond) then
     expression1 (..., 1/y)
  else
     expression2 (..., 1/z)
  end if

where cond protects from traps even for finite trip counts, these conditions
may also need to be identified, and an appropriate block generated.

Some HPC compilers have directives (MOVE/NOMOVE) to annotate the respective
loops, and corresponding compiler options that are enabled only at aggressive
optimization levels for real-world code.

I wonder how much (or little) really needs to be done here, or if the task
can be split in a suitable way between FE and ME.

The tree-dump shows a __builtin_malloc/__builtin_free for the temporary
*within* the i-loop.  Would it be possible to move this *management* just
one loop level up, if the size of the temporary is known to be constant?
(Which is the case here).  I mean attach it to the outer scope?
Maybe the middle end then better "sees" what can reasonably be done?
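
A hand-written sketch of what "moving the management one loop level up"
could look like at the source level. The names and sizes are
hypothetical, and this is not what the front end currently emits; it
only shows allocation and deallocation bracketing the loop while the
evaluation of 1/y stays inside it.

```fortran
program alloc_outside_loop
  implicit none
  integer, parameter :: n = 4, m = 3
  real :: y(m), z(n)
  real, allocatable :: tmp(:)
  integer :: i

  y = [1.0, 2.0, 4.0]

  ! Allocate once, outside the loop.  This is valid only because the
  ! size of the temporary (m) does not change between iterations.
  allocate(tmp(m))
  do i = 1, n
     tmp = 1/y                  ! still evaluated on every iteration
     z(i) = real(i) * sum(tmp)
  end do
  deallocate(tmp)

  ! sum(1/y) is exactly 1.75 for these values of y.
  if (any(z /= [1.75, 3.5, 5.25, 7.0])) error stop "unexpected result"
  print *, z
end program alloc_outside_loop
```

With the allocation out of the way, what remains in the loop body is a
plain loop-invariant computation.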


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2023-10-22 19:18 ` anlauf at gcc dot gnu.org
@ 2023-10-23  5:50 ` tkoenig at gcc dot gnu.org
  2023-10-23 19:23 ` kargl at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2023-10-23  5:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |21046

--- Comment #9 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to anlauf from comment #8)

> I wonder how much (or little) really needs to be done here, or if the task
> can be split in a suitable way between FE and ME.
> 
> The tree-dump shows a __builtin_malloc/__builtin_free for the temporary
> *within* the i-loop.  Would it be possible to move this *management* just
> one loop level up, if the size of the temporary is known to be constant?
> (Which is the case here).  I mean attach it to the outer scope?
> Maybe the middle end then better "sees" what can reasonably be done?

A lot of it can probably be done in the middle end.

For memory allocation, this would be PR21046 (first variant), which
would be highly useful already.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21046
[Bug 21046] move memory allocation out of a loop


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2023-10-23  5:50 ` tkoenig at gcc dot gnu.org
@ 2023-10-23 19:23 ` kargl at gcc dot gnu.org
  2023-10-23 19:28 ` anlauf at gcc dot gnu.org
  2023-10-23 21:45 ` kargl at gcc dot gnu.org
  6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-23 19:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #10 from kargl at gcc dot gnu.org ---
(In reply to anlauf from comment #8)
> The suggested optimization needs to take into account that the evaluation
> of the temporary expression might trap, or that allocatable variables are
> not allocated, etc.
> 
> The trap etc. would not occur if the trip count of the loop is zero for the
> non-hoisted variant, so we need to make sure not to generate failing code
> for the hoisted one.
> 
> Similarly, for conditional code in the loop body, like
> 
>   if (cond) then
>      expression1 (..., 1/y)
>   else
>      expression2 (..., 1/z)
>   end if
> 
> where cond protects from traps even for finite trip counts, these conditions
> may also need to be identified, and an appropriate block generated.

I'm not sure what you are worried about here.  If one has

   do i = 1, n
      ... = expression1(..., 1/y)
   end do

then this is equivalent to

   do i = 1, n
      tmp = 1 / y
      ... = expression1(..., tmp)
   end do

which is equivalent to 

   tmp = 1 / y
   do i = 1, n
      ... = expression1(..., tmp)
   end do

I suppose I could do something exceedingly stupid such as

   function expression1(..., xxx)
      common /foo/y
      y = 0
      ...
   end

but this would then lead to invalid Fortran when i = 2 in the
above initial loops as (1/y) is invalid Fortran if y = 0.


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2023-10-23 19:23 ` kargl at gcc dot gnu.org
@ 2023-10-23 19:28 ` anlauf at gcc dot gnu.org
  2023-10-23 21:45 ` kargl at gcc dot gnu.org
  6 siblings, 0 replies; 11+ messages in thread
From: anlauf at gcc dot gnu.org @ 2023-10-23 19:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #11 from anlauf at gcc dot gnu.org ---
(In reply to kargl from comment #10)
> (In reply to anlauf from comment #8)
> I'm not sure what you are worried about here.  If one has
> 
>    do i = 1, n
>       ... = expression1(..., 1/y)
>    end do
> 
> then this is equivalent to
> 
>    do i = 1, n
>       tmp = 1 / y
>       ... = expression1(..., tmp)
>    end do

OK so far.

> which is equivalent to 
> 
>    tmp = 1 / y
>    do i = 1, n
>       ... = expression1(..., tmp)
>    end do

No.  Strictly speaking, it is only equivalent to:

    if (n > 0) tmp = 1 / y
    do i = 1, n
       ... = expression1(..., tmp)
    end do
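
A small self-checking sketch of the guarded form, using the
"allocatable variables are not allocated" case from comment #8 rather
than a floating-point trap. The module, the subroutine fill, and the
function weighted_sum are hypothetical names for this example. With
n = 0 and y unallocated, the original loop never references y, and
neither does the guarded hoist; an unconditional tmp = 1/y before the
loop would reference an unallocated array.

```fortran
module guard_demo
  implicit none
contains
  ! Hypothetical pure function standing in for expression1.
  pure function weighted_sum(x, w) result(s)
    real, intent(in) :: x
    real, intent(in) :: w(:)
    real :: s
    s = x * sum(w)
  end function weighted_sum

  subroutine fill(z, n, y)
    integer, intent(in) :: n
    real, intent(out) :: z(n)
    real, allocatable, intent(in) :: y(:)
    real, allocatable :: tmp(:)
    integer :: i

    ! Guarded hoist: 1/y is evaluated only if the loop body would
    ! have executed at least once.
    if (n > 0) tmp = 1/y
    do i = 1, n
       z(i) = weighted_sum(real(i), tmp)
    end do
  end subroutine fill
end module guard_demo

program guarded_hoist
  use guard_demo
  implicit none
  real :: z0(0), z3(3)
  real, allocatable :: y(:)

  ! Zero trip count with y unallocated: y must not be touched.
  call fill(z0, 0, y)

  ! Nonzero trip count: the hoisted temporary is used as usual.
  y = [1.0, 2.0, 4.0]
  call fill(z3, 3, y)

  if (any(z3 /= [1.75, 3.5, 5.25])) error stop "unexpected result"
  print *, "guarded hoist ok"
end program guarded_hoist
```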


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
       [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2023-10-23 19:28 ` anlauf at gcc dot gnu.org
@ 2023-10-23 21:45 ` kargl at gcc dot gnu.org
  6 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu.org @ 2023-10-23 21:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #12 from kargl at gcc dot gnu.org ---
(In reply to anlauf from comment #11)
> (In reply to kargl from comment #10)
> > (In reply to anlauf from comment #8)
> 
> > which is equivalent to 
> > 
> >    tmp = 1 / y
> >    do i = 1, n
> >       ... = expression1(..., tmp)
> >    end do
> 
> No.  Strictly speaking, it is only equivalent to:
> 
>     if (n > 0) tmp = 1 / y
>     do i = 1, n
>        ... = expression1(..., tmp)
>     end do

Ah, yes, I missed the possibility that the loop may not loop at all.


* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
  2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
@ 2007-01-09 16:08 ` kargl at gcc dot gnu dot org
  3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-09 16:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from kargl at gcc dot gnu dot org  2007-01-09 16:08 -------
Note that above the first FORALL statement one needs to add
the following two lines of code:

xmin = 0.
xmax = 1.

As a side note, both Pathscale and Intel in the c.l.f thread have
acknowledged that their compilers also miss this optimization.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409



* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
  2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
  2007-01-08 21:32 ` [Bug fortran/30409] " kargl at gcc dot gnu dot org
  2007-01-08 21:36 ` kargl at gcc dot gnu dot org
@ 2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
  2007-01-09 16:08 ` kargl at gcc dot gnu dot org
  3 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2007-01-09 11:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2007-01-09 11:11 -------
In the middle-end this is somewhat related to PR26387.  Of course this is a
place where frontend optimization is probably easier to do.

Confirmed.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|                            |26387
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|0000-00-00 00:00:00         |2007-01-09 11:11:16
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409



* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
  2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
  2007-01-08 21:32 ` [Bug fortran/30409] " kargl at gcc dot gnu dot org
@ 2007-01-08 21:36 ` kargl at gcc dot gnu dot org
  2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
  2007-01-09 16:08 ` kargl at gcc dot gnu dot org
  3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-08 21:36 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from kargl at gcc dot gnu dot org  2007-01-08 21:36 -------
Sorry about the long URL, but the code comes from this comp.lang.fortran
thread.

http://groups-beta.google.com/group/comp.lang.fortran/browse_thread/thread/9f9bf1c116dc4b69/712366ef4318e84d#712366ef4318e84d


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409



* [Bug fortran/30409] [fortran] missed optimization with pure function arguments
  2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
@ 2007-01-08 21:32 ` kargl at gcc dot gnu dot org
  2007-01-08 21:36 ` kargl at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: kargl at gcc dot gnu dot org @ 2007-01-08 21:32 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from kargl at gcc dot gnu dot org  2007-01-08 21:32 -------
Created an attachment (id=12871)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12871&action=view)
missed optimization


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409



Thread overview: 11+ messages
     [not found] <bug-30409-4@http.gcc.gnu.org/bugzilla/>
2010-10-02  8:10 ` [Bug fortran/30409] [fortran] missed optimization with pure function arguments tkoenig at gcc dot gnu.org
2023-10-22 16:43 ` kargl at gcc dot gnu.org
2023-10-22 19:18 ` anlauf at gcc dot gnu.org
2023-10-23  5:50 ` tkoenig at gcc dot gnu.org
2023-10-23 19:23 ` kargl at gcc dot gnu.org
2023-10-23 19:28 ` anlauf at gcc dot gnu.org
2023-10-23 21:45 ` kargl at gcc dot gnu.org
2007-01-08 21:31 [Bug fortran/30409] New: " kargl at gcc dot gnu dot org
2007-01-08 21:32 ` [Bug fortran/30409] " kargl at gcc dot gnu dot org
2007-01-08 21:36 ` kargl at gcc dot gnu dot org
2007-01-09 11:11 ` rguenth at gcc dot gnu dot org
2007-01-09 16:08 ` kargl at gcc dot gnu dot org