public inbox for gcc-bugs@sourceware.org
* [Bug fortran/53778] New: bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
@ 2012-06-26  9:37 thomas.orgis at awi dot de
  2012-06-26 12:29 ` [Bug fortran/53778] " burnus at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: thomas.orgis at awi dot de @ 2012-06-26  9:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

             Bug #: 53778
           Summary: bad code (delivering NaN instead of proper result)
                    with -foptimize-sibling-calls
    Classification: Unclassified
           Product: gcc
           Version: 4.6.3
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: fortran
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: thomas.orgis@awi.de
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu
             Build: x86_64-pc-linux-gnu


I have a function in my Fortran code base that looks like this:

function dat_init_wind(handle, x, perturb) result(wind)
   type(datafield_t), intent(in) :: handle
   real(kind=8), dimension(:), intent(in) :: x
   logical, intent(in), optional :: perturb
   real(kind=8), dimension(handle%world%dims) :: wind

   logical :: pert

   pert = .true.
   if(present(perturb)) pert = perturb;

   select case(handle%initial_state)
   case(dat_init_geosiso)
      wind = geosiso_wind(handle, handle%bottomwind, x)
   case(dat_init_baroiso)
      wind = baroiso_wind(handle, handle%bottomwind, x)
   case(dat_init_baroclinpoly)
      wind = baroclinpoly_wind(handle, handle%baroclinpoly, x)
   case default
      wind = handle%base_speeds * bottomwind_profile(handle%bottomwind, &
         world_map_y(handle%world, x))
   end select

   if(pert .and. handle%perturb_wind /= 0) call pert_perturb(handle%pert, &
      wind(handle%perturb_wind), x, handle%world%scale%space / &
      handle%world%scale%time)
end function

This worked fine until a recent change to the inner workings of
pert_perturb(). Suddenly the result (wind) was a set of NaN instead of 0 (in
the considered configuration).

Note that pert is false, as is (handle%perturb_wind /= 0), so the changes to
pert_perturb() should have no influence on the result. Also, I noticed that
adding a printout to the function fixes things, even if it is not actually
called:

    select case(handle%initial_state)
    case(-100)
        write(0,*) 'This is a stub that never is executed but prevent compiler BUG 25 from triggering. Apparently.'
    case(dat_init_geosiso)

Also, dropping the call to bottomwind_profile() from the 'default' case, which
is what is actually executed, fixes the issue, but turns the code into a no-op
for me. While the recent changes in my code also touch that function, it itself
still computes correctly (as does pert_perturb()).

That somehow fits with me narrowing down the issue to the minimal optimization
flags needed (down from simple -g -O2):

-O -g -foptimize-sibling-calls

(The -g can be dropped, I strongly assume.) So it is something with optimizing
function calls / stack mess-around.

Note that I was unable to further reduce what is behind -O: I can activate all
the flags that make up -O here individually and the issue does not appear.
Setting -O and disabling all of those flags does not trigger it either (it's a
combination, I guess). So it is <some unknown basic setting> plus sibling call
optimization.

Now, does this description ring a bell? It would be so swell if this situation
was clear enough to diagnose the error in gfortran's optimization. If not, I
will have to try to extract something self-contained out of my codebase again
... which might need considerable time.

If this issue is already known and something along that fixed (in 4.7,
perhaps?), that would be a nice surprise.

I apologize if this turns out to be a bug in my code after all, but since I
work with  -fbounds-check -ffpe-trap=invalid,zero,overflow usually, I don't see
much possibility to produce such breakage in Fortran. If this were a C program,
I'd look harder for me messing up someplace;-)


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: burnus at gcc dot gnu.org @ 2012-06-26 12:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #1 from Tobias Burnus <burnus at gcc dot gnu.org> 2012-06-26 12:29:30 UTC ---
(In reply to comment #0)
> I have a function in my Fortran code base that looks like this:
[...]

Can you create a full example? It is usually not easy to debug such issues
without having a complete example at hand. Additionally, the code shown
typically does not contain the full information. For instance, the issue might
be due to inlining - but then it also depends on how and where it is inlined.


> If this issue is already known and something along that fixed (in 4.7,
> perhaps?), that would be a nice surprise.

Well, the easiest is that you try it yourself as you have the full source code.
Unofficial builds of 4.7 and 4.8 are available at
http://gcc.gnu.org/wiki/GFortranBinaries

Having said that, I am not aware of any recent fix which could have directly
fixed that, nor do I know of any issue related to -foptimize-sibling-calls
which still affects the fairly recent 4.6.3.


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: thomas.orgis at awi dot de @ 2012-06-27  8:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

--- Comment #2 from Thomas Orgis <thomas.orgis at awi dot de> 2012-06-27 08:58:57 UTC ---
Created attachment 27710
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27710
tarball with complete source to reproduce the issue

Ok, then, I feared as much. After taking too much time off the work I get
actually paid for (did I say that scientific programming sucks?) ... here is a
tarball that reproduces the issue for me and is hopefully small enough to serve
as a starting point.

It is a set of modules, most of which should not be relevant, just tedious to
eliminate. The important sources are datafield.f90, bottomwind.f90 and
perturb.f90 ... well just start from dat_init_wind(), which exhibits the bug
when called just after dat_init_density() (with inside call to pert_perturb()).


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: thomas.orgis at awi dot de @ 2012-06-27  9:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

--- Comment #3 from Thomas Orgis <thomas.orgis at awi dot de> 2012-06-27 09:03:25 UTC ---
... and not to forget profiles.f90 ... that module links the perturbation in
dat_init_density() to the wind in dat_init_wind(). Changes in there along with
moving perturb to using those same profiles as the wind triggered the bug to
begin with.

So, if it is something broken in my code, it is likely to be found in
profiles.f90 or perturb.f90 ... or bottomwind.f90.


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: dominiq at lps dot ens.fr @ 2012-06-27 13:44 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

--- Comment #4 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2012-06-27 13:44:43 UTC ---
I have played a little with the attached test (I had to comment out 'use
textdata' and 'use lapack'). On x86_64-apple-darwin10, I do not get any NaN with
4.6.3, 4.7.1, or trunk with the default flags. However if I add
'-finit-real=snan -ffpe-trap=invalid,zero,overflow' to the flags, the line

 single q:   1.6614401858304297                            NaN                      NaN

(for '-finit-real=nan') is replaced with

Floating exception

So it seems likely that the code is using uninitialized variable(s) (not
detected by -Wuninitialized). Note that one property of NaN is that they
"propagate", i.e., they can be detected quite far away from the point where
they are generated.
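
As a rough illustration of how these flags expose an uninitialized variable, here is a minimal sketch. The file name snan_demo.f90 and the variable names are hypothetical and not taken from the attached tarball. Note that -finit-real covers local variables only, and the GCC documentation warns that optimization may turn a signalling NaN into a quiet one, so building without -O is the safer way to try this.

```fortran
! snan_demo.f90 (hypothetical file name, not part of the attachment)
! One possible way to build it:
!   gfortran -g -finit-real=snan -ffpe-trap=invalid,zero,overflow snan_demo.f90
program snan_demo
   implicit none
   real(kind=8) :: factor   ! deliberately never assigned
   real(kind=8) :: speed

   speed = 2.0d0
   ! With -finit-real=snan, factor starts out as a signalling NaN, so this
   ! multiplication is expected to raise the "invalid" exception, which
   ! -ffpe-trap=invalid turns into an abort close to the first use.
   print *, speed * factor
end program snan_demo
```

Without the two flags the program simply prints whatever happens to be in memory, which is why the symptom can appear or vanish with unrelated changes such as an extra write statement.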

It looks like the bug is in the code rather than in gfortran.


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: burnus at gcc dot gnu.org @ 2012-06-27 15:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

--- Comment #5 from Tobias Burnus <burnus at gcc dot gnu.org> 2012-06-27 15:27:10 UTC ---
(In reply to comment #4)
> However if I add '-finit-real=snan -ffpe-trap=invalid,zero,overflow' to the
> flags, the line
[...]

For me it fails with:
  #3  0x419D2B in __datafield_MOD_dat_init_wind at datafield.f90:211
  #4  0x41B830 in show_state at test.f90:55
  Floating point exception


However, the problem is in profiles.f90's profile_values, called by:

function profile_deriv(handle, y) result(deriv)
   ...
   call profile_values(handle, y, deriv=deriv)
end function

subroutine profile_values(handle, y, deriv, value, aderiv, aderiv2)
   ...
   if(zero(handle%length)) then
      val(0) = 1
      return
   end if
   ...
   if(present(value))   value   = val( 0)
   ...
end subroutine


Do you spot the uninitialized variable?


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: thomas.orgis at awi dot de @ 2012-06-27 21:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

Thomas Orgis <thomas.orgis at awi dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #6 from Thomas Orgis <thomas.orgis at awi dot de> 2012-06-27 21:31:06 UTC ---
Meh ... I see it. Trapped by those optional Fortran dummy arguments. All the
fuss and hassle because of a dumb missing initialization. At least now I know
why this specific configuration crashed (y dimension has length 0).
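
For readers following along, the pattern can be sketched roughly as below. This is not the code from profiles.f90: the module, the routine names values_buggy and values_fixed, the plain `length == 0` test standing in for zero(handle%length), and the formulas are all hypothetical. The choice of 0 as the derivative for a zero-length profile is likewise an assumption made for illustration only.

```fortran
module profile_sketch
   implicit none
   integer, parameter :: dp = kind(1.0d0)
contains

   ! Mirrors the reported pattern: the early return skips the block that
   ! copies results into the optional outputs.
   subroutine values_buggy(length, y, deriv, value)
      real(dp), intent(in) :: length, y
      real(dp), intent(out), optional :: deriv, value
      real(dp) :: val(0:1)

      if (length == 0.0_dp) then
         val(0) = 1
         return                 ! deriv and value are left undefined here
      end if
      val(0) = y / length
      val(1) = 1.0_dp / length
      if (present(value)) value = val(0)
      if (present(deriv)) deriv = val(1)
   end subroutine values_buggy

   ! One possible fix: fill val completely on every path, then fall
   ! through to a single block that assigns the optional outputs.
   subroutine values_fixed(length, y, deriv, value)
      real(dp), intent(in) :: length, y
      real(dp), intent(out), optional :: deriv, value
      real(dp) :: val(0:1)

      if (length == 0.0_dp) then
         val(0) = 1
         val(1) = 0             ! assumption: constant profile, zero slope
      else
         val(0) = y / length
         val(1) = 1.0_dp / length
      end if
      if (present(value)) value = val(0)
      if (present(deriv)) deriv = val(1)
   end subroutine values_fixed

end module profile_sketch

program profile_demo
   use profile_sketch
   implicit none
   real(dp) :: d

   ! Zero length takes the special-case branch; d is still assigned.
   call values_fixed(0.0_dp, 0.5_dp, deriv=d)
   print *, 'deriv for zero length:', d
end program profile_demo
```

Keeping a single exit point that handles all the present() checks makes it harder for a new early return to skip an output.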

OK, you're off the hook this time;-) I can kill compiler bug 25 from my list
(not just gfortran) as invalid and so can you with this report.

Thanks for investigating and also showing me the correct syntax to trap missing
initializations (for some reason, I had -finit-real-nan in my flags before ...
which is wrong in more than one way).


* [Bug fortran/53778] bad code (delivering NaN instead of proper result) with -foptimize-sibling-calls
From: thomas.orgis at awi dot de @ 2012-06-27 21:34 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53778

--- Comment #7 from Thomas Orgis <thomas.orgis at awi dot de> 2012-06-27 21:34:43 UTC ---
Eh, it must have been -finit-real=nan ... so only wrong in one way;-)

