public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50905] New: Gcc4.6.x's -ftree-parallelize-loops is effective only when using "-O2/-O3 -ffast-math"
From: xunxun1982 at gmail dot com @ 2011-10-28 19:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50905

             Bug #: 50905
           Summary: Gcc4.6.x's -ftree-parallelize-loops is effective only
                    when using "-O2/-O3 -ffast-math"
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: xunxun1982@gmail.com


Created attachment 25648
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25648
the test

With GCC 4.6.1 and GCC 4.6.2, on both Windows and Ubuntu,
-ftree-parallelize-loops appears to take effect only when combined with
"-O2/-O3 -ffast-math".
Is this expected?

Using my test:

$gcc -ftree-parallelize-loops=2 -c main.c
$nm main.o
                 U clock
0000000000000000 T main
                 U printf
                 U puts
                 U sin
$gcc -O2 -ftree-parallelize-loops=2 -c main.c
$nm main.o
0000000000000000 r .LC4
0000000000000008 r .LC6
                 U __printf_chk
                 U clock
0000000000000000 T main
                 U puts
                 U sin
$gcc -ffast-math -ftree-parallelize-loops=2 -c main.c
$nm main.o
                 U clock
0000000000000000 T main
                 U printf
                 U puts
                 U sin
$gcc -O2 -ftree-parallelize-loops=2 -ffast-math -c main.c
$nm main.o
0000000000000000 r .LC5
                 U GOMP_parallel_end
                 U GOMP_parallel_start
                 U __printf_chk
                 U clock
0000000000000000 T main
0000000000000000 t main._loopfn.0
                 U omp_get_num_threads
                 U omp_get_thread_num
                 U puts
                 U sin
Only -O2 -ftree-parallelize-loops=2 -ffast-math produces the result I expected.

I expected -ftree-parallelize-loops=2 alone to generate symbols similar to
these:
                 U GOMP_parallel_end
                 U GOMP_parallel_start
                 U omp_get_num_threads
                 U omp_get_thread_num

Am I right?
Or is -ftree-parallelize-loops an option that must be used together with
"-O2/-O3 -ffast-math"?

Thanks.



* [Bug tree-optimization/50905] Gcc4.6.x's -ftree-parallelize-loops is effective only when using "-O2/-O3 -ffast-math"
From: pinskia at gcc dot gnu.org @ 2011-10-28 19:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50905

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-10-28 19:45:18 UTC ---
>Am I right?
Yes, but the reason is that floating-point math is not associative, so
reductions over floating-point values cannot be parallelized unless -ffast-math
(or, more precisely, -fassociative-math) is turned on.



* [Bug tree-optimization/50905] Gcc4.6.x's -ftree-parallelize-loops is effective only when using "-O2/-O3 -ffast-math"
From: xunxun1982 at gmail dot com @ 2011-10-28 19:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50905

xunxun <xunxun1982 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |

--- Comment #2 from xunxun <xunxun1982 at gmail dot com> 2011-10-28 19:53:47 UTC ---
(In reply to comment #1)
> >Am I right?
> Yes, but the reason is that floating-point math is not associative, so
> reductions over floating-point values cannot be parallelized unless -ffast-math
> (or, more precisely, -fassociative-math) is turned on.

But -ffast-math -ftree-parallelize-loops=2 on its own has no effect either.
We have to combine -ffast-math with -O2/-O3.



* [Bug tree-optimization/50905] Gcc4.6.x's -ftree-parallelize-loops is effective only when using "-O2/-O3 -ffast-math"
From: pinskia at gcc dot gnu.org @ 2011-10-28 20:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50905

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-10-28 20:07:03 UTC ---
Yes, if you mean that -ftree-parallelize-loops does not work without
-O1/-O2/-O3, this is expected. As explained in the manual, -O1 does more than
turn on the individual options it lists; it also enables additional
optimizations.



* [Bug tree-optimization/50905] Gcc4.6.x's -ftree-parallelize-loops is effective only when using "-O2/-O3 -ffast-math"
From: xunxun1982 at gmail dot com @ 2011-10-28 20:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50905

--- Comment #4 from xunxun <xunxun1982 at gmail dot com> 2011-10-28 20:14:38 UTC ---
(In reply to comment #3)
> Yes, if you mean that -ftree-parallelize-loops does not work without
> -O1/-O2/-O3, this is expected. As explained in the manual, -O1 does more than
> turn on the individual options it lists; it also enables additional
> optimizations.

Thanks for the information.

-----------------------
I read the manual:
-ftree-parallelize-loops=n
    Parallelize loops, i.e., split their iteration space to run in n threads.
This is only possible for loops whose iterations are independent and can be
arbitrarily reordered. The optimization is only profitable on multiprocessor
machines, for loops that are CPU-intensive, rather than constrained e.g. by
memory bandwidth. This option implies -pthread, and thus is only supported on
targets that have support for -pthread. 

I don't see any mention of -O1/-O2/-O3 here.
-----------------------

If the manual text is accurate, the only interaction that should matter is the
one between -ftree-parallelize-loops=n and -ffast-math.

