public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
       [not found] <bug-31079-4@http.gcc.gnu.org/bugzilla/>
@ 2012-07-18 13:28 ` rguenth at gcc dot gnu.org
  2013-03-27 11:45 ` rguenth at gcc dot gnu.org
  1 sibling, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-18 13:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079

--- Comment #14 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-18 13:28:41 UTC ---
Smart again - with stock trunk I get everything optimized away ;)



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
       [not found] <bug-31079-4@http.gcc.gnu.org/bugzilla/>
  2012-07-18 13:28 ` [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization rguenth at gcc dot gnu.org
@ 2013-03-27 11:45 ` rguenth at gcc dot gnu.org
  1 sibling, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2013-03-27 11:45 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
      Known to work|                            |4.8.0
         Resolution|                            |FIXED
   Target Milestone|---                         |4.8.0

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-27 11:44:56 UTC ---
We vectorize the new testcase now (move the USE function to a separate TU
to not optimize everything away...).
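
A minimal sketch of the separate-TU trick described above, assuming a hypothetical sink routine named use_result and hypothetical file names; the real USE function from the attached testcase is not shown in this thread. The point is only that the compiler cannot see the body of the sink when it compiles the driver, so it has to keep the loop that produces s.

```fortran
! Hypothetical file sink.f90: build this as its own translation unit
! (for example, compile it to a separate object file).
subroutine use_result(s)
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), intent(in) :: s(4)
  ! Any observable use of s works; the body is opaque to the caller's TU.
  if (sum(s) < 0.0_dp) print *, 'unexpected negative sum: ', s
end subroutine use_result

! Hypothetical file driver.f90: the timed loop lives here.
program driver
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  external :: use_result
  real(dp) :: s(4)
  integer :: i, k

  do k = 1, 10
    s = 0.0_dp
    do i = 1, 1000
      s(1:2) = s(1:2) + real(i, dp)
      s(3:4) = s(3:4) + real(k, dp)
    end do
    ! Without this call (or with the sink visible in the same TU),
    ! the optimizer may delete the loop as dead code.
    call use_result(s)
  end do
  print *, 'done'
end program driver
```

Both units also compile as a single file, but then the sink is visible to the optimizer again, which defeats the purpose. Link-time optimization can have the same effect across files.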

At -O3 -ffast-math I see

4.6: 4.25s
4.7: 4.25s
4.8/trunk: 2.7s

ifort 12.1 and -fast: 3.6s

I conclude - fixed for 4.8.



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (7 preceding siblings ...)
  2008-08-19  6:12 ` jv244 at cam dot ac dot uk
@ 2008-08-19 13:33 ` jv244 at cam dot ac dot uk
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19 13:33 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from jv244 at cam dot ac dot uk  2008-08-19 13:31 -------
(In reply to comment #11)

> This (PR31079_11.f90) should be a replacement for comment #4, and illustrates
> the vectorizer issue.

The patch Richard posted in PR37150 also improves this PR31079_11.f90 testcase
a lot:

ifort               : 2.54
gfortran (unpatched): 4.00
gfortran (patched)  : 2.96


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (6 preceding siblings ...)
  2008-08-19  6:11 ` jv244 at cam dot ac dot uk
@ 2008-08-19  6:12 ` jv244 at cam dot ac dot uk
  2008-08-19 13:33 ` jv244 at cam dot ac dot uk
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19  6:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from jv244 at cam dot ac dot uk  2008-08-19 06:11 -------
Created an attachment (id=16096)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16096&action=view)
ifort's asm for PR31079_11.f90 at -O3 -xT -S


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (5 preceding siblings ...)
  2008-08-19  5:46 ` jv244 at cam dot ac dot uk
@ 2008-08-19  6:11 ` jv244 at cam dot ac dot uk
  2008-08-19  6:12 ` jv244 at cam dot ac dot uk
  2008-08-19 13:33 ` jv244 at cam dot ac dot uk
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19  6:11 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from jv244 at cam dot ac dot uk  2008-08-19 06:09 -------
Created an attachment (id=16095)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16095&action=view)
new testcase 

This (PR31079_11.f90) should be a replacement for comment #4, and illustrates
the vectorizer issue.

> gfortran -O3 -ftree-vectorize -ffast-math -march=native PR31079_11.f90
> ./a.out
   4.0282512

> ifort -O3 -xT PR31079_11.f90
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: BLOCK WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(52): (col. 13) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(17): (col. 8) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(24): (col. 5) remark: BLOCK WAS VECTORIZED.
PR31079_11.f90(30): (col. 7) remark: LOOP WAS VECTORIZED.
PR31079_11.f90(31): (col. 7) remark: LOOP WAS VECTORIZED.
> ./a.out
   2.640165

The inner loop looks like:

    DO i=1,N
      s(1:2)=s(1:2)+pxy(i)%a(:)*dpy(i)%a(1)
      s(3:4)=s(3:4)+pxy(i)%a(:)*dpy(i)%a(2)
    ENDDO

which ifort vectorizes (I will attach the full asm):

..B3.4:                         # Preds ..B3.4 ..B3.3
        movddup   collocate_core_2_2_0_0_$DPY.0.1(%rax), %xmm2  #30.33
        movddup   8+collocate_core_2_2_0_0_$DPY.0.1(%rax), %xmm4 #31.33
        movaps    collocate_core_2_2_0_0_$PXY.0.1(%rax), %xmm3  #30.7
        mulpd     %xmm3, %xmm2                                  #30.32
        incq      %rdx                                          #29.5
        addq      $16, %rax                                     #29.5
        addpd     %xmm2, %xmm1                                  #30.7
        cmpq      $1000, %rdx                                   #29.5
        mulpd     %xmm3, %xmm4                                  #31.32
        addpd     %xmm4, %xmm0                                  #31.7
        jl        ..B3.4        # Prob 99%                      #29.5
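
For readers without the attachment, here is a minimal, self-contained sketch of a program built around the inner loop quoted above. It is not PR31079_11.f90: the derived type name, the initial values, and the program name are assumptions, and N = 1000 is taken only from the loop bound visible in the ifort assembly. The two-element double precision component is inferred from the 16-byte stride (addq $16) in that assembly.

```fortran
! Hypothetical reconstruction for illustration; not the attached testcase.
program inner_loop_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000        ! assumed from "cmpq $1000" in the asm
  type pair                             ! assumed type name
    real(dp) :: a(2)                    ! 2 doubles = 16 bytes per element
  end type pair
  type(pair) :: pxy(n), dpy(n)
  real(dp) :: s(4)
  integer :: i

  ! Arbitrary initial data, chosen only so the loop has something to do.
  do i = 1, n
    pxy(i)%a = [real(i, dp), real(2*i, dp)]
    dpy(i)%a = [1.0_dp, 0.5_dp]
  end do

  s = 0.0_dp
  do i = 1, n
    ! Each statement is a 2-wide multiply-add: one packed load of pxy(i)%a,
    ! one broadcast of a dpy(i)%a element (movddup), then mulpd and addpd.
    s(1:2) = s(1:2) + pxy(i)%a(:) * dpy(i)%a(1)
    s(3:4) = s(3:4) + pxy(i)%a(:) * dpy(i)%a(2)
  end do

  print *, s
end program inner_loop_sketch
```

The sketch can be built with the same flags used above (gfortran -O3 -ftree-vectorize -ffast-math -march=native); whether the loop is vectorized depends on the compiler version.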


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (4 preceding siblings ...)
  2008-08-19  5:45 ` jv244 at cam dot ac dot uk
@ 2008-08-19  5:46 ` jv244 at cam dot ac dot uk
  2008-08-19  6:11 ` jv244 at cam dot ac dot uk
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19  5:46 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from jv244 at cam dot ac dot uk  2008-08-19 05:45 -------
Created an attachment (id=16094)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16094&action=view)
comment #0 intel's assembly (ifort 9.1 at -O2 -xT)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (2 preceding siblings ...)
  2008-08-18 15:23 ` rguenth at gcc dot gnu dot org
@ 2008-08-19  5:45 ` jv244 at cam dot ac dot uk
  2008-08-19  5:45 ` jv244 at cam dot ac dot uk
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19  5:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from jv244 at cam dot ac dot uk  2008-08-19 05:43 -------
(In reply to comment #7)
> That is, GCC's inner loop is
> 
> .L6:
>         addl    $1, %eax
>         addsd   %xmm12, %xmm11
>         cmpl    $100000000, %eax
>         addsd   %xmm14, %xmm3
>         addsd   %xmm15, %xmm2
>         addsd   %xmm13, %xmm1
>         jne     .L6
> 
> which doesn't necessarily look slower than ICC's.
> 

Right... I checked trunk, and it now does something very smart with the testcase
from comment 4: it is now about 8 times faster than ifort (9.1 / 11.0)

> gfortran -O3 -ftree-vectorize -ffast-math -march=native -S PR31079_4.f90
> ./a.out
  0.25201499

> ifort -xT -O2 PR31079_4.f90
> ./a.out
   2.040127

I'll see if there is a way to make the testcase somewhat smarter. I checked the
very first program (comment #0), and this is still slower with gfortran (timings:
ifort 3.51 vs gfortran 4.1). Just for completeness, I attach the Fortran source and
the Intel assembly.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
                   ` (3 preceding siblings ...)
  2008-08-19  5:45 ` jv244 at cam dot ac dot uk
@ 2008-08-19  5:45 ` jv244 at cam dot ac dot uk
  2008-08-19  5:46 ` jv244 at cam dot ac dot uk
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-08-19  5:45 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from jv244 at cam dot ac dot uk  2008-08-19 05:44 -------
Created an attachment (id=16093)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16093&action=view)
comment #0 source


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
  2008-01-08 10:22 ` [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization jv244 at cam dot ac dot uk
  2008-08-18 15:21 ` rguenth at gcc dot gnu dot org
@ 2008-08-18 15:23 ` rguenth at gcc dot gnu dot org
  2008-08-19  5:45 ` jv244 at cam dot ac dot uk
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-08-18 15:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rguenth at gcc dot gnu dot org  2008-08-18 15:22 -------
That is, GCC's inner loop is

.L6:
        addl    $1, %eax
        addsd   %xmm12, %xmm11
        cmpl    $100000000, %eax
        addsd   %xmm14, %xmm3
        addsd   %xmm15, %xmm2
        addsd   %xmm13, %xmm1
        jne     .L6

which doesn't necessarily look slower than ICC's.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
  2008-01-08 10:22 ` [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization jv244 at cam dot ac dot uk
@ 2008-08-18 15:21 ` rguenth at gcc dot gnu dot org
  2008-08-18 15:23 ` rguenth at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2008-08-18 15:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2008-08-18 15:20 -------
The problem for the GCC vectorizer is that there are no loads or stores left
in the loop and it doesn't handle vectorizing "registers" only.  This is a
case where real vectorization of straight-line code would be necessary.
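
To make the "registers only" case concrete, here is a minimal sketch of the kind of loop meant, assuming four scalar accumulators and four loop-invariant addends. It is not the testcase from comment #4; the variable names and values are hypothetical, and only the iteration count is taken from the assembly in this thread. Nothing in the loop body touches memory, so there is no array access for a loop vectorizer to anchor on.

```fortran
! Hypothetical illustration of a loop with no loads or stores in its body.
program register_only_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: s1, s2, s3, s4
  real(dp) :: c1, c2, c3, c4
  integer :: i

  ! Loop-invariant addends; after optimization these live in registers.
  c1 = 1.0_dp
  c2 = 2.0_dp
  c3 = 3.0_dp
  c4 = 4.0_dp

  s1 = 0.0_dp
  s2 = 0.0_dp
  s3 = 0.0_dp
  s4 = 0.0_dp

  do i = 1, 100000000
    ! Four independent scalar additions: candidates for pairing into two
    ! packed additions, but only if straight-line code is vectorized.
    s1 = s1 + c1
    s2 = s2 + c2
    s3 = s3 + c3
    s4 = s4 + c4
  end do

  print *, s1, s2, s3, s4
end program register_only_sketch
```

With -ffast-math a sufficiently new compiler may replace the whole loop with a closed-form result, so this sketch shows the shape of the problem rather than serving as a benchmark.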


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079



* [Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
  2007-03-08  9:46 [Bug tree-optimization/31079] New: 300% difference between ifort/gfortran jv244 at cam dot ac dot uk
@ 2008-01-08 10:22 ` jv244 at cam dot ac dot uk
  2008-08-18 15:21 ` rguenth at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 11+ messages in thread
From: jv244 at cam dot ac dot uk @ 2008-01-08 10:22 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from jv244 at cam dot ac dot uk  2008-01-08 09:52 -------
updated the summary after the analysis in comment #4, and CCed Dorit for
the vectorization issue.


-- 

jv244 at cam dot ac dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dorit at il dot ibm dot com
            Summary|300% difference between     |20% difference between
                   |ifort/gfortran              |ifort/gfortran, missed
                   |                            |vectorization


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079


