public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: steven at gcc dot gnu.org @ 2011-05-22 16:02 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2005-12-21 03:40:32         |2011-05-22 17:36:32

--- Comment #4 from Steven Bosscher <steven at gcc dot gnu.org> 2011-05-22 15:36:52 UTC ---
Test case of comment #0 is not vectorized in recent GCC:

     1    #define align(x) __attribute__((align(x)))
     2    typedef float align(16) MATRIX[3][3];
     3     
     4    void RotateMatrix(MATRIX ret, MATRIX a, MATRIX b)
     5    {
     6      int i, j;
     7     
     8      for (j = 0; j < 3; j++)
     9        for (i = 0; i < 3; i++)
    10          ret[j][i] =   a[j][0] * b[0][i]
    11                      + a[j][1] * b[1][i]
    12                      + a[j][2] * b[2][i];
    13    }


t.c:8: note: not vectorized: loop contains function calls or data references that cannot be analyzed
t.c:8: note: bad data references.
t.c:4: note: vectorized 0 loops in function.

"GCC: (GNU) 4.6.0 20110312 (experimental) [trunk revision 170907]"



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: rguenth at gcc dot gnu.org @ 2011-07-27 12:39 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #5 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-07-27 12:38:20 UTC ---
The initial testcase is probably a bad example (3x3 matrix).  The following
testcase is borrowed from Polyhedron rnflow and is vectorized by ICC but
not by GCC (the ICC variant is 15% faster):

      function trs2a2 (j, k, u, d, m)
      real, dimension (1:m,1:m) :: trs2a2  
      real, dimension (1:m,1:m) :: u, d
      integer, intent (in)      :: j, k, m
      real (kind = selected_real_kind (10,50)) :: dtmp
      trs2a2 = 0.0
      do iclw1 = j, k - 1
         do iclw2 = j, k - 1
            dtmp = 0.0d0
            do iclww = j, k - 1
               dtmp = dtmp + u (iclw1, iclww) * d (iclww, iclw2)
            enddo
            trs2a2 (iclw1, iclw2) = dtmp
         enddo
      enddo
      return
      end function trs2a2

The reason GCC cannot vectorize this is that the load from U has
a non-constant stride, so vectorization would need to load two scalars
and build up a vector from them (ICC does that).  If the stride were
constant but not a power of two, GCC would reject that as well, probably
to avoid confusing the interleaving code.  Data dependence analysis also
rejects non-constant strides.

A further complication (for the cost model) is that the accumulator is
of type double while the data is of type float.  ICC uses only
half of each float vector here to handle mixed float/double type
loops (but it still unrolls the loop).
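
The "load two scalars and build up a vector" idea can be sketched in plain C.
This is only an illustration under stated assumptions, not what GCC or ICC
actually emit: the helper names (at, dot_scalar, dot_two_lanes) are made up
here, indices are 0-based, the matrices are stored column-major as in Fortran,
and the two "lanes" are ordinary arrays rather than SIMD registers.

```c
#include <stddef.h>

/* Column-major (Fortran-style) offset of element (row, col) in an
   m x m matrix.  Hypothetical helper, 0-based indices. */
static size_t at(size_t row, size_t col, size_t m)
{
    return row + col * m;
}

/* Scalar inner loop of trs2a2: u is read with stride m (not known at
   compile time), d is read contiguously, the accumulator is double. */
double dot_scalar(const float *u, const float *d, size_t row, size_t col,
                  size_t lo, size_t hi, size_t m)
{
    double acc = 0.0;
    for (size_t k = lo; k < hi; k++)
        acc += u[at(row, k, m)] * d[at(k, col, m)];
    return acc;
}

/* Hand-written two-lane version: gather two strided scalars from u into
   a small "vector", read two adjacent elements of d, and keep one double
   partial sum per lane. */
double dot_two_lanes(const float *u, const float *d, size_t row, size_t col,
                     size_t lo, size_t hi, size_t m)
{
    double acc[2] = { 0.0, 0.0 };
    size_t k = lo;
    for (; k + 1 < hi; k += 2) {
        float uv[2] = { u[at(row, k, m)], u[at(row, k + 1, m)] }; /* strided */
        const float *dv = &d[at(k, col, m)];                      /* contiguous */
        acc[0] += uv[0] * dv[0];
        acc[1] += uv[1] * dv[1];
    }
    double sum = acc[0] + acc[1];
    for (; k < hi; k++)            /* leftover iteration, if any */
        sum += u[at(row, k, m)] * d[at(k, col, m)];
    return sum;
}
```

Note that the two-lane version adds the products in a different order than
the scalar loop, so for general floating-point data the results may differ in
the last bits; a compiler needs permission to reassociate (for example
-ffast-math) before it may make this transformation itself.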



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: matz at gcc dot gnu.org @ 2012-04-17 13:55 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #6 from Michael Matz <matz at gcc dot gnu.org> 2012-04-17 13:54:36 UTC ---
Author: matz
Date: Tue Apr 17 13:54:26 2012
New Revision: 186530

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186530
Log:
    PR tree-optimization/18437

    * tree-vectorizer.h (_stmt_vec_info.stride_load_p): New member.
    (STMT_VINFO_STRIDE_LOAD_P): New accessor.
    (vect_check_strided_load): Declare.
    * tree-vect-data-refs.c (vect_check_strided_load): New function.
    (vect_analyze_data_refs): Use it to accept strided loads.
    * tree-vect-stmts.c (vectorizable_load): Ditto and handle them.

testsuite/
    * gfortran.dg/vect/rnflow-trs2a2.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/rnflow-trs2a2.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-stmts.c
    trunk/gcc/tree-vectorizer.h



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: rguenth at gcc dot gnu.org @ 2012-05-09 13:07 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #7 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-05-09 12:59:49 UTC ---
Author: rguenth
Date: Wed May  9 12:59:46 2012
New Revision: 187330

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187330
Log:
2012-05-09  Richard Guenther  <rguenther@suse.de>

    PR tree-optimization/18437
    * gfortran.dg/vect/rnflow-trs2a2.f90: Move ...
    * gfortran.dg/vect/fast-math-rnflow-trs2a2.f90: ... here.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/fast-math-rnflow-trs2a2.f90
      - copied unchanged from r187329, trunk/gcc/testsuite/gfortran.dg/vect/rnflow-trs2a2.f90
Removed:
    trunk/gcc/testsuite/gfortran.dg/vect/rnflow-trs2a2.f90
Modified:
    trunk/gcc/testsuite/ChangeLog



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: rguenth at gcc dot gnu.org @ 2012-07-13  8:50 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947

--- Comment #8 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-13 08:49:47 UTC ---
Link to vectorizer missed-optimization meta-bug.



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: pinskia at gcc dot gnu.org @ 2023-08-04 20:19 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For the original testcase in comment #0, with `-O3 -fno-vect-cost-model` GCC
can vectorize it on aarch64 but not on x86_64.


* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: pinskia at gcc dot gnu.org @ 2023-08-04 20:20 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437

--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #9)
> For the original testcase in comment #0, with `-O3 -fno-vect-cost-model` GCC
> can vectorize it on aarch64 but not on x86_64.

I should say: starting in GCC 6.


* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: pinskia at gcc dot gnu dot org @ 2005-09-20 17:44 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-09-20 17:44 -------
Oh, the issue here is that a, b, and ret could all point to the same array, because the type is
(float[3])*, or arrayptr in:
typedef float array[3];
typedef array *arrayptr;

If we change ret, a, and b to be global variables, then the loop could be vectorized except for the fact that:
t.c:11: note: not vectorized: iteration count too small.
t.c:11: note: bad operation or unsupported loop bound.
t.c:11: note: vectorized 0 loops in function.
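
To make the aliasing point concrete, here is a small C sketch. It is an
illustration only: rotate_may_alias repeats the loop from the test case, and
rotate_no_alias is a hypothetical variant (not part of the bug report) that
uses C99 restrict to promise the compiler that the three arrays do not
overlap. Whether a given GCC version then vectorizes the loop is a separate
question, as the diagnostics above show.

```c
typedef float MATRIX[3][3];

/* As in the test case: each parameter decays to float (*)[3], so the
   compiler must assume that ret may overlap a or b. */
void rotate_may_alias(MATRIX ret, MATRIX a, MATRIX b)
{
    for (int j = 0; j < 3; j++)
        for (int i = 0; i < 3; i++)
            ret[j][i] = a[j][0] * b[0][i]
                      + a[j][1] * b[1][i]
                      + a[j][2] * b[2][i];
}

/* Hypothetical variant: restrict states that ret, a and b do not
   overlap, so stores to ret cannot change later loads from a or b.
   Calling it with overlapping arrays is undefined behaviour. */
void rotate_no_alias(float (*restrict ret)[3],
                     float (*restrict a)[3],
                     float (*restrict b)[3])
{
    for (int j = 0; j < 3; j++)
        for (int i = 0; i < 3; i++)
            ret[j][i] = a[j][0] * b[0][i]
                      + a[j][1] * b[1][i]
                      + a[j][2] * b[2][i];
}
```

A call such as rotate_may_alias(a, a, b) is legal C and overwrites a while it
is still being read, so each store feeds into later loads. That is exactly
the behaviour the compiler has to preserve when it cannot rule out overlap.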


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: pinskia at gcc dot gnu dot org @ 2005-02-11 19:16 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-02-11 14:12 -------
We now get:
t3.c:9: note: not vectorized: can't determine dependence between: (*D.1338_16)[0] and (*D.1336_10)[i_53]


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2004-11-12 02:43:04         |2005-02-11 14:12:15
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437



* [Bug tree-optimization/18437] vectorizer failed for matrix multiplication
From: pinskia at gcc dot gnu dot org @ 2004-11-12  2:43 UTC
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-11-12 02:43 -------
Confirmed. ICC can do this but does not, because it is not very efficient to do so.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2004-11-12 02:43:04
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18437



Thread overview: 10+ messages
     [not found] <bug-18437-4@http.gcc.gnu.org/bugzilla/>
2011-05-22 16:02 ` [Bug tree-optimization/18437] vectorizer failed for matrix multiplication steven at gcc dot gnu.org
2011-07-27 12:39 ` rguenth at gcc dot gnu.org
2012-04-17 13:55 ` matz at gcc dot gnu.org
2012-05-09 13:07 ` rguenth at gcc dot gnu.org
2012-07-13  8:50 ` rguenth at gcc dot gnu.org
2023-08-04 20:19 ` pinskia at gcc dot gnu.org
2023-08-04 20:20 ` pinskia at gcc dot gnu.org
2004-11-12  1:22 [Bug tree-optimization/18437] New: " giovannibajo at libero dot it
2004-11-12  2:43 ` [Bug tree-optimization/18437] " pinskia at gcc dot gnu dot org
2005-02-11 19:16 ` pinskia at gcc dot gnu dot org
2005-09-20 17:44 ` pinskia at gcc dot gnu dot org
