public inbox for gcc-bugs@sourceware.org
* [Bug fortran/29550]  New: Optimize -fexternal-blas calls for transpose()/conj()
@ 2006-10-22 16:26 tobias dot burnus at physik dot fu-berlin dot de
  2006-10-22 18:45 ` [Bug fortran/29550] Optimize -fexternal-blas calls for conj() fxcoudert at gcc dot gnu dot org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: tobias dot burnus at physik dot fu-berlin dot de @ 2006-10-22 16:26 UTC
  To: gcc-bugs

Often, matrix multiplications contain transpose() or conj(), e.g.
  matmul(transpose(A),B)
or
  matmul(A,conj(transpose(B)))
  matmul(A,transpose(conj(B)))

The *gemm subroutines of BLAS anticipate this via the TRANSA and TRANSB
options:
- 'N' (unchanged)
- 'T' (transpose)
- 'C' (hermitian conjugate / transpose+complex conjugate)

Thus for -fexternal-blas these extra options should be used, if possible.
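
To make the three option values concrete, here is a minimal sketch of what
C = op(A) * op(B) means for each flag. It is a plain reference loop written
for illustration, not the BLAS interface: the name ref_gemm and its argument
list are assumptions, and a real *gemm routine also takes ALPHA and BETA
scaling factors that are omitted here. Matrices are stored column-major, as in
Fortran.

```c
#include <complex.h>
#include <stddef.h>

/* Element (i,j) of op(X) for a column-major matrix X with leading
   dimension ld.  trans: 'N' = X, 'T' = transpose(X),
   'C' = conjg(transpose(X)). */
static double complex op_elem(char trans, const double complex *x, size_t ld,
                              size_t i, size_t j)
{
    switch (trans) {
    case 'T': return x[j + i * ld];
    case 'C': return conj(x[j + i * ld]);
    default:  return x[i + j * ld];
    }
}

/* Hypothetical reference routine (NOT the BLAS API): C = op(A) * op(B),
   where op(A) is m x k, op(B) is k x n and C is m x n. */
void ref_gemm(char transa, char transb, size_t m, size_t n, size_t k,
              const double complex *a, size_t lda,
              const double complex *b, size_t ldb,
              double complex *c, size_t ldc)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++) {
            double complex sum = 0.0;
            for (size_t p = 0; p < k; p++)
                sum += op_elem(transa, a, lda, i, p)
                     * op_elem(transb, b, ldb, p, j);
            c[i + j * ldc] = sum;
        }
}
```

With this sketch, matmul(conjg(transpose(A)),B) corresponds to calling
ref_gemm with transa = 'C' and transb = 'N' on the untouched A, so no
temporary for the conjugated transpose is needed.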


-- 
           Summary: Optimize -fexternal-blas calls for transpose()/conj()
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tobias dot burnus at physik dot fu-berlin dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conj()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
@ 2006-10-22 18:45 ` fxcoudert at gcc dot gnu dot org
  2006-10-22 23:15 ` [Bug fortran/29550] Optimize -fexternal-blas calls for conjg() fxcoudert at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2006-10-22 18:45 UTC
  To: gcc-bugs



------- Comment #1 from fxcoudert at gcc dot gnu dot org  2006-10-22 18:45 -------
The current code already recognizes matrix transposition and gives the BLAS gemm
functions the right TRANSA and TRANSB arguments in this case.

Confirmed for CONJG, which we don't currently handle.


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
      Known to fail|                            |4.3.0
   Last reconfirmed|0000-00-00 00:00:00         |2006-10-22 18:45:29
               date|                            |
            Summary|Optimize -fexternal-blas    |Optimize -fexternal-blas
                   |calls for transpose()/conj()|calls for conj()


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
  2006-10-22 18:45 ` [Bug fortran/29550] Optimize -fexternal-blas calls for conj() fxcoudert at gcc dot gnu dot org
@ 2006-10-22 23:15 ` fxcoudert at gcc dot gnu dot org
  2006-10-23 21:56 ` fxcoudert at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2006-10-22 23:15 UTC
  To: gcc-bugs



------- Comment #2 from fxcoudert at gcc dot gnu dot org  2006-10-22 23:14 -------
I've been thinking a bit about this. It's a common case, and it would probably
be worth optimizing it.

We could detect in iresolve.c (gfc_resolve_matmul) that one (or both) of the
arguments to MATMUL is a call to CONJG, and then rewrite the code to be
MATMUL(A,B,2) instead of MATMUL(A,CONJG(B)), where the 2 is an extra "hidden"
integer argument that means here that the second MATMUL arg is to be conjugated
during the matrix multiplication.

After that, we could also detect in iresolve.c (gfc_resolve_conjg) when the
result of a matmul call is conjugated (C = CONJG(MATMUL(A,B))) and optimize
this as well.

Notes for a wannabe coder: an argument a (or b) that is a call to the conjg
function can be identified by (a->expr_type == EXPR_FUNCTION &&
a->value.function.isym->generic_id == GFC_ISYM_CONJG). Rewriting the
expressions might be a bit subtle, but not so hard; for the extra argument, see
how the rrspacing and spacing intrinsics are implemented.
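
The detection and rewrite step described above can be sketched roughly as
follows. This is only an illustration on simplified stand-in types: every
mock_* name is hypothetical and much smaller than the real gfortran
structures, and the flag encoding (bit 0 for the first operand, bit 1 for the
second) is an assumption chosen so that 2 means "second argument conjugated",
as in the MATMUL(A,B,2) example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the front end's expression types; the real
   gfortran structures carry many more fields. */
enum mock_expr_type { MOCK_EXPR_VARIABLE, MOCK_EXPR_FUNCTION };
enum mock_isym_id { MOCK_ISYM_NONE, MOCK_ISYM_CONJG, MOCK_ISYM_TRANSPOSE };

struct mock_isym { enum mock_isym_id generic_id; };

struct mock_expr {
    enum mock_expr_type expr_type;
    const struct mock_isym *isym;   /* NULL unless an intrinsic call */
    struct mock_expr *arg;          /* single actual argument, if any */
};

/* True when e is a call to the CONJG intrinsic. */
bool is_conjg_call(const struct mock_expr *e)
{
    return e != NULL
        && e->expr_type == MOCK_EXPR_FUNCTION
        && e->isym != NULL
        && e->isym->generic_id == MOCK_ISYM_CONJG;
}

/* Replace CONJG(x) operands by x and return the hidden-argument value:
   bit 0 set = first operand conjugated, bit 1 set = second operand
   conjugated (assumed encoding). */
int strip_conjg(struct mock_expr **a, struct mock_expr **b)
{
    int flags = 0;
    if (is_conjg_call(*a)) { *a = (*a)->arg; flags |= 1; }
    if (is_conjg_call(*b)) { *b = (*b)->arg; flags |= 2; }
    return flags;
}
```

For MATMUL(A,CONJG(B)) this returns 2 and leaves the operands as A and B,
which is the shape a specialized library entry point would then receive.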


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fxcoudert at gcc dot gnu dot
                   |                            |org
   Last reconfirmed|2006-10-22 18:45:29         |2006-10-22 23:14:54
               date|                            |
            Summary|Optimize -fexternal-blas    |Optimize -fexternal-blas
                   |calls for conj()            |calls for conjg()


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
  2006-10-22 18:45 ` [Bug fortran/29550] Optimize -fexternal-blas calls for conj() fxcoudert at gcc dot gnu dot org
  2006-10-22 23:15 ` [Bug fortran/29550] Optimize -fexternal-blas calls for conjg() fxcoudert at gcc dot gnu dot org
@ 2006-10-23 21:56 ` fxcoudert at gcc dot gnu dot org
  2006-10-29 22:12 ` fxcoudert at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2006-10-23 21:56 UTC
  To: gcc-bugs



------- Comment #3 from fxcoudert at gcc dot gnu dot org  2006-10-23 21:56 -------
(In reply to comment #2)
> We could detect in iresolve.c (gfc_resolve_matmul) that one (or both) of the
> arguments to MATMUL is a call to CONJG, and then rewrite the code to be
> MATMUL(A,B,2) instead of MATMUL(A,CONJG(B)), where the 2 is an extra "hidden"
> integer argument that means here that the second MATMUL arg is to be conjugated
> during the matrix multiplication.

I've created a patch along these lines, and it was too slow. I'm now writing
special functions for every possible case.


-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |fxcoudert at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2006-10-22 23:14:54         |2006-10-23 21:56:37
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
                   ` (2 preceding siblings ...)
  2006-10-23 21:56 ` fxcoudert at gcc dot gnu dot org
@ 2006-10-29 22:12 ` fxcoudert at gcc dot gnu dot org
  2006-11-04 14:18 ` jb at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2006-10-29 22:12 UTC
  To: gcc-bugs



-- 

fxcoudert at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|fxcoudert at gcc dot gnu dot|unassigned at gcc dot gnu
                   |org                         |dot org
             Status|ASSIGNED                    |NEW


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
                   ` (3 preceding siblings ...)
  2006-10-29 22:12 ` fxcoudert at gcc dot gnu dot org
@ 2006-11-04 14:18 ` jb at gcc dot gnu dot org
  2006-11-04 18:22 ` fxcoudert at gcc dot gnu dot org
  2010-09-13 18:54 ` tkoenig at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: jb at gcc dot gnu dot org @ 2006-11-04 14:18 UTC
  To: gcc-bugs



------- Comment #4 from jb at gcc dot gnu dot org  2006-11-04 14:18 -------
Can't you use the same trick that the frontend already uses to detect the
matmul(transpose(a),b) thing?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
                   ` (4 preceding siblings ...)
  2006-11-04 14:18 ` jb at gcc dot gnu dot org
@ 2006-11-04 18:22 ` fxcoudert at gcc dot gnu dot org
  2010-09-13 18:54 ` tkoenig at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: fxcoudert at gcc dot gnu dot org @ 2006-11-04 18:22 UTC
  To: gcc-bugs



------- Comment #5 from fxcoudert at gcc dot gnu dot org  2006-11-04 18:21 -------
(In reply to comment #4)
> Can't you use the same trick that the frontend already uses to detect the
> matmul(transpose(a),b) thing?

The front-end doesn't "detect" the case you quote. It's only that transposition
is done by creating a new array descriptor, pointing to the same data but with
swapped strides; matmul being able to operate on all possible strides, it is
indeed optimal.
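
As a rough illustration of the descriptor trick, the sketch below builds a
transposed view of a rank-2 array without copying any data. The struct view2d
type and both function names are hypothetical; the real gfortran array
descriptor holds more information (bounds, offset, element type) than this
simplified stand-in.

```c
#include <stddef.h>

/* Hypothetical rank-2 descriptor: a data pointer plus an extent and a
   stride (in elements) for each dimension. */
struct view2d {
    double *data;
    size_t extent[2];
    ptrdiff_t stride[2];
};

/* Transposed view: same data pointer, extents and strides exchanged
   between the two dimensions.  No element is moved. */
struct view2d transpose_view(struct view2d v)
{
    struct view2d t = v;
    t.extent[0] = v.extent[1];
    t.extent[1] = v.extent[0];
    t.stride[0] = v.stride[1];
    t.stride[1] = v.stride[0];
    return t;
}

/* Element (i,j) of the view, zero-based. */
double view_get(struct view2d v, size_t i, size_t j)
{
    return v.data[(ptrdiff_t)i * v.stride[0] + (ptrdiff_t)j * v.stride[1]];
}
```

A routine that indexes every operand through its strides, as view_get does,
handles the transposed view with no extra work. Conjugation has no such
representation, because it changes the element values rather than the
addressing.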

For conjugation, it's a bit more difficult. We really need the front-end to
recognize if one argument of matmul is conjugated, and simplify the code itself
by 1) removing the conjugation and 2) emitting a different matmul call. The
fact that there is no BLAS routine for matmul(conj(a),b) but only
matmul(transpose(conj(a)),b) makes it a bit more difficult.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



* [Bug fortran/29550] Optimize -fexternal-blas calls for conjg()
  2006-10-22 16:26 [Bug fortran/29550] New: Optimize -fexternal-blas calls for transpose()/conj() tobias dot burnus at physik dot fu-berlin dot de
                   ` (5 preceding siblings ...)
  2006-11-04 18:22 ` fxcoudert at gcc dot gnu dot org
@ 2010-09-13 18:54 ` tkoenig at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: tkoenig at gcc dot gnu dot org @ 2010-09-13 18:54 UTC
  To: gcc-bugs



------- Comment #6 from tkoenig at gcc dot gnu dot org  2010-09-13 18:53 -------
Sounds like something for front end optimization.

Should we maybe generate the BLAS calls directly, instead of jumping
through the library functions?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29550



