public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE
From: spop at gcc dot gnu dot org @ 2007-08-30 2:55 UTC
To: gcc-bugs
The following loop, which shows up among the top time consumers in
capacita.f90, is not vectorized because the loop latch block is non-empty:
./capacita.f90:51: note: ===== analyze_loop_nest =====
./capacita.f90:51: note: === vect_analyze_loop_form ===
./capacita.f90:51: note: not vectorized: unexpected loop form.
./capacita.f90:51: note: bad loop form.
./capacita.f90:9: note: vectorized 0 loops in function.
The latch block contains the following code, which comes from the
partial redundancy elimination (PRE) pass:
bb_14 (preds = {bb_13 }, succs = {bb_13 })
{
<bb 14>:
# VUSE <SFT.109_593> { SFT.109 }
pretmp.166_821 = g.dim[1].stride;
goto <bb 13>;
}
Now, if I disable PRE with -fno-tree-pre, I get another problem in
the data dependence analysis:
base_address: &d1
offset from base address: 0
constant offset from base address: 0
step: 0
aligned to: 128
base_object: d1
symbol tag: d1
FAILED as dr address is invariant
/home/seb/ex/capacita.f90:46: note: not vectorized: unhandled data-ref
/home/seb/ex/capacita.f90:46: note: bad data references.
/home/seb/ex/capacita.f90:4: note: vectorized 0 loops in function.
This failure corresponds to the following code in tree-data-ref.c:
      /* FIXME -- data dependence analysis does not work correctly for objects
         with invariant addresses.  Let us fail here until the problem is
         fixed.  */
      if (dr_address_invariant_p (dr))
        {
          free_data_ref (dr);
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "\tFAILED as dr address is invariant\n");
          ret = false;
          break;
        }
The failure is due to the following statement:
# VUSE <d1_143> { d1 }
d1.33_86 = d1;
So here the data reference is a read of d1, which has the following tree:
arg 1 <var_decl 0xb7be01cc d1 type <real_type 0xb7b4eaf8 real4>
addressable used public static SF file /home/seb/ex/capacita.f90 line
11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
align 32
chain <var_decl 0xb7be0170 d2 type <real_type 0xb7b4eaf8 real4>
addressable used public static SF file /home/seb/ex/capacita.f90
line 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
align 32 chain <var_decl 0xb7be0114 eps0>>>
I don't really know how this could be handled as a data reference,
because that statement has a VUSE but the type of d1 is scalar.
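To illustrate what "step: 0" in the dump above means, here is a minimal
C sketch. It is not GCC code: the names d1_scalar, g_array, array_step and
scalar_step are hypothetical and only stand in for the module variables.
It measures the byte distance between the addresses read in two
consecutive iterations, which is the element size for an array access
and zero for a read of a scalar global.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Fortran module variables. */
static float d1_scalar = 0.5f;
static float g_array[8];

/* Byte distance between the addresses accessed in iterations i and
   i + 1 for an array access g_array[i].  Valid for i < 8.  */
ptrdiff_t array_step(size_t i)
{
    return (const char *)&g_array[i + 1] - (const char *)&g_array[i];
}

/* The same measurement for a read of the scalar: the address does not
   depend on i, so the step is zero.  */
ptrdiff_t scalar_step(size_t i)
{
    (void)i; /* the iteration number plays no role in the address */
    return (const char *)&d1_scalar - (const char *)&d1_scalar;
}
```

Under these assumptions array_step returns sizeof(float) and scalar_step
returns 0, which corresponds to the "invariant address" case that the
check in tree-data-ref.c rejects.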
A reduced testcase looks like this:
module solv_cap
  implicit none
  public :: init_solve
  integer, parameter, public :: dp = selected_real_kind(5)
  real(kind=dp), private :: Pi, Mu0, c0, eps0
  logical, private :: UseFFT, UsePreco
  real(kind=dp), private :: D1, D2
  integer, private, save :: Ng1=0, Ng2=0
  integer, private, pointer, dimension(:,:) :: Grid
  real(kind=dp), private, allocatable, dimension(:,:) :: G
contains
  subroutine init_solve(Grid_in, GrSize1, GrSize2, UseFFT_in, UsePreco_in)
    integer, intent(in), target, dimension(:,:) :: Grid_in
    real(kind=dp), intent(in) :: GrSize1, GrSize2
    logical, intent(in) :: UseFFT_in, UsePreco_in
    integer :: i, j
    Pi = acos(-1.0_dp)
    Mu0 = 4e-7_dp * Pi
    c0 = 299792458
    eps0 = 1 / (Mu0 * c0**2)
    UseFFT = UseFFT_in
    UsePreco = UsePreco_in
    if(Ng1 /= 0 .and. allocated(G) ) then
      deallocate( G )
    end if
    Grid => Grid_in
    Ng1 = size(Grid, 1)
    Ng2 = size(Grid, 2)
    D1 = GrSize1/Ng1
    D2 = GrSize2/Ng2
    allocate( G(0:Ng1,0:Ng2) )
    write(unit=*, fmt=*) "Calculating G"
    do i=0,Ng1
      do j=0,Ng2
        G(i,j) = Ginteg( -D1/2,-D2/2, D1/2,D2/2, i*D1,j*D2 )
      end do
    end do
    if(UseFFT) then
      write(unit=*, fmt=*) "Transforming G"
      call FourirG(G,1)
    end if
    return
  end subroutine init_solve
  function Ginteg(xq1,yq1, xq2,yq2, xp,yp) result(G)
    real(kind=dp), intent(in) :: xq1,yq1, xq2,yq2, xp,yp
    real(kind=dp) :: G
    real(kind=dp) :: x1,x2,y1,y2,t
    x1 = xq1-xp
    x2 = xq2-xp
    y1 = yq1-yp
    y2 = yq2-yp
    if (x1+x2 < 0) then
      t = -x1
      x1 = -x2
      x2 = t
    end if
    if (y1+y2 < 0) then
      t = -y1
      y1 = -y2
      y2 = t
    end if
    G = Vprim(x2,y2)-Vprim(x1,y2)-Vprim(x2,y1)+Vprim(x1,y1)
    return
  end function Ginteg
  function Vprim(x,y) result(VP)
    real(kind=dp), intent(in) :: x,y
    real(kind=dp) :: VP
    real(kind=dp) :: r
    r = sqrt(x**2+y**2)
    VP = y*log(x+r) + x*log(y+r)
    return
  end function Vprim
end module solv_cap
--
Summary: Missed opportunities for vectorization due to PRE
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: spop at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244
* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
From: dberlin at dberlin dot org @ 2007-08-30 15:24 UTC
To: gcc-bugs
------- Comment #1 from dberlin at gcc dot gnu dot org 2007-08-30 15:24 -------
Subject: Re: New: Missed opportunities for vectorization due to PRE
On 30 Aug 2007 02:55:17 -0000, spop at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
> The following loop showing up in the top time users in capacita.f90 is
> not vectorized because the loop latch block is non empty:
>
> ./capacita.f90:51: note: ===== analyze_loop_nest =====
> ./capacita.f90:51: note: === vect_analyze_loop_form ===
> ./capacita.f90:51: note: not vectorized: unexpected loop form.
> ./capacita.f90:51: note: bad loop form.
> ./capacita.f90:9: note: vectorized 0 loops in function.
>
> This block contains the following code that comes from the
> partial redundancy elimination pass:
>
> bb_14 (preds = {bb_13 }, succs = {bb_13 })
> {
> <bb 14>:
> # VUSE <SFT.109_593> { SFT.109 }
> pretmp.166_821 = g.dim[1].stride;
> goto <bb 13>;
>
> }
>
Here PRE is just doing invariant hoisting. If PRE didn't do it, something
else (LIM) would.
> Now, if I disable the PRE with -fno-tree-pre, I get another problem on
> the data dependence analysis:
>
> base_address: &d1
> offset from base address: 0
> constant offset from base address: 0
> step: 0
> aligned to: 128
> base_object: d1
> symbol tag: d1
> FAILED as dr address is invariant
>
> /home/seb/ex/capacita.f90:46: note: not vectorized: unhandled data-ref
> /home/seb/ex/capacita.f90:46: note: bad data references.
> /home/seb/ex/capacita.f90:4: note: vectorized 0 loops in function.
>
> This fail corresponds to the following code in tree-data-ref.c
>
> /* FIXME -- data dependence analysis does not work correctly for objects
> with
> invariant addresses. Let us fail here until the problem is fixed. */
> if (dr_address_invariant_p (dr))
> {
> free_data_ref (dr);
> if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, "\tFAILED as dr address is invariant\n");
> ret = false;
> break;
> }
>
> Due to the following statement:
>
> # VUSE <d1_143> { d1 }
> d1.33_86 = d1;
>
> So here the data reference is for d1 that is a read with the following tree:
>
> arg 1 <var_decl 0xb7be01cc d1 type <real_type 0xb7b4eaf8 real4>
> addressable used public static SF file /home/seb/ex/capacita.f90 line
> 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
> align 32
> chain <var_decl 0xb7be0170 d2 type <real_type 0xb7b4eaf8 real4>
> addressable used public static SF file /home/seb/ex/capacita.f90
> line 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
> align 32 chain <var_decl 0xb7be0114 eps0>>>
>
> I don't really know how this could be handled as a data reference,
> because that statement has a VUSE but the type of d1 is scalar.
Yes, but it is a global, and should be looked at as any other load is.
:)
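As a rough illustration of the two points above (invariant hoisting, and
treating the read of a global like any other load), here is a minimal C
sketch. It is not taken from GCC or from capacita.f90; the global d1 and
the functions scale_reload and scale_hoisted are hypothetical. The first
function re-reads the global in every iteration, the second reads it
once before the loop, which is what PRE or LIM would do when they can
prove that no store in the loop modifies it.

```c
#include <stddef.h>

/* Hypothetical global standing in for the module variable D1. */
float d1 = 0.25f;

/* The global is loaded again in every iteration. */
void scale_reload(float *out, const float *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * d1;
}

/* The load is hoisted by hand.  This is only equivalent when no store
   in the loop can modify d1, i.e. out must not alias &d1.  */
void scale_hoisted(float *out, const float *in, size_t n)
{
    const float d = d1; /* single load before the loop */
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * d;
}
```

Both functions produce the same result as long as the output array does
not overlap the global.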
* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
From: matz at gcc dot gnu dot org @ 2010-09-07 14:47 UTC
To: gcc-bugs
------- Comment #2 from matz at gcc dot gnu dot org 2010-09-07 14:41 -------
Since the fix for PR44710 we can if-convert the conditions in the inner loop.
With http://gcc.gnu.org/ml/gcc-patches/2010-09/msg00542.html we also
make sure that the latch block isn't filled, which in turn triggers
the if-conversion. This then reveals the rest of the problems, which are:
* inlining needs to happen (our default parameters don't inline ginteg).
  The patch above ensures this by making the functions internal.
* a library with vectorized logf needs to be available (libacml_mv for
  instance).
  The patch above works around this by getting rid of the calls to log/sqrt.
* loop interchange needs to happen, because in the original testcase
  we have:
    do i=0,Ng1
      do j=0,Ng2
        G(i,j) = ...
  which is exactly the wrong way around. Our loop-interchange code is only
  capable of handling perfect nests, which don't exist here
  because LIM and PRE move some loop-invariant expressions out of the
  inner loop into the outer loop. If we weren't doing that, that by itself
  would already prevent vectorization.
  The patch above works around this by doing the interchange by hand.
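The hand interchange mentioned in the last item can be sketched in C as
follows. This is an illustration under stated assumptions, not the code
from the patch: the helper idx and the functions fill_i_outer and
fill_j_outer are hypothetical, the array is stored column-major as
Fortran stores G(i,j), and the expression i*d1 + j*d2 is only a
placeholder for the call to Ginteg.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major index, as Fortran lays out G(i,j): i varies fastest. */
static size_t idx(size_t i, size_t j, size_t rows)
{
    return i + j * rows;
}

/* Original order: the inner loop over j strides by `rows` elements. */
void fill_i_outer(float *g, size_t rows, size_t cols, float d1, float d2)
{
    assert(g != NULL);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            g[idx(i, j, rows)] = (float)i * d1 + (float)j * d2;
}

/* Interchanged order: the inner loop over i touches consecutive
   elements, which is the access pattern the vectorizer wants.  */
void fill_j_outer(float *g, size_t rows, size_t cols, float d1, float d2)
{
    assert(g != NULL);
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            g[idx(i, j, rows)] = (float)i * d1 + (float)j * d2;
}
```

Both orders write the same values because each element depends only on
its own (i, j); only the memory access pattern of the inner loop differs.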
* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
From: matz at gcc dot gnu dot org @ 2010-09-08 12:35 UTC
To: gcc-bugs
------- Comment #3 from matz at gcc dot gnu dot org 2010-09-08 12:35 -------
Subject: Bug 33244
Author: matz
Date: Wed Sep 8 12:34:52 2010
New Revision: 163998
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=163998
Log:
PR tree-optimization/33244
* tree-ssa-sink.c (statement_sink_location): Don't sink into
empty loop latches.
testsuite/
PR tree-optimization/33244
* gfortran.dg/vect/fast-math-vect-8.f90: New test.
Added:
trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-8.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-sink.c