public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/33244]  New: Missed opportunities for vectorization due to PRE
@ 2007-08-30  2:55 spop at gcc dot gnu dot org
  2007-08-30 15:24 ` [Bug tree-optimization/33244] " dberlin at dberlin dot org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: spop at gcc dot gnu dot org @ 2007-08-30  2:55 UTC (permalink / raw)
  To: gcc-bugs

The following loop showing up in the top time users in capacita.f90 is
not vectorized because the loop latch block is non empty:

./capacita.f90:51: note: ===== analyze_loop_nest =====
./capacita.f90:51: note: === vect_analyze_loop_form ===
./capacita.f90:51: note: not vectorized: unexpected loop form.
./capacita.f90:51: note: bad loop form.
./capacita.f90:9: note: vectorized 0 loops in function.

This block contains the following code that comes from the
partial redundancy elimination pass:

      bb_14 (preds = {bb_13 }, succs = {bb_13 })
      {
      <bb 14>:
        # VUSE <SFT.109_593> { SFT.109 }
        pretmp.166_821 = g.dim[1].stride;
        goto <bb 13>;

      }

Now, if I disable the PRE with -fno-tree-pre, I get another problem on
the data dependence analysis:

        base_address: &d1
        offset from base address: 0
        constant offset from base address: 0
        step: 0
        aligned to: 128
        base_object: d1
        symbol tag: d1
        FAILED as dr address is invariant

/home/seb/ex/capacita.f90:46: note: not vectorized: unhandled data-ref 
/home/seb/ex/capacita.f90:46: note: bad data references.
/home/seb/ex/capacita.f90:4: note: vectorized 0 loops in function.

This fail corresponds to the following code in tree-data-ref.c

      /* FIXME -- data dependence analysis does not work correctly for objects
with
         invariant addresses.  Let us fail here until the problem is fixed.  */
      if (dr_address_invariant_p (dr))
        {
          free_data_ref (dr);
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "\tFAILED as dr address is invariant\n");
          ret = false;
          break;
        }

Due to the following statement:

# VUSE <d1_143> { d1 }
d1.33_86 = d1;

So here the data reference is for d1 that is a read with the following tree:

    arg 1 <var_decl 0xb7be01cc d1 type <real_type 0xb7b4eaf8 real4>
        addressable used public static SF file /home/seb/ex/capacita.f90 line
11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
        align 32
        chain <var_decl 0xb7be0170 d2 type <real_type 0xb7b4eaf8 real4>
            addressable used public static SF file /home/seb/ex/capacita.f90
line 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
            align 32 chain <var_decl 0xb7be0114 eps0>>>

I don't really know how this could be handled as a data reference,
because that statement has a VUSE but the type of d1 is scalar.

A reduced testcase is like this:



module solv_cap

  implicit none

  public  :: init_solve

  integer, parameter, public :: dp = selected_real_kind(5)

  real(kind=dp), private :: Pi, Mu0, c0, eps0
  logical,       private :: UseFFT, UsePreco
  real(kind=dp), private :: D1, D2
  integer,       private, save :: Ng1=0, Ng2=0
  integer,       private, pointer,     dimension(:,:)  :: Grid
  real(kind=dp), private, allocatable, dimension(:,:)  :: G

contains

  subroutine init_solve(Grid_in, GrSize1, GrSize2, UseFFT_in, UsePreco_in)
    integer, intent(in), target, dimension(:,:) :: Grid_in
    real(kind=dp), intent(in)  :: GrSize1, GrSize2
    logical,       intent(in)  :: UseFFT_in, UsePreco_in
    integer                    :: i, j

    Pi = acos(-1.0_dp)
    Mu0 = 4e-7_dp * Pi
    c0 = 299792458
    eps0 = 1 / (Mu0 * c0**2)

    UseFFT = UseFFT_in
    UsePreco = UsePreco_in

    if(Ng1 /= 0 .and. allocated(G) ) then
      deallocate( G )
    end if

    Grid => Grid_in
    Ng1 = size(Grid, 1)
    Ng2 = size(Grid, 2)
    D1 = GrSize1/Ng1
    D2 = GrSize2/Ng2

    allocate( G(0:Ng1,0:Ng2) )

    write(unit=*, fmt=*) "Calculating G"
    do i=0,Ng1
      do j=0,Ng2
        G(i,j) = Ginteg( -D1/2,-D2/2, D1/2,D2/2, i*D1,j*D2 )
      end do
    end do

    if(UseFFT) then
      write(unit=*, fmt=*) "Transforming G"
      call FourirG(G,1)
    end if

    return
  end subroutine init_solve


  function Ginteg(xq1,yq1, xq2,yq2, xp,yp)  result(G)
    real(kind=dp), intent(in) :: xq1,yq1, xq2,yq2, xp,yp
    real(kind=dp)             :: G
    real(kind=dp)             :: x1,x2,y1,y2,t
    x1 = xq1-xp
    x2 = xq2-xp
    y1 = yq1-yp
    y2 = yq2-yp

    if (x1+x2 < 0) then
      t = -x1
      x1 = -x2
      x2 = t
    end if
    if (y1+y2 < 0) then
      t = -y1
      y1 = -y2
      y2 = t
    end if

    G = Vprim(x2,y2)-Vprim(x1,y2)-Vprim(x2,y1)+Vprim(x1,y1)

    return
  end function Ginteg


  function Vprim(x,y)  result(VP)
    real(kind=dp), intent(in) :: x,y
    real(kind=dp)             :: VP
    real(kind=dp)             :: r

    r = sqrt(x**2+y**2)
    VP = y*log(x+r) + x*log(y+r)

    return
  end function Vprim


end module solv_cap


-- 
           Summary: Missed opportunities for vectorization due to PRE
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: spop at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
  2007-08-30  2:55 [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE spop at gcc dot gnu dot org
@ 2007-08-30 15:24 ` dberlin at dberlin dot org
  2010-09-07 14:47 ` matz at gcc dot gnu dot org
  2010-09-08 12:35 ` matz at gcc dot gnu dot org
  2 siblings, 0 replies; 4+ messages in thread
From: dberlin at dberlin dot org @ 2007-08-30 15:24 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from dberlin at gcc dot gnu dot org  2007-08-30 15:24 -------
Subject: Re:  New: Missed opportunities for vectorization due to PRE

On 30 Aug 2007 02:55:17 -0000, spop at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
> The following loop showing up in the top time users in capacita.f90 is
> not vectorized because the loop latch block is non empty:
>
> ./capacita.f90:51: note: ===== analyze_loop_nest =====
> ./capacita.f90:51: note: === vect_analyze_loop_form ===
> ./capacita.f90:51: note: not vectorized: unexpected loop form.
> ./capacita.f90:51: note: bad loop form.
> ./capacita.f90:9: note: vectorized 0 loops in function.
>
> This block contains the following code that comes from the
> partial redundancy elimination pass:
>
>       bb_14 (preds = {bb_13 }, succs = {bb_13 })
>       {
>       <bb 14>:
>         # VUSE <SFT.109_593> { SFT.109 }
>         pretmp.166_821 = g.dim[1].stride;
>         goto <bb 13>;
>
>       }
>

PRE is just invariant hoisting.  If we didn't, something else would (LIM).


> Now, if I disable the PRE with -fno-tree-pre, I get another problem on
> the data dependence analysis:
>
>         base_address: &d1
>         offset from base address: 0
>         constant offset from base address: 0
>         step: 0
>         aligned to: 128
>         base_object: d1
>         symbol tag: d1
>         FAILED as dr address is invariant
>
> /home/seb/ex/capacita.f90:46: note: not vectorized: unhandled data-ref
> /home/seb/ex/capacita.f90:46: note: bad data references.
> /home/seb/ex/capacita.f90:4: note: vectorized 0 loops in function.
>
> This fail corresponds to the following code in tree-data-ref.c
>
>       /* FIXME -- data dependence analysis does not work correctly for objects
> with
>          invariant addresses.  Let us fail here until the problem is fixed.  */
>       if (dr_address_invariant_p (dr))
>         {
>           free_data_ref (dr);
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, "\tFAILED as dr address is invariant\n");
>           ret = false;
>           break;
>         }
>
> Due to the following statement:
>
> # VUSE <d1_143> { d1 }
> d1.33_86 = d1;
>
> So here the data reference is for d1 that is a read with the following tree:
>
>     arg 1 <var_decl 0xb7be01cc d1 type <real_type 0xb7b4eaf8 real4>
>         addressable used public static SF file /home/seb/ex/capacita.f90 line
> 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
>         align 32
>         chain <var_decl 0xb7be0170 d2 type <real_type 0xb7b4eaf8 real4>
>             addressable used public static SF file /home/seb/ex/capacita.f90
> line 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
>             align 32 chain <var_decl 0xb7be0114 eps0>>>
>
> I don't really know how this could be handled as a data reference,
> because that statement has a VUSE but the type of d1 is scalar.

Yes, but it is a global, and should be looked at as any other load is.
:)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
  2007-08-30  2:55 [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE spop at gcc dot gnu dot org
  2007-08-30 15:24 ` [Bug tree-optimization/33244] " dberlin at dberlin dot org
@ 2010-09-07 14:47 ` matz at gcc dot gnu dot org
  2010-09-08 12:35 ` matz at gcc dot gnu dot org
  2 siblings, 0 replies; 4+ messages in thread
From: matz at gcc dot gnu dot org @ 2010-09-07 14:47 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from matz at gcc dot gnu dot org  2010-09-07 14:41 -------
Since the fix for PR44710 we can if-convert the conditions in the inner loop.
With http://gcc.gnu.org/ml/gcc-patches/2010-09/msg00542.html we also
make sure that the latch block isn't filled, which in turn then triggers
the if-conversion.  This then reveals the rest of the problems, which are:

  * inlining needs to happen (our default parameters don't inline ginteg)
    The patch above ensures this by making the functions internal
  * a library with vectorized logf needs to be available (libacml_mv for
    instance)
    The patch above works around this by getting rid of calls to log/sqrt
  * loop interchange needs to happen, because in the original testcase
    we have:
      do i=0,Ng1
        do j=0,Ng2
          G(i,j) = ...
    exactly the wrong way around.  Our loop-interchange code is only
    capable of vectorizing perfect nests, which here doesn't exist
    as LIM and PRE move out some loop invariant expressions from the
    inner to the outer loop.  If we weren't doing that, that itself would
    already prevent vectorization.
    The patch above works around this by doing the interchange by hand.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/33244] Missed opportunities for vectorization due to PRE
  2007-08-30  2:55 [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE spop at gcc dot gnu dot org
  2007-08-30 15:24 ` [Bug tree-optimization/33244] " dberlin at dberlin dot org
  2010-09-07 14:47 ` matz at gcc dot gnu dot org
@ 2010-09-08 12:35 ` matz at gcc dot gnu dot org
  2 siblings, 0 replies; 4+ messages in thread
From: matz at gcc dot gnu dot org @ 2010-09-08 12:35 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from matz at gcc dot gnu dot org  2010-09-08 12:35 -------
Subject: Bug 33244

Author: matz
Date: Wed Sep  8 12:34:52 2010
New Revision: 163998

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=163998
Log:
        PR tree-optimization/33244
        * tree-ssa-sink.c (statement_sink_location): Don't sink into
        empty loop latches.

testsuite/
        PR tree-optimization/33244
        * gfortran.dg/vect/fast-math-vect-8.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/fast-math-vect-8.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-sink.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-09-08 12:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-30  2:55 [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE spop at gcc dot gnu dot org
2007-08-30 15:24 ` [Bug tree-optimization/33244] " dberlin at dberlin dot org
2010-09-07 14:47 ` matz at gcc dot gnu dot org
2010-09-08 12:35 ` matz at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).