public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Razya Ladelsky <RAZYA@il.ibm.com>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, Jakub Jelinek <jakub@redhat.com>
Subject: [PATCH, take 2] Fix PR tree-optimization/49960 ,Fix self data dependence
Date: Tue, 15 Nov 2011 15:13:00 -0000	[thread overview]
Message-ID: <OFE59B30C0.27072E7A-ONC2257949.0035FEEA-C2257949.0039D985@il.ibm.com> (raw)
In-Reply-To: <CAFiYyc0D8n6BU-CLqB132FBuxOqJdmiRBDTiZ6Spf8P0J5rbJg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2197 bytes --]

> > I hope it's clearer now, I will add a comment to the code, and submit 
it
> > before committing it.
> 
> No, it's not clearer, because it is not clear why you need to add the 
hack
> instead of avoiding the 2nd access function. And iff you add the hack it
> needs a comment why zero should be special (any other constant would
> be the same I suppose).
> 
> Btw, your fortran example does not compile and I don't believe the issue
> is still present after my last changes to dr_analyze_indices.  So, did
> you verify this on trunk?
> 
> Richard.

This patch fixes the failures described in 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49960
It also fixes bzips when run with autopar enabled.

In both cases the self dependences are not handled correctly.
In the first case, a non affine access is analyzed:
in the second, the distance vector is not calculated correctly (the 
distance vector considered for for self dependences is always (0,0,...))
As  a result, the loops get wrongfully parallelized.

I modified the previous patch according to the last changes in the trunk,
which indeed do not requite special handling for the 2nd access function 
(as mentioned by Richard).
Another change is that the previous version of the patch eliminated 
compute_self_dependences function
as the calls to it were redundant, while this version considers the new 
call to compute_self_dependences from the vect code for gather (inserted 
lately by Jakub).

ChangeLog:
        PR tree-optimization/49960

        * tree-data-ref.c (initialize_data_dependence_relation): Add 
initializations. 
        Remove call to compute_self_dependence.
        (compute_affine_dependence): Remove the !DDR_SELF_REFERENCE 
condition.
        (compute_self_dependence): Remove old code. Add call to 
compute_affine_dependence.
        (compute_all_dependences): Remove call to compute_self_dependence. 

        Add call to compute_affine_dependence.
 
        testsuite/ChangeLog:
        PR tree-optimization/49960

        * gcc.dg/autopar/pr49960.c: New test.
        * gcc.dg/autopar/pr49960-1.c: New test.







Bootstrap and testsuite pass successfully for ppc64-redhat-linux.

OK for trunk?
Thank you,
Razya


[-- Attachment #2: pr49960.c --]
[-- Type: application/octet-stream, Size: 1276 bytes --]

/* { dg-do compile } */
/* { dg-options "-O2 -ftree-parallelize-loops=4 -fdump-tree-parloops-details -fdump-tree-optimized" } */

#include <stdio.h>
#define MB 100
#define NA 450
#define MA 400

int T[MA][MB],A[MA][NA],B[MB][NA];
void MRTRBR(int MA_1, int NA_1, int MB_1)
{
  int i,j, t,k;

  /* The outer most loop is not parallel because for different k's there
     is write-write dependency for T[i][j].  */
  
  /* The two inner loops don't get parallelized due to low number of 
     iterations.  */

  for (k = 3; k < NA_1; k++)
    for (i = 3; i < MA_1; i++)
      for (j = 3; j < MB_1; j++)
	{
	  t = T[i][j];
	  T[i][j] = t+2+A[i][k]*B[j][k];
	}
}
void main ()
{
  int j,i;
  
  for (i = 3; i < MA; i++)
    for (j = 3; j < MB; j++)
      T[i][j] = (i>j?i:j);
  
  MRTRBR (MA,NA,MB);
  
  for (i = MA-1; i < MA; i++)
    for (j = MB-10; j < MB; j++)
      printf ("i %d j %d T[i][j] = %d\n",i,j,T[i][j]);
}


/* Check that the outer most loop doesn't get parallelized (thus no loop gets parallelized)  */

/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 0 "parloops" } } */
/* { dg-final { scan-tree-dump-times "loopfn" 0 "optimized" } } */
/* { dg-final { cleanup-tree-dump "parloops" } } */
/* { dg-final { cleanup-tree-dump "optimized" } } */

[-- Attachment #3: pr49960-1.c --]
[-- Type: application/octet-stream, Size: 984 bytes --]

/* { dg-do compile } */
/* { dg-options "-O2 -ftree-parallelize-loops=4 -fdump-tree-parloops-details -fdump-tree-optimized" } */

#include <stdlib.h>
#include <stdio.h>

int main() 
{
  unsigned int x, y, idx, H = 1024, W = 1024;
  
  int * tmps = (int *)malloc(H*W*sizeof(int));
  
  /* This loop gets parallelized even though output dependences exist 
     between writes to 'tmps' that prevent parallelization. 
     For example: tmps[1] = 1, ..., tmps[1] = 17.  */
  
  for(x = 1; x < H; x++) 
    {
      for(y = 1; y < W; y++) 
	{
	  idx = x*W+y;
	  tmps[idx % 4096] = idx;	  
	}
    }
  
  for(x = 1; x < 8; x++)
    printf("tmps[%d]=%d\n", x, tmps[x]);
  
  return 0;
}
/* Check that no loop gets parallelized.  */

/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 0 "parloops" } } */
/* { dg-final { scan-tree-dump-times "loopfn" 0 "optimized" } } */
/* { dg-final { cleanup-tree-dump "parloops" } } */
/* { dg-final { cleanup-tree-dump "optimized" } } */

[-- Attachment #4: self_dependence_patch.txt --]
[-- Type: text/plain, Size: 3461 bytes --]

Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	(revision 181168)
+++ gcc/tree-data-ref.c	(working copy)
@@ -1389,13 +1389,30 @@ initialize_data_dependence_relation (struct data_r
      the data dependence tests, just initialize the ddr and return.  */
   if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
     {
+     if (loop_nest
+        && !object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
+       					        DR_BASE_OBJECT (a)))
+      {
+        DDR_ARE_DEPENDENT (res) = chrec_dont_know;
+        return res;
+      }
       DDR_AFFINE_P (res) = true;
       DDR_ARE_DEPENDENT (res) = NULL_TREE;
       DDR_SUBSCRIPTS (res) = VEC_alloc (subscript_p, heap, DR_NUM_DIMENSIONS (a));
       DDR_LOOP_NEST (res) = loop_nest;
       DDR_INNER_LOOP (res) = 0;
       DDR_SELF_REFERENCE (res) = true;
-      compute_self_dependence (res);
+      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+       {
+         struct subscript *subscript;
+
+         subscript = XNEW (struct subscript);
+         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
+         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
+         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
+         SUB_DISTANCE (subscript) = chrec_dont_know;
+         VEC_safe_push (subscript_p, heap, DDR_SUBSCRIPTS (res), subscript);
+       }
       return res;
     }
 
@@ -4040,8 +4057,7 @@ compute_affine_dependence (struct data_dependence_
     }
 
   /* Analyze only when the dependence relation is not yet known.  */
-  if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE
-      && !DDR_SELF_REFERENCE (ddr))
+  if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
     {
       dependence_stats.num_dependence_tests++;
 
@@ -4122,31 +4138,11 @@ compute_affine_dependence (struct data_dependence_
 void
 compute_self_dependence (struct data_dependence_relation *ddr)
 {
-  unsigned int i;
-  struct subscript *subscript;
-
   if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE)
     return;
 
-  for (i = 0; VEC_iterate (subscript_p, DDR_SUBSCRIPTS (ddr), i, subscript);
-       i++)
-    {
-      if (SUB_CONFLICTS_IN_A (subscript))
-	free_conflict_function (SUB_CONFLICTS_IN_A (subscript));
-      if (SUB_CONFLICTS_IN_B (subscript))
-	free_conflict_function (SUB_CONFLICTS_IN_B (subscript));
-
-      /* The accessed index overlaps for each iteration.  */
-      SUB_CONFLICTS_IN_A (subscript)
-	= conflict_fn (1, affine_fn_cst (integer_zero_node));
-      SUB_CONFLICTS_IN_B (subscript)
-	= conflict_fn (1, affine_fn_cst (integer_zero_node));
-      SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
-    }
-
-  /* The distance vector is the zero vector.  */
-  save_dist_v (ddr, lambda_vector_new (DDR_NB_LOOPS (ddr)));
-  save_dir_v (ddr, lambda_vector_new (DDR_NB_LOOPS (ddr)));
+  if (DDR_LOOP_NEST (ddr))
+    compute_affine_dependence (ddr, VEC_index (loop_p, DDR_LOOP_NEST (ddr), 0));
 }
 
 /* Compute in DEPENDENCE_RELATIONS the data dependence graph for all
@@ -4179,7 +4175,8 @@ compute_all_dependences (VEC (data_reference_p, he
       {
 	ddr = initialize_data_dependence_relation (a, a, loop_nest);
 	VEC_safe_push (ddr_p, heap, *dependence_relations, ddr);
-	compute_self_dependence (ddr);
+        if (loop_nest)
+   	  compute_affine_dependence (ddr, VEC_index (loop_p, loop_nest, 0));
       }
 }
 
=

  reply	other threads:[~2011-11-15 10:32 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-17  7:21 [patch] " Razya Ladelsky
2011-10-17  8:53 ` Richard Guenther
     [not found]   ` <OF746BCB18.CF82809F-ONC225792E.0051CE3F-C225792E.00564975@il.ibm.com>
     [not found]     ` <CAFiYyc2ykFPCW8A8vW=f5UbNa7zFRQObwL13D9ioXjCd_em9pQ@mail.gmail.com>
2011-10-21  9:26       ` Fwd: " Richard Guenther
2011-11-15 15:13         ` Razya Ladelsky [this message]
2011-11-15 18:54           ` [PATCH, take 2] " Richard Guenther
2011-11-21 13:32           ` Jakub Jelinek
2011-11-21 14:24             ` Razya Ladelsky
2011-11-21 14:54               ` Jakub Jelinek
2011-11-21 15:43                 ` Razya Ladelsky
2011-11-21 16:13                   ` Jakub Jelinek
2011-11-21 17:25                     ` Razya Ladelsky
2011-11-21 18:26                       ` Jakub Jelinek
2011-11-24 15:48                         ` Razya Ladelsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OFE59B30C0.27072E7A-ONC2257949.0035FEEA-C2257949.0039D985@il.ibm.com \
    --to=razya@il.ibm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).