From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5267 invoked by alias); 21 Nov 2007 15:37:21 -0000 Received: (qmail 5218 invoked by uid 22791); 21 Nov 2007 15:37:20 -0000 X-Spam-Check-By: sourceware.org Received: from mtagate7.de.ibm.com (HELO mtagate7.de.ibm.com) (195.212.29.156) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 21 Nov 2007 15:37:12 +0000 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate7.de.ibm.com (8.13.8/8.13.8) with ESMTP id lALFb8eF157090 for ; Wed, 21 Nov 2007 15:37:08 GMT Received: from d12av04.megacenter.de.ibm.com (d12av04.megacenter.de.ibm.com [9.149.165.229]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id lALFb9rr2990322 for ; Wed, 21 Nov 2007 16:37:09 +0100 Received: from d12av04.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id lALFb9VO017545 for ; Wed, 21 Nov 2007 16:37:09 +0100 Received: from d12mc102.megacenter.de.ibm.com (d12mc102.megacenter.de.ibm.com [9.149.167.114]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id lALFb8Gt017541 for ; Wed, 21 Nov 2007 16:37:08 +0100 In-Reply-To: Subject: Re: [PATCH 1/2][Modulo-sched] Fix the direction of the scheduling window To: Revital1 Eres Cc: Andrey Belevantsev , "Alexander Monakov" , gcc-patches@gcc.gnu.org X-Mailer: Lotus Notes Release 7.0 HF277 June 21, 2006 Message-ID: From: Ayal Zaks Date: Wed, 21 Nov 2007 17:24:00 -0000 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2007-11/txt/msg01108.txt.bz2 Revital1 Eres/Haifa/IBM wrote on 20/11/2007 15:47:59: > Hello, > > When determining the scheduling window of a node all types of edges are > taken into account. Currently when a node has both predecessors and > successors it will be set close to it's predecessors. In fact it can > be that a node has only one successor with true dep edge but it will be > scheduled close to it's allegedly predecessor; because of the anti-dep > edge between the two nodes. Trying to avoid this confusion and reduce the > life range of registers we choose to set the scheduled node close to it's > predecessors or close to it's successors based on true deps edges only. > > This change makes SMS succeed on the attached testcase with -funroll-loops > on SPU. The testcase was provided by Vladimir. > > Bootstrapped and tested together with the follow on patch (2/2) on ppc, > SPU and x86 with no new regressions. > > :ADDPATCH modulo-sched: > > OK for mainline? Yes, with minor comments below. Note that it may be better to set the directions (step = 1/-1) once when fixing the order of nodes, instead of counting all true register deps repeatedly every time we try to schedule a node, but otoh we have to traverse all succs and preds to compute the window anyhow. Ayal. > > Thanks, > Revital > > 2007-11-20 Ayal Zaks > Revital Eres > > * modulo-sched.c (get_sched_window): Fix the direction of the > scheduling window and add dump info. > > testsuite: > > 2007-11-20 Vladimir Yanovsky > > * gcc.dg/sms-3.c: New testcase. > attachment "patch_win_dir_again.txt": > Index: modulo-sched.c > =================================================================== > --- modulo-sched.c (revision 129721) > +++ modulo-sched.c (working copy) > @@ -1344,8 +1344,8 @@ > MAX (early_start, p_st + e->latency - (e->distance * ii)); > > if (dump_file) > - fprintf (dump_file, "pred st = %d; early_start = %d; ", p_st, > - early_start); > + fprintf (dump_file, "pred st = %d; early_start = %d; " > + " latency: %d", p_st, early_start, e->latency); Not sure what's the proper way to split a string; the following looks better I think: + fprintf (dump_file, + "pred st = %d; early_start = %d; latency: %d", + p_st, early_start, e->latency); > > if (e->data_type == MEM_DEP) > end = MIN (end, SCHED_TIME (v_node) + ii - 1); > @@ -1355,6 +1355,7 @@ > } > start = early_start; > end = MIN (end, early_start + ii); > + /* Schedule the node close to it's processors. */ ^^^^^^^^^^ predecessors > step = 1; > > if (dump_file) > @@ -1391,8 +1392,8 @@ > s_st - e->latency + (e->distance * ii)); > > if (dump_file) > - fprintf (dump_file, "succ st = %d; late_start = %d;", s_st, > - late_start); > + fprintf (dump_file, "succ st = %d; late_start = %d;" > + " latency = %d", s_st, late_start, e->latency); Same as above. > > if (e->data_type == MEM_DEP) > end = MAX (end, SCHED_TIME (v_node) - ii + 1); > @@ -1406,6 +1407,7 @@ > } > start = late_start; > end = MAX (end, late_start - ii); > + /* Schedule the node close to it's successors. */ > step = -1; > > if (dump_file) > @@ -1419,6 +1421,8 @@ > { > int early_start = INT_MIN; > int late_start = INT_MAX; > + int count_preds = 0; > + int count_succs = 0; > > start = INT_MIN; > end = INT_MAX; > @@ -1446,9 +1450,12 @@ > - (e->distance * ii)); > > if (dump_file) > - fprintf (dump_file, "pred st = %d; early_start = %d;", p_st, > - early_start); > + fprintf (dump_file, "pred st = %d; early_start = %d;" > + " latency = %d", p_st, early_start, e->latency); Same as above. > > + if (e->type == TRUE_DEP && e->data_type == REG_DEP) > + count_preds++; > + > if (e->data_type == MEM_DEP) > end = MIN (end, SCHED_TIME (v_node) + ii - 1); > } > @@ -1479,10 +1486,13 @@ > s_st - e->latency > + (e->distance * ii)); > > - if (dump_file) > - fprintf (dump_file, "succ st = %d; late_start = %d;", s_st, > - late_start); > + if (dump_file) > + fprintf (dump_file, "succ st = %d; late_start = %d;" > + " latency = %d", s_st, late_start, e->latency); Same as above. > > + if (e->type == TRUE_DEP && e->data_type == REG_DEP) > + count_succs++; > + > if (e->data_type == MEM_DEP) > start = MAX (start, SCHED_TIME (v_node) - ii + 1); > } > @@ -1493,6 +1503,16 @@ > start = MAX (start, early_start); > end = MIN (end, MIN (early_start + ii, late_start + 1)); > step = 1; > + /* If there are more successors than processors schedule the ^^^^^^^^^^ > + node close to it's successors. */ > + if (count_succs >= count_preds) > + { > + int old_start = start; > + > + start = end - 1; > + end = old_start - 1; > + step = -1; > + } > } > else /* psp is empty && pss is empty. */ > { attachment "sms-3.c.txt": > /* { dg-do compile } */ > /* { dg-options "-O2 -fmodulo-sched -funroll-loops -dm" } */ > > int X[100]; > int Y[100]; > > int > foo (int len, long *result) > { > int i; > > len = 1000; > long res = *result; > for (i = 0; i < len; i++) > res += X[i] * Y[i]; > > *result = res; > } > > /* { dg-final { cleanup-rtl-dump "sms" } } */ Better check that result is correct on some data, in addition to the fact that the loop is sms'ed.