From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 9651 invoked by alias); 1 Feb 2009 23:18:28 -0000 Received: (qmail 9642 invoked by uid 22791); 1 Feb 2009 23:18:28 -0000 X-SWARE-Spam-Status: No, hits=-2.3 required=5.0 tests=AWL,BAYES_00,SPF_PASS X-Spam-Check-By: sourceware.org Received: from mtagate7.de.ibm.com (HELO mtagate7.de.ibm.com) (195.212.29.156) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 01 Feb 2009 23:18:22 +0000 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate7.de.ibm.com (8.13.8/8.13.8) with ESMTP id n11NIIBE037100 for ; Sun, 1 Feb 2009 23:18:18 GMT Received: from d12av04.megacenter.de.ibm.com (d12av04.megacenter.de.ibm.com [9.149.165.229]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v9.1) with ESMTP id n11NIJdb4108290 for ; Mon, 2 Feb 2009 00:18:19 +0100 Received: from d12av04.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n11NIIZh014275 for ; Mon, 2 Feb 2009 00:18:19 +0100 Received: from d12mc102.megacenter.de.ibm.com (d12mc102.megacenter.de.ibm.com [9.149.167.114]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id n11NIIE3014272; Mon, 2 Feb 2009 00:18:18 +0100 In-Reply-To: <7FB04A5C213E9943A72EE127DB74F0AD49E39C3E85@SJEXCHCCR02.corp.ad.broadcom.com> Subject: Re: Solve transitive closure issue in modulo scheduling To: "Bingfeng Mei" Cc: "Adrian Ashley" , "gcc@gcc.gnu.org" Message-ID: From: Ayal Zaks Date: Sun, 01 Feb 2009 23:18:00 -0000 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2009-02/txt/msg00021.txt.bz2 "Bingfeng Mei" wrote on 30/01/2009 14:44:01: > Hello, > I try to make modulo scheduling work more efficiently for our VLIW target. I > found one serious issue that prevents current SMS algorithm from achieving > high IPC is so-called "transitive closure" problem, where scheduling window is > only calculated using direct predecessors and successors. Because SMS is not > an iterative algorithm, this may cause failures in finding a valid schedule. Agreed. > Without splitting rows, some simple loops just cannot be scheduled not matter > how big the II is. With splitting rows, schedule can be found, but only at > bigger II. It may happen that even splitting rows will not help, e.g. when we repeatedly end up with a node having negative sched window. > GCC wiki (http://gcc.gnu.org/wiki/SwingModuloScheduling) lists this > as a TODO. Is there any work going on about this issue No, not to my knowledge. We had some testcase where this showed up, hence its appearance in the TODO, but afaicr some change caused it to disappear. > (the last wiki update > was one year ago)? If no one is working on it, I plan to do it. My idea is to > use the MinDist algorithm described in B. Rau's classic paper "iterative > modulo scheduling" (http://www.hpl.hp.com/techreports/94/HPL-94-115.html). The > same algorithm can also be used to compute better RecMII. The biggest concern > is complexity of computing MinDist matrix, which is O(N^3). N is number of > nodes in the loop. I remember somewhere GCC coding guide says "never write > quadratic algorithm" :-) Is this an absolute requirement? If yes, I will keep > it as our target-specific code (we are less concerned about compilation time). > Otherwise, I will try to make it more generic to see if it can make into > mainline in 4.5. Any comments? > The problem appears only when the DDG is cyclic, and for every cycle in the DDG, the problem may arise when trying to schedule the last node of the cycle, which has both predecessors and successors already scheduled. So you might try to connect only each such predecessor to every such successor with a transitive arc, to ensure that this last node will have a non-empty scheduling window. But this might not suffice; you may eventually need to wire (nearly) every pair of nodes in the strongly connected component. If this happens, you'd be better off with a dense graph representation (incidence matrix) than the current sparse one (adjaceny lists). An example would help see things more clearly. If you have a (small :) DDG demonstrating the need for transitive arcs, I'd be happy to have a look and advise what should be done. Ayal. > Cheers, > Bingfeng Mei > > Broadcom UK > >