From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 4401 invoked by alias); 28 Oct 2010 18:44:38 -0000 Received: (qmail 4381 invoked by uid 22791); 28 Oct 2010 18:44:37 -0000 X-SWARE-Spam-Status: No, hits=-1.4 required=5.0 tests=AWL,BAYES_00,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 28 Oct 2010 18:44:32 +0000 Received: (qmail 10938 invoked from network); 28 Oct 2010 18:44:31 -0000 Received: from unknown (HELO ?192.168.1.66?) (vries@127.0.0.2) by mail.codesourcery.com with ESMTPA; 28 Oct 2010 18:44:31 -0000 Message-ID: <4CC9C4EC.4060200@codesourcery.com> Date: Thu, 28 Oct 2010 20:45:00 -0000 From: Tom de Vries User-Agent: Thunderbird 2.0.0.24 (X11/20100411) MIME-Version: 1.0 To: Eric Botcazou CC: gcc-patches@gcc.gnu.org, Bernd Schmidt Subject: Re: new sign/zero extension elimination pass References: <4CBC698B.3080204@codesourcery.com> <201010221030.30363.ebotcazou@adacore.com> In-Reply-To: <201010221030.30363.ebotcazou@adacore.com> Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 7bit Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2010-10/txt/msg02435.txt.bz2 Eric Botcazou wrote: >> This new pass manages to analyze this pattern, and replace the sign_extend >> with a regcopy, which results in 1 less instruction in the assembly. The >> other passes that eliminate sign/zero extensions do no manage to do that. >> Combine doesn't work since the def and the use of reg 199 are in a >> different bb. Implicit_zee does not work here since it only combines an >> extension with the defs of its src operand, which is not applicable in this >> case. > > Was it originally written for a pre-4.3 compiler? No, it was written on top of a 4.5.1 compiler. > If not, why does it not use > the DF framework instead of recomputing DU chains manually? > The current form of the pass does not compute DU-chains. It treats all defs and uses of the same reg as if all defs reach all uses. This is imprecise, and I'm expecting to find an example where this is a limiting factor, but until now I haven't. >> The pass does a couple of linear scans over the insns, so it's not >> expensive in terms of runtime. > > At -O2 it can do up to 5 full scans over the insns AFAICS; the head comment in > the file has "a number of times". This appears to be quite suboptimal and you > rightfully mentioned it in the head comment. Couldn't we avoid doing a full > scan for each new round during the propagation phase, even without using a > fully-fledged iterative algorithm? > If we save the use size for each use in the second scan, rather than just saving the biggest use size for each reg, we only have to visit instructions where propagation can happen in a propagating scan. So we can make a list of instructions where that can happen, and iterate over those only. The second example I mentioned has 14 insns, of which 8 would be in such a list. Actually, currently it's 6, but it will be 8 if we also propagate over extensions, which we don't do yet. Either way, this is already better than doing a full scan. Another runtime improvement would be to keep track of the propagated size per insn, which we can use to avoid processing an insn (applying the transfer function) when it won't have an effect. A further runtime improvement would be to move insn out the the list when processed, and back into the list when necessary, using UD-chains. Essentially this is already an iterative algorithm with a worklist. The best version I can think now of is an iterative solution using DU/UD-chains, like this: - Calculate and store the size of each use. - Calculate and store the biggest use of each def (using DU-chains). - Push the progating defs where the biggest use size is smaller than word size on a stack. - Pop a def. - For the def, check if the propagated biggest use is bigger than the current biggest use. If not, back to pop a def. If so, store current biggest use as propagated biggest use. - For the def, propagate the biggest use of the def to the operands. - For each operand, if the use size was reduced, find the reaching definitions (using UD-chains). - For each propagating definition, recompute the biggest use. If that one was reduced below the propagated biggest use, push the def on the stack. - back to pop a def. I think it would make sense to do either the first 2 improvements (which are relatively easy to do on top of the current version), or to go directly for the last version (which would mean a rewrite). - Tom