From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21159 invoked by alias); 15 Aug 2007 18:36:03 -0000 Received: (qmail 20940 invoked by uid 22791); 15 Aug 2007 18:36:02 -0000 X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.4) by sourceware.org (qpsmtpd/0.31) with ESMTP; Wed, 15 Aug 2007 18:35:58 +0000 Received: (qmail 28684 invoked from network); 15 Aug 2007 18:35:56 -0000 Received: from unknown (HELO ?192.168.0.123?) (zack@127.0.0.2) by mail.codesourcery.com with ESMTPA; 15 Aug 2007 18:35:56 -0000 Message-ID: <46C34786.3@codesourcery.com> Date: Wed, 15 Aug 2007 18:44:00 -0000 From: Zack Weinberg User-Agent: Mozilla-Thunderbird 2.0.0.4 (X11/20070622) MIME-Version: 1.0 To: Segher Boessenkool CC: gcc@gcc.gnu.org Subject: Re: RFC: Simplify rules for ctz/clz patterns and RTL References: <46BD5B17.9080009@codesourcery.com> <62defd1ecb56eda092535c7147d5d3c6@kernel.crashing.org> In-Reply-To: <62defd1ecb56eda092535c7147d5d3c6@kernel.crashing.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org X-SW-Source: 2007-08/txt/msg00253.txt.bz2 Segher Boessenkool wrote: >> * I would like to do the same for __builtin_ctz, but there is a catch. >> The synthetic ctz sequence in terms of popcount (as presently >> implemented by ia64.md, and potentially usable for at least i386 and >> rs6000 as well if moved to optabs.c) produces the canonical behavior at >> zero, but the synthetic sequence in terms of clz (as presently >> implemented by optabs.c) produces the value -1 at zero. > > I suppose you're using (assuming 32-bit) > > ctz(x) := 31 - clz(x & -x) > > now, which gives -1 for 0; and the version you're looking for is > > ctz(x) := 32 - clz(~x & (x-1)) > > which gives 32 for 0. Thanks! That's, unfortunately, one more instruction, although I guess a lot of chips have "a & ~b" as one operation. > What does the popcount version look like? Never seen that before, > but I think it will be really expensive on PowerPC. ctz(x) := popcount(~x & (x-1)) Just the same thing as your version of the ctz-as-clz operation, but without the final adjustment. It looks like ~x & (x-1) turns any number into 000...111... where the boundary between zeroes and ones lies at the lowest 1 in the original. Is popcount really slow on PowerPC? (Compared to clz?) Ideally one would choose between the two expansions based on RTL costs, but the only architectures it matters for are i386 and powerpc, and neither of them define the cost of either clz or popcount. zw