Date: Mon, 09 Sep 2019 18:54:00 -0000
From: Dmitrij Pochepko
To: Richard Biener
Cc: GCC Patches
Subject:
    Re: [PATCH] PR tree-optimization/90836 Missing popcount pattern matching
Message-ID: <20190909185445.GA13823@DSTATION>
References: <20190905153448.GA29753@DSTATION>

Hi,

thank you for looking into it.

On Fri, Sep 06, 2019 at 12:23:40PM +0200, Richard Biener wrote:
> On Thu, Sep 5, 2019 at 5:35 PM Dmitrij Pochepko wrote:
> >
> > This patch adds matching for Hamming weight (popcount) implementation. The following sources:
> >
> > int
> > foo64 (unsigned long long a)
> > {
> >     unsigned long long b = a;
> >     b -= ((b>>1) & 0x5555555555555555ULL);
> >     b = ((b>>2) & 0x3333333333333333ULL) + (b & 0x3333333333333333ULL);
> >     b = ((b>>4) + b) & 0x0F0F0F0F0F0F0F0FULL;
> >     b *= 0x0101010101010101ULL;
> >     return (int)(b >> 56);
> > }
> >
> > and
> >
> > int
> > foo32 (unsigned int a)
> > {
> >     unsigned long b = a;
> >     b -= ((b>>1) & 0x55555555UL);
> >     b = ((b>>2) & 0x33333333UL) + (b & 0x33333333UL);
> >     b = ((b>>4) + b) & 0x0F0F0F0FUL;
> >     b *= 0x01010101UL;
> >     return (int)(b >> 24);
> > }
> >
> > and equivalents are now recognized as popcount for platforms with hw popcount support. Bootstrapped and tested on x86_64-pc-linux-gnu and aarch64-linux-gnu systems with no regressions.
> >
> > (I have no write access to repo)
>
> +(simplify
> +  (convert
> +    (rshift
> +      (mult
>
> is the outer convert really necessary?  That is, if we change
> the simplification result to
>
>  (convert (BUILT_IN_POPCOUNT @0))
>
> wouldn't that be correct as well?

Yes, this is better. I fixed it in the new version.

>
> Is the Hamming weight popcount
> faster than the libgcc table-based approach?  I wonder if we really
> need to restrict this conversion to the case where the target
> has an expander.
>
> +  (mult
> +    (bit_and:c
>
> this doesn't need :c (second operand is a constant).

Yes. Agree, this is redundant.
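
For readers who want to experiment with the speed question raised above, here is a minimal sketch that puts the bit-parallel steps of foo64 next to a byte-table lookup. This is not libgcc's code: byte_bits, init_byte_bits, popcount_table64 and popcounts_agree are hypothetical names made up for this illustration, and the table variant is only one plausible shape of a "table-based approach". Fixed-width uint64_t is used so the result does not depend on the host's integer sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-parallel popcount: same steps as foo64 in the quoted patch. */
static int popcount_swar64(uint64_t b)
{
    b -= (b >> 1) & 0x5555555555555555ULL;                 /* 2-bit sums */
    b = ((b >> 2) & 0x3333333333333333ULL)
        + (b & 0x3333333333333333ULL);                     /* 4-bit sums */
    b = ((b >> 4) + b) & 0x0F0F0F0F0F0F0F0FULL;            /* 8-bit sums */
    b *= 0x0101010101010101ULL;                            /* add all bytes into the top byte */
    return (int)(b >> 56);
}

/* Hypothetical table-based variant: one lookup per byte. */
static unsigned char byte_bits[256];

/* Fill the table; byte_bits[0] is already 0 (static storage). */
static void init_byte_bits(void)
{
    for (int i = 1; i < 256; i++)
        byte_bits[i] = (unsigned char)((i & 1) + byte_bits[i / 2]);
}

static int popcount_table64(uint64_t x)
{
    assert(byte_bits[255] == 8); /* init_byte_bits() must have been called */
    int n = 0;
    for (int i = 0; i < 64; i += 8)
        n += byte_bits[(x >> i) & 0xff];
    return n;
}

/* Returns 1 if both variants agree on x. */
static int popcounts_agree(uint64_t x)
{
    return popcount_swar64(x) == popcount_table64(x);
}
```

Call init_byte_bits() once before popcount_table64(). Which variant is faster depends on the target and on cache behaviour, so the sketch only checks that the two agree; it makes no timing claim.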
>
> +      int shift = TYPE_PRECISION (long_long_unsigned_type_node) - prec;
> +      const unsigned long long c1 = 0x0101010101010101ULL >> shift,
>
> I think this mixes host and target properties.  I guess instead of
> 'const unsigned long long' you want to use 'const uint64_t' and
> instead of TYPE_PRECISION (long_long_unsigned_type_node) 64?
> Since you are later comparing with unsigned HOST_WIDE_INT
> eventually unsigned HOST_WIDE_INT is better (that's always 64bit as well).

Agree. It is better to use HOST_WIDE_INT.

>
> You are using tree_to_uhwi but nowhere verifying if @0 is unsigned.
> What happens if 'prec' is > 64?  (__int128 ...).  Ah, I guess the
> final selection will simply select nothing...
>
> Otherwise the patch looks reasonable, even if the pattern
> is a bit unwieldy... ;)
>
> Does it work for targets where 'unsigned int' is smaller than 32bit?

Yes. The only 16-bit-int architecture with popcount support on hw level is avr. I built gcc for avr and checked that 16-bit popcount algorithm is recognized successfully.

Thanks,
Dmitrij

>
> Thanks,
> Richard.
> >
> > Thanks,
> > Dmitrij
> >
> >
> > gcc/ChangeLog:
> >
> >     PR tree-optimization/90836
> >
> >     * gcc/match.pd (popcount): New pattern.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     PR tree-optimization/90836
> >
> >     * lib/target-supports.exp (check_effective_target_popcount)
> >     (check_effective_target_popcountll): New effective targets.
> >     * gcc.dg/tree-ssa/popcount4.c: New test.
> >     * gcc.dg/tree-ssa/popcount4l.c: New test.
> >     * gcc.dg/tree-ssa/popcount4ll.c: New test.
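
To illustrate the constant scaling discussed in the review (shift = 64 - prec, c1 = 0x0101010101010101 >> shift), here is a minimal standalone sketch in plain C. Only shift and c1 appear in the quoted patch fragment; c2, c3, c4, mask and the function name popcount_prec are assumptions added by analogy, and the sketch uses uint64_t because HOST_WIDE_INT is internal to GCC. It is not the match.pd pattern itself, only the arithmetic that pattern would have to recognize for 16-, 32- and 64-bit operands.

```c
#include <assert.h>
#include <stdint.h>

/* Popcount of the low `prec` bits of x, for prec in {16, 32, 64}.
   All masks are derived from the 64-bit constants by one right shift. */
static int popcount_prec(uint64_t x, int prec)
{
    assert(prec == 16 || prec == 32 || prec == 64);

    int shift = 64 - prec;
    const uint64_t c1 = UINT64_C(0x0101010101010101) >> shift;
    const uint64_t c2 = UINT64_C(0x0F0F0F0F0F0F0F0F) >> shift; /* assumed */
    const uint64_t c3 = UINT64_C(0x3333333333333333) >> shift; /* assumed */
    const uint64_t c4 = UINT64_C(0x5555555555555555) >> shift; /* assumed */
    const uint64_t mask = UINT64_MAX >> shift;                 /* assumed */

    uint64_t b = x & mask;
    b -= (b >> 1) & c4;
    b = ((b >> 2) & c3) + (b & c3);
    b = ((b >> 4) + b) & c2;
    b = (b * c1) & mask;          /* emulate wraparound at prec bits */
    return (int)(b >> (prec - 8)); /* top byte of the prec-bit value */
}
```

The explicit `& mask` after the multiply matters: the byte sums only end up in the top byte if the product wraps at exactly prec bits, which is why the sketch does not rely on the width of `unsigned long` or `unsigned int` on the host.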