Date: Wed, 28 Sep 2011 12:51:00 -0000
From: Jan Hubicka
To: Jakub Jelinek
Cc: Andi Kleen, Michael Zolotukhin, gcc-patches@gcc.gnu.org, Jan Hubicka, Richard Guenther, "H.J. Lu", izamyatin@gmail.com, areg.melikadamyan@gmail.com
Subject: Re: Use of vector instructions in memmov/memset expanding
Message-ID: <20110928123605.GA31565@atrey.karlin.mff.cuni.cz>
References: <20110715232425.GA24793@atrey.karlin.mff.cuni.cz> <20110928115546.GL2687@tyan-ft48-01.lab.bos.redhat.com>
In-Reply-To: <20110928115546.GL2687@tyan-ft48-01.lab.bos.redhat.com>

> On Wed, Sep 28, 2011 at 04:41:47AM -0700, Andi Kleen wrote:
> > Michael Zolotukhin writes:
> > >
> > > Build and 'make check' was tested.
> >
> > Could you expand a bit on the performance benefits?  Where does it help?
>
> Especially when glibc these days has very well optimized implementations
> tuned for various CPUs and it is very unlikely beneficial to inline
> memcpy/memset if they aren't really short or have unknown number of
> iterations.
I guess we should update the expansion tables so we produce function calls
more often.  I will look at how things behave on my setup.  Do you know in
which glibc versions the optimized string functions were introduced?

Concerning inline SSE, I think it makes a lot of sense when we know the size
and alignment, so we can output just a few SSE moves instead of a larger
number of integer moves.  We definitely need some numbers for the loop
variants.

Honza
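To make the known size and alignment case concrete, here is a minimal sketch of a 32-byte copy written both ways.  It is only an illustration, not the expander's actual output: the function names copy32_int and copy32_vec are hypothetical, and it uses GCC's generic vector extension (vector_size) as a stand-in for SSE, assuming a target where a 16-byte aligned vector access becomes a single SSE move.

```c
#include <stdint.h>

/* Assumption of this sketch: 16-byte vectors via GCC's generic vector
   extension.  may_alias allows accessing arbitrary buffers through
   these types without violating strict aliasing. */
typedef long long v2di __attribute__((vector_size(16), may_alias));
typedef uint64_t u64_alias __attribute__((may_alias));

/* Integer variant: four 64-bit load/store pairs for 32 bytes. */
static void copy32_int(void *dst, const void *src)
{
    u64_alias *d = dst;
    const u64_alias *s = src;
    d[0] = s[0];
    d[1] = s[1];
    d[2] = s[2];
    d[3] = s[3];
}

/* Vector variant: two 16-byte load/store pairs for the same 32 bytes.
   Both pointers must be 16-byte aligned, since the vector type carries
   16-byte alignment and the compiler may emit aligned moves. */
static void copy32_vec(void *dst, const void *src)
{
    v2di *d = dst;
    const v2di *s = src;
    d[0] = s[0];
    d[1] = s[1];
}
```

Both functions copy the same bytes; the vector variant simply needs half as many moves, which is the saving in question when size and alignment are known at compile time.  It says nothing about the loop variants, which still need measurements.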