From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 14912 invoked by alias); 13 Dec 2013 00:14:37 -0000
Mailing-List: contact libc-ports-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: libc-ports-owner@sourceware.org
Received: (qmail 14900 invoked by uid 89); 13 Dec 2013 00:14:36 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD autolearn=ham version=3.3.2
X-HELO: multi.imgtec.com
Received: from multi.imgtec.com (HELO multi.imgtec.com) (194.200.65.239) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 13 Dec 2013 00:14:35 +0000
Message-ID: <1386893669.2764.30.camel@ubuntu-sellcey>
Subject: Re: [patch, mips] Improved memset for MIPS
From: Steve Ellcey
To: Carlos O'Donell
CC: Andrew Pinski, "Joseph S. Myers", Carlos O'Donell, "libc-ports@sourceware.org"
Date: Fri, 13 Dec 2013 00:14:00 -0000
In-Reply-To:
References: <93a232b5-9d0b-4a27-bbb5-16e3ae7c4b89@BAMAIL02.ba.imgtec.org> <1378483039.5770.302.camel@ubuntu-sellcey> <1378486241.5770.327.camel@ubuntu-sellcey> <1379526035.5770.414.camel@ubuntu-sellcey> <1379698355.5770.466.camel@ubuntu-sellcey>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
X-SEF-Processed: 7_3_0_01192__2013_12_13_00_14_33
X-SW-Source: 2013-12/txt/msg00014.txt.bz2

On Thu, 2013-12-12 at 19:01 -0500, Carlos O'Donell wrote:

> > I noticed this patch causes some performance regressions on Octeon due
> > to having 128 byte cache lines.
> > Changing PREFETCH_CHUNK/PREFETCH_FOR_STORE to assume 128 byte cache
> > line gives us the performance back and improves over the original code
> > at least 15%.
> > That is:
> > # define PREFETCH_CHUNK 128
> > # define PREFETCH_FOR_STORE(chunk, reg) \
> >   pref PREFETCH_STORE_HINT, (chunk)*128(reg);
>
> Submit a patch for that?
>
> We have microbenchmarks now, but the next hardest
> part is going to be archiving data by device so that
> the community can help track performance and point
> out regressions like this.
>
> Cheers,
> Carlos.

Unless the change is under some kind of ifdef for Octeon, changing this
will probably slow down other MIPS chips.  Most of them have 32 byte
cache lines.

Steve Ellcey
sellcey@mips.com
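
To make the ifdef suggestion concrete, here is a rough sketch of how the
chunk size could be selected per target. It is not the actual patch. It
assumes GCC's _MIPS_ARCH_OCTEON macro (expected to be defined under
-march=octeon) is the right thing to test, and that 32 is the right
default for everything else. prefetch_offset and prefetch_chunks are
hypothetical helpers, written in C only to show the arithmetic behind
the (chunk)*N(reg) displacement in the quoted macro.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC defines _MIPS_ARCH_OCTEON when building with
   -march=octeon.  Verify against the toolchain before relying on it.  */
#if defined(_MIPS_ARCH_OCTEON)
# define PREFETCH_CHUNK 128	/* Octeon: 128 byte cache lines.  */
#else
# define PREFETCH_CHUNK 32	/* Assumed default: 32 byte cache lines.  */
#endif

/* The chunk size must be a power of two for simple alignment masks.  */
static_assert ((PREFETCH_CHUNK & (PREFETCH_CHUNK - 1)) == 0,
	       "PREFETCH_CHUNK must be a power of two");

/* Hypothetical helper: byte displacement of the chunk-th prefetch,
   i.e. the (chunk)*PREFETCH_CHUNK part of the pref operand.  */
static inline size_t
prefetch_offset (size_t chunk)
{
  return chunk * PREFETCH_CHUNK;
}

/* Hypothetical helper: number of whole prefetch chunks in len bytes.  */
static inline size_t
prefetch_chunks (size_t len)
{
  return len / PREFETCH_CHUNK;
}
```

With this shape, PREFETCH_FOR_STORE would use (chunk)*PREFETCH_CHUNK(reg)
rather than a hard-coded 128, so chips with 32 byte lines keep their
current behavior and only Octeon builds get the larger stride.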