public inbox for libc-ports@sourceware.org
From: Carlos O'Donell <carlos_odonell@mentor.com>
To: Steve Ellcey <sellcey@mips.com>
Cc: Andrew T Pinski <pinskia@gmail.com>,
	Maxim Kuvyrkov	<maxim_kuvyrkov@mentor.com>,
	"Joseph S. Myers" <joseph@codesourcery.com>,
	<libc-ports@sourceware.org>
Subject: Re: [PATCH] Optimize MIPS memcpy
Date: Tue, 04 Sep 2012 15:14:00 -0000
Message-ID: <50461AD3.50307@mentor.com>
In-Reply-To: <1346771341.14333.20.camel@ubuntu-sellcey>

On 9/4/2012 11:09 AM, Steve Ellcey wrote:
> On Mon, 2012-09-03 at 02:12 -0700, Andrew T Pinski wrote:
>> Forgot to CC libc-ports@ .
>> On Sat, 2012-09-01 at 18:15 +1200, Maxim Kuvyrkov wrote:
>>> This patch improves MIPS assembly implementations of memcpy.  Two
>>> optimizations are added: prefetching of data for subsequent
>>> iterations of memcpy loop and pipelined expansion of unaligned
>>> memcpy.  These optimizations speed up MIPS memcpy by about 10%.
>>>
>>> The prefetching part is straightforward: it adds prefetching of a
>>> cache line (32 bytes) for +1 iteration for unaligned case and +2
>>> iteration for aligned case.  The rationale here is that it will
>>> take prefetch to acquire data about same time as 1 iteration of
>>> unaligned loop or 2 iterations of aligned loop.  Values for these
>>> parameters were tuned on a modern MIPS processor.
>>>
>>
>> This might hurt Octeon as the cache line size there is 128 bytes.  Can
>> you say which modern MIPS processor this has been tuned on?  And
>> is there a way to not hard-code 32 in the assembly but put it in a
>> macro instead?
>>
>> Thanks,
>> Andrew Pinski
> 
> I've been looking at the MIPS memcpy and was planning on submitting a
> new version based on the one that MIPS submitted to Android.  It has
> prefetching like Maxim's though I found that using the load and 'prepare
> for store' hints instead of 'load streaming' and 'store streaming' hints
> gave me better results on the 74k and 24k that I did performance testing
> on.
> 
> This version has more unrolling too and between that and the hints
> difference I got a small performance improvement over Maxim's version
> when doing small memcpy's and a fairly substantial improvement on large
> memcpy's.
> 
> I also merged the 32 and 64 bit versions together so we would only have
> one copy to maintain.  I haven't tried building it as part of glibc yet,
> I have been testing it standalone first and was going to try and
> integrate it into glibc and submit it this week or next.  I'll attach it
> to this email so folks can look at it and I will see if I can
> parameterize the cache line size.  This one also assumes a 32 byte cache
> prefetch.
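
For illustration, the parameterization discussed above (a cache line
size that is 32 bytes on the tested cores but 128 bytes on Octeon, and
a prefetch distance of +1 or +2 iterations) could look roughly like the
C sketch below.  It is not code from either patch: the macro names
PREFETCH_LINE_BYTES and PREFETCH_AHEAD_LINES and the function
copy_with_prefetch are hypothetical, and it uses GCC's
__builtin_prefetch rather than hand-written MIPS pref instructions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tunables; neither name comes from the patches under
   discussion.  Override with -DPREFETCH_LINE_BYTES=128 for a core with
   128-byte cache lines. */
#ifndef PREFETCH_LINE_BYTES
# define PREFETCH_LINE_BYTES 32
#endif
#ifndef PREFETCH_AHEAD_LINES
# define PREFETCH_AHEAD_LINES 2  /* "+2 iterations" for the aligned case */
#endif

/* Copy n bytes one cache line per iteration, prefetching a fixed
   number of lines ahead.  Returns dst, like memcpy. */
static void *copy_with_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const size_t ahead = (size_t)PREFETCH_AHEAD_LINES * PREFETCH_LINE_BYTES;

    while (n >= PREFETCH_LINE_BYTES) {
        /* Only prefetch addresses that are still inside the buffers,
           so the hints never touch memory past the end of the copy. */
        if (n > ahead) {
            __builtin_prefetch(s + ahead, 0, 0);  /* will be read */
            __builtin_prefetch(d + ahead, 1, 0);  /* will be written */
        }
        memcpy(d, s, PREFETCH_LINE_BYTES);  /* fixed-size block copy */
        d += PREFETCH_LINE_BYTES;
        s += PREFETCH_LINE_BYTES;
        n -= PREFETCH_LINE_BYTES;
    }
    memcpy(d, s, n);  /* tail shorter than one line */
    return dst;
}
```

Keeping both values in macros means the same source can be rebuilt and
re-measured for a different line size or prefetch distance without
touching the loop body.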

Exactly what benchmarks did you run to verify the performance gains?

The one thing I'd like to continue seeing is a strong rationale for
performance patches, so that we have reproducible data in the event that
someone else comes along and wants to make a change.

For example see:
http://sourceware.org/glibc/wiki/benchmarking/results_2_17

and:
http://sourceware.org/glibc/wiki/benchmarking/benchmarks
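
As a rough idea of the kind of reproducible measurement meant here, a
minimal standalone timing harness might look like the C sketch below.
It is only an illustration under stated assumptions, not the glibc
benchmark framework from the pages above: the names copy_fn and
time_copy are hypothetical, and it uses the standard clock() function,
which measures CPU time at coarse resolution.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any function with memcpy's signature can be measured. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Time `iters` calls of fn copying `size` bytes, with the source and
   destination offset from their allocations by src_off and dst_off
   bytes so that unaligned cases can be measured too.  Returns the
   average seconds per call, or -1.0 on bad arguments, allocation
   failure, or a wrong copy result. */
static double time_copy(copy_fn fn, size_t size, size_t src_off,
                        size_t dst_off, long iters)
{
    if (iters <= 0)
        return -1.0;

    unsigned char *src = malloc(size + src_off + 1);
    unsigned char *dst = malloc(size + dst_off + 1);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size + src_off + 1);
    memset(dst, 0, size + dst_off + 1);

    /* Calling through a volatile pointer keeps the compiler from
       inlining or dropping the calls being timed. */
    copy_fn volatile f = fn;

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        f(dst + dst_off, src + src_off, size);
    clock_t end = clock();

    /* A fast but wrong copy is not a useful data point. */
    int ok = memcmp(dst + dst_off, src + src_off, size) == 0;
    free(src);
    free(dst);

    return ok ? (double)(end - start) / CLOCKS_PER_SEC / (double)iters
              : -1.0;
}
```

Calling, say, time_copy(memcpy, 4096, 1, 3, 100000) for a range of
sizes and offsets, and recording the compiler flags and the exact core
used, gives numbers that someone else can rerun later.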

Cheers,
Carlos.
-- 
Carlos O'Donell
Mentor Graphics / CodeSourcery
carlos_odonell@mentor.com
carlos@codesourcery.com
+1 (613) 963 1026
