From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 03 Sep 2013 19:31:00 -0000
Subject: Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.
From: "Ryan S. Arnold"
To: "Carlos O'Donell"
Cc: Will Newton, "libc-ports@sourceware.org", Patch Tracking, Ondřej Bílka, Siddhesh Poyarekar
In-Reply-To: <52260BD0.6090805@redhat.com>
References: <520894D5.7060207@linaro.org> <5220D30B.9080306@redhat.com> <5220F1F0.80501@redhat.com> <52260BD0.6090805@redhat.com>
X-SW-Source: 2013-09/txt/msg00026.txt.bz2

On Tue, Sep 3, 2013 at 11:18 AM, Carlos O'Donell wrote:
> We have one, it's the glibc microbenchmark, and we want to expand it,
> otherwise when ACME comes with their patch for ARM and breaks performance
> for targets that Linaro cares about I have no way to reject the patch
> objectively :-)

Can you be objective in analyzing performance when two different people
have differing opinions on what the performance preconditions should be
coded against?

Some cases are obvious: we know from pipeline analysis that certain
instruction sequences can hinder performance. That is objective and can
be measured by a benchmark. But saying that a particular change
penalizes X-sized copies while helping Y-sized copies, when no
performance preconditions have been published, is not objective; it's a
difference of opinion about what's important.

PowerPC has had the luxury of not having its performance preconditions
contested. PowerPC string performance is optimized based on analysis of
customer data sets, so PowerPC's preconditions are fairly concrete:
optimize for aligned data in excess of 128 bytes (I believe).

> You need to statistically analyze the numbers, assign weights to ranges,
> and come up with some kind of number that evaluates the results based
> on *some* formula. That is the only way we are going to keep moving
> performance forward (against some kind of criteria).

This sounds like establishing preconditions, i.e., deciding what types
of data will be optimized for. Unless technology evolves to the point
where you can statistically analyze data in real time and switch to an
implementation with a different set of preconditions to match what you
find, you're going to end up with a lot of in-fighting over performance.
I've run into situations where I recommended that a customer write their
own string function implementations, because they continually
encountered unaligned data when copying by value in C++ functions, and
PowerPC's string function implementations penalized unaligned copies in
preference for aligned ones.

Ryan