From: Richard Sandiford
To: Sandra Loosemore
Cc: GCC Patches, Nigel Stephens, Guy Morrogh, David Ung, Thiemo Seufer, Mark Mitchell
Subject: Re: PATCH: fine-tuning for can_store_by_pieces
Date: Tue, 21 Aug 2007 08:21:00 -0000
Message-ID: <87ps1h5mda.fsf@firetop.home>
In-Reply-To: <46CA222D.2050107@codesourcery.com>
References: <46C3343A.5080407@codesourcery.com> <87ps1nop2x.fsf@firetop.home> <46C778D6.5060808@codesourcery.com> <87y7g6r50c.fsf@firetop.home> <46CA222D.2050107@codesourcery.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2007-08/txt/msg01324.txt.bz2

Sandra Loosemore writes:
> Richard Sandiford wrote:
>>> + #define MOVE_RATIO ((TARGET_MIPS16 || TARGET_MEMCPY) ? MIPS_CALL_RATIO : 2)
>>
>> ...a comment in the original patch said that MOVE_RATIO effectively
>> counted memory-to-memory moves.
>> I think that was a useful comment, and that the use of the old
>> MIPS_CALL_RATIO above should be the new MIPS_CALL_RATIO / 2.
>> Conveniently, that gives us the 3 that you had in the original patch.
>
> Except that 4 seems to be a better number, and that number doesn't
> fall out of this theory.

OK.  (Hence the question that completed that paragraph.  It was far
from obvious whether the MOVE_RATIO change from 3 to 4 had been
determined experimentally or not.  You just said "with and without
-mabicalls", which implied two sets of testing flags.)

> I guess I could run some tests with different values for CLEAR_RATIO
> too, and just document both numbers as being experimentally
> determined?

That does sound better, thanks.

>>> + /* STORE_BY_PIECES_P can be used when copying a constant string, but
>>> +    in that case each word takes 3 insns (lui, ori, sw), or more in
>>> +    64-bit mode, instead of 2 (lw, sw).  So better to always fail this
>>> +    and let the move_by_pieces code copy the string from read-only
>>> +    memory.  */
>>> +
>>> + #define STORE_BY_PIECES_P(SIZE, ALIGN) 0
>>
>> You asked when lui/ori/sw might be faster.  Consider a three-word
>> store on a typical 2-way superscalar target:
>>
>>     Cycle 1:   lui   lui
>>           2:   ori   ori
>>           3:   sw    lui
>>           4:   sw    ori
>>           5:   sw
>>
>> That's 5 cycles.  The equivalent lw/sw version is at least 6 cycles
>> (more if the read-only string is not in cache).
>
> OK, but what I was really asking was, is there a way to *test* for
> situations where we should generate the lui/ori/sw sequences instead
> of the lw/sw?  Some combination of TARGET_foo flags and/or the size of
> the string?

Well, I suppose:

    !optimize_size && !TARGET_MIPS16 && mips_issue_rate () > 1

is the condition under which the concerns above apply.  But, as with
the other !optimize_size bounds in your patch, the bound would need to
be determined experimentally.  Since we don't have an easy way of
doing that, I'm happy to preserve mainline's current behaviour there.
What did you think about the other suggestion: moving the magic
"1 instruction" bound for optimize_size from builtins.c to SET_RATIO?

Richard