From: Richard Sandiford
To: Sandra Loosemore
Cc: GCC Patches, Nigel Stephens, Guy Morrogh, David Ung, Thiemo Seufer, Mark Mitchell
Subject: Re: PATCH: fine-tuning for can_store_by_pieces
Date: Tue, 21 Aug 2007 08:21:00 -0000
Message-ID: <87ps1h5mda.fsf@firetop.home>
In-Reply-To: <46CA222D.2050107@codesourcery.com>
References: <46C3343A.5080407@codesourcery.com> <87ps1nop2x.fsf@firetop.home> <46C778D6.5060808@codesourcery.com> <87y7g6r50c.fsf@firetop.home> <46CA222D.2050107@codesourcery.com>
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2007-08/txt/msg01324.txt.bz2

Sandra Loosemore writes:
> Richard Sandiford wrote:
>>> + #define MOVE_RATIO ((TARGET_MIPS16 || TARGET_MEMCPY) ? MIPS_CALL_RATIO : 2)
>>
>> ...a comment in the original patch said that MOVE_RATIO effectively
>> counted memory-to-memory moves.
>> I think that was a useful comment, and that the use of the old
>> MIPS_CALL_RATIO above should be the new MIPS_CALL_RATIO / 2.
>> Conveniently, that gives us the 3 that you had in the original patch.
>
> Except that 4 seems to be a better number, and that number doesn't
> fall out of this theory.

OK.  (Hence the question that completed that paragraph.  It was far
from obvious whether the MOVE_RATIO change from 3 to 4 had been
determined experimentally or not.  You just said "with and without
-mabicalls", which implied two sets of testing flags.)

> I guess I could run some tests with different values for CLEAR_RATIO
> too, and just document both numbers as being experimentally
> determined?

That does sound better, thanks.

>>> + /* STORE_BY_PIECES_P can be used when copying a constant string, but
>>> +    in that case each word takes 3 insns (lui, ori, sw), or more in
>>> +    64-bit mode, instead of 2 (lw, sw).  So better to always fail this
>>> +    and let the move_by_pieces code copy the string from read-only
>>> +    memory.  */
>>> +
>>> + #define STORE_BY_PIECES_P(SIZE, ALIGN) 0
>>
>> You asked when lui/ori/sw might be faster.  Consider a three-word
>> store on a typical 2-way superscalar target:
>>
>>     Cycle 1:   lui   lui
>>           2:   ori   ori
>>           3:   sw    lui
>>           4:   sw    ori
>>           5:   sw
>>
>> That's 5 cycles.  The equivalent lw/sw version is at least 6 cycles
>> (more if the read-only string is not in cache).
>
> OK, but what I was really asking was, is there a way to *test* for
> situations where we should generate the lui/ori/sw sequences instead
> of the lw/sw?  Some combination of TARGET_foo flags and/or the size of
> the string?

Well, I suppose:

    !optimize_size && !TARGET_MIPS16 && mips_issue_rate () > 1

is the condition under which the concerns above apply.  But, as with
the other !optimize_size bounds in your patch, the bound would need to
be determined experimentally.  Since we don't have an easy way of
doing that, I'm happy to preserve mainline's current behaviour there.
What did you think about the other suggestion: moving the magic
"1 instruction" bound for optimize_size from builtins.c to SET_RATIO?

Richard