From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 23851 invoked by alias); 1 Oct 2009 21:05:25 -0000 Received: (qmail 23835 invoked by uid 22791); 1 Oct 2009 21:05:24 -0000 X-SWARE-Spam-Status: No, hits=-2.6 required=5.0 tests=AWL,BAYES_00,SPF_PASS X-Spam-Check-By: sourceware.org Received: from mail-ew0-f226.google.com (HELO mail-ew0-f226.google.com) (209.85.219.226) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 01 Oct 2009 21:05:19 +0000 Received: by ewy26 with SMTP id 26so596850ewy.29 for ; Thu, 01 Oct 2009 14:05:17 -0700 (PDT) Received: by 10.211.143.10 with SMTP id v10mr2041127ebn.57.1254431117318; Thu, 01 Oct 2009 14:05:17 -0700 (PDT) Received: from localhost (rsandifo.gotadsl.co.uk [82.133.89.107]) by mx.google.com with ESMTPS id 28sm156523eyg.20.2009.10.01.14.05.14 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 01 Oct 2009 14:05:15 -0700 (PDT) To: Bernd Schmidt Mail-Followup-To: Bernd Schmidt ,gcc-patches@gcc.gnu.org, rdsandiford@googlemail.com Cc: gcc-patches@gcc.gnu.org Subject: Re: Use simplify_replace_rtx rather than wrap_constant References: <87ljk1sr4c.fsf@firetop.home> <4AC35E54.7090404@t-online.de> <87y6nwdwvz.fsf@firetop.home> <4AC4A5CF.6060804@t-online.de> From: Richard Sandiford Date: Thu, 01 Oct 2009 21:05:00 -0000 In-Reply-To: <4AC4A5CF.6060804@t-online.de> (Bernd Schmidt's message of "Thu\, 01 Oct 2009 13\:51\:27 +0100") Message-ID: <87vdiyddv9.fsf@firetop.home> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2009-10/txt/msg00086.txt.bz2 Bernd Schmidt writes: > Richard Sandiford wrote: > >> Do you think the changes are unsafe, or are you worried about the >> performance impact? > > The latter (performance and memory usage), and I don't like the idea of > having this change buried in a larger patch. So, if the other changes > depend on it, split it off and make sure it gets in first. > > I'd also like more information about which existing places you found > that made you think this change is needed (wrt. MEMs or possibly invalid > sharing). OK, well, let's go through each caller in turn: fwprop.c:forward_propagate_and_simplify() Here we're replacing a MEM with a constant. The extra rtx_equal_p() is safe but redundant. The copy_rtx() will be a no-op for sharable constants, but strictly speaking, it might be needed for unsharable constants. gcse.c:try_replace_reg() Here's we're replacing a register with either a new register or a gcse_constant_p(). The extra rtx_equal_p() is an operational no-op because simplify_replace_rtx() already uses rtx_equal_p() for registers. The copy_rtx() will be a no-op for registers. The gcse_constant_p() case is interesting because we have: /* Determine whether the rtx X should be treated as a constant for the purposes of GCSE's constant propagation. */ static bool gcse_constant_p (const_rtx x) { /* Consider a COMPARE of two integers constant. */ if (GET_CODE (x) == COMPARE && CONST_INT_P (XEXP (x, 0)) && CONST_INT_P (XEXP (x, 1))) return true; /* Consider a COMPARE of the same registers is a constant if they are not floating point registers. */ if (GET_CODE(x) == COMPARE && REG_P (XEXP (x, 0)) && REG_P (XEXP (x, 1)) && REGNO (XEXP (x, 0)) == REGNO (XEXP (x, 1)) && ! FLOAT_MODE_P (GET_MODE (XEXP (x, 0))) && ! FLOAT_MODE_P (GET_MODE (XEXP (x, 1)))) return true; /* Since X might be inserted more than once we have to take care that it is sharable. */ return CONSTANT_P (x) && (GET_CODE (x) != CONST || shared_const_p (x)); } The first two cases look a little hackish, but strictly speaking, those COMPAREs aren't shareable. We would need the copy_rtx() for them. (Or I suppose we could just say "well, it shouldn't matter if we share those particular unsharable expressions, because they'll probably be reduced or rejected". But that isn't being correct by design.) Also note that the comment above try_replace_reg() says: /* Try to replace all non-SET_DEST occurrences of FROM in INSN with TO. Returns nonzero is successful. */ so there's no guarantee that FROM occurs only once. In the usual case, we only consider sharable constants to be gcse_constant_p(), so the copy_rtx() will be a no-op. But try_simplify_rtx() nevertheless has: /* Usually we substitute easy stuff, so we won't copy everything. We however need to take care to not duplicate non-trivial CONST expressions. */ to = copy_rtx (to); TBH, I think this is exactly the sort of thing that happens when you leave it up to the caller rather than the callee to handle sharing (which is after all a correctness decision). Optimisation decisions like gcse_constant_p() shouldn't be affected by such an internal decision as sharability. gcse.c:cprop_jump() The second call replaces a register with a gcse_constant_p(), so this case is the same as try_replace_reg(). The first call replaces a setcc destination with a setcc source pattern, or with a REG_EQUAL note, if present. This can create invalid sharing in cases where no REG_EQUAL note is present. There's also no guarantee that the register is used only once in the jump, although that is of course the usual case. (Or again we could just say "well, it shouldn't matter if we share those particular expressions, because they'll probably be reduced or rejected". But that again isn't being correct by design.) gcse.c:bypass_block() Like cprop_jump(), the first call replaces a setcc destination with a setcc source, while the second call replaces a register with a gcse_constant_p(). The same analysis applies. loop-iv.c:replace_single_def_regs() This replaces a register with a function_invariant_p(). These invariants include (plus (fp|ap) (const_int ...)), which isn't sharable, so a copy_rtx() would be needed here for each use of the register. (Or each use minus 1 if we can guarantee that the original value is no longer needed.) Again, there is no guarantee that the "from" register is used only once in the argument to simplify_replace_rtx(), so the decision shouldn't be left up to the caller. loop-iv.c:replace_in_expr() This replaces a register with a simple_rhs_p(). These simple rhses include various kinds of unsharable binary operation, so the same copying requirements apply as for replace_single_def_regs(). loop-iv.c:implies_p() This function uses simplify_replace_rtx() to see whether something simplifies to "true". Sharing isn't a concern here, but the function can create plenty of throw-away rtx when used in this way, regardless of whether we use copy_rtx(). loop-iv.c:simplify_using_condition() This function replaces a REG with a CONSTANT_P. Some CONSTANT_Ps aren't sharable, so the copy here could be needed. If the users of the returned value are mindful that the returned value could have invalid sharing, then I think there should be a comment to say so. Otherwise the same "correctness by design" concerns apply as before. In short, I think implies_p() is the only case where both the following apply: (a) copy_rtx() might create new rtl (b) the creation of that new rtl is always unnecessary And like I say, the creation of _any_ new rtl in the implies_p() case is unnecessary, so I don't think it's a good example. Really, I think any assumption along the lines of "this sharing isn't strictly correct, but it shouldn't matter in this case" is as bad as the problem I'm trying to fix. It feels like premature optimisation. Richard