From: Jakub Jelinek
To: Uros Bizjak
Cc: Jeff Law, "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH] Optimize away x86 mem stores of what the mem contains already (PR rtl-optimization/79593)
Date: Tue, 08 Jan 2019 09:27:00 -0000
Message-ID: <20190108092714.GX30353@tucnak>
References: <20190107225116.GU30353@tucnak>

On Tue, Jan 08, 2019 at 08:29:03AM +0100, Uros Bizjak wrote:
> Is there a reason stack registers are excluded? Before stackreg pass,
> these registers are just like other hard registers.

I was just afraid of those: after the stack pass, removing a store but
not the load would result in the stack getting out of sync, and I wasn't
sure if the stack pass would tolerate an RA-chosen stack reg not being
touched anymore.  Also, doesn't loading a float/double/long double into
a stack reg and storing it again canonicalize it in any way?  Say NaNs?
Or unnormal or pseudo-denormal long doubles etc.?  I tried that now with
an sNaN and the load/store didn't change it, but I haven't tried with
exceptions enabled to see if it would raise one.
If that is fine, I'll leave that out.  Looking around, I was using the
wrong macro anyway and am surprised nothing complained; since
operands[0] is an rtx rather than a register number, it should have
been STACK_REG_P rather than STACK_REGNO_P.

> Other than that, there is no need for REG_P predicate; after reload we
> don't have subregs and register_operand will match only hard regs.

Ok, I wasn't sure because "register_operand" allows (subreg (reg)) even
if reload_completed; it just disallows subregs of mem after that.

> Also, please put peep2_reg_dead_p predicate in the pattern predicate.

I don't see how; that would mean I'd have to write two peephole2s
instead of one.  The pattern tries to deal with two different cases.
In the first, the temporary reg is dead, and we can optimize away both
the load and the store.  In the second, the temporary reg isn't dead,
and we can optimize away the store, but not the load.  With the
optimizing away of both load and store I was just trying to do a cheap
DCE there.

Looking around more, I actually think I need to replace (match_dup 1)
with (match_operand 2 "memory_operand") and add
rtx_equal_p (operands[1], operands[2]) and !MEM_VOLATILE_P (operands[2]),
because apparently rtx_equal_p doesn't check the MEM_VOLATILE_P bit.
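
To make the two cases concrete, here is a small toy model in C.  It is
not GCC code: the toy_insn struct, the toy_peephole function and the
reg_dead_after flag are all made up for illustration, and the flag
merely stands in for whatever the liveness query in the real pattern
would report.  A sketch under those assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds.  This is NOT GCC's RTL, only an illustration.  */
enum toy_op { TOY_LOAD, TOY_STORE };

struct toy_insn
{
  enum toy_op op;
  int reg;            /* hypothetical register number */
  int mem;            /* hypothetical id of an abstract memory slot */
  bool is_volatile;   /* models a volatile memory access */
};

/* Rewrite the two-insn window IN into OUT and return the number of
   insns written to OUT.  REG_DEAD_AFTER is an assumed input saying
   whether the loaded register is unused after the pair.  */
static size_t
toy_peephole (const struct toy_insn *in, bool reg_dead_after,
              struct toy_insn *out)
{
  assert (in != NULL && out != NULL);

  /* "reg = mem; mem = reg" on the same slot, neither access volatile.  */
  bool matches = (in[0].op == TOY_LOAD
                  && in[1].op == TOY_STORE
                  && in[0].reg == in[1].reg
                  && in[0].mem == in[1].mem
                  && !in[0].is_volatile
                  && !in[1].is_volatile);

  if (!matches)
    {
      /* Leave the window untouched.  */
      out[0] = in[0];
      out[1] = in[1];
      return 2;
    }

  /* Case 1: the temporary is dead, so both load and store can go.  */
  if (reg_dead_after)
    return 0;

  /* Case 2: the temporary is still needed, so keep only the load.  */
  out[0] = in[0];
  return 1;
}
```

Calling toy_peephole on a matching load/store pair yields zero insns
when reg_dead_after is true and just the load otherwise; marking either
access volatile leaves both insns in place.  Both outcomes come from a
single match, which is why I'd rather not split this into two patterns.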

> > 2019-01-07  Jakub Jelinek
> >
> > 	PR rtl-optimization/79593
> > 	* config/i386/i386.md (reg = mem; mem = reg): New define_peephole2.
> >
> > --- gcc/config/i386/i386.md.jj	2019-01-01 12:37:31.564738571 +0100
> > +++ gcc/config/i386/i386.md	2019-01-07 17:11:21.056392168 +0100
> > @@ -18740,6 +18740,21 @@ (define_peephole2
> >  				       const0_rtx);
> >  })
> >
> > +;; Attempt to optimize away memory stores of values the memory already
> > +;; has.  See PR79593.
> > +(define_peephole2
> > +  [(set (match_operand 0 "register_operand")
> > +        (match_operand 1 "memory_operand"))
> > +   (set (match_dup 1) (match_dup 0))]
> > +  "REG_P (operands[0])
> > +   && !STACK_REGNO_P (operands[0])
> > +   && !MEM_VOLATILE_P (operands[1])"
> > +  [(set (match_dup 0) (match_dup 1))]
> > +{
> > +  if (peep2_reg_dead_p (1, operands[0]))
> > +    DONE;
> > +})
> > +
> >  ;; Attempt to always use XOR for zeroing registers (including FP modes).
> >  (define_peephole2
> >    [(set (match_operand 0 "general_reg_operand")

	Jakub