From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 49656 invoked by alias); 16 Dec 2016 13:50:15 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 49549 invoked by uid 89); 16 Dec 2016 13:50:14 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-5.0 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=late X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 16 Dec 2016 13:50:12 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3E949824 for ; Fri, 16 Dec 2016 13:50:11 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-116-54.ams2.redhat.com [10.36.116.54]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uBGD6ejQ007258 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 16 Dec 2016 08:06:42 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id uBGD6bOX011704; Fri, 16 Dec 2016 14:06:38 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id uBGD6Zd5011703; Fri, 16 Dec 2016 14:06:35 +0100 Date: Fri, 16 Dec 2016 13:57:00 -0000 From: Jakub Jelinek To: Richard Biener Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Optimiza aggregate a = b = c = {} (PR c/78408, take 2) Message-ID: <20161216130635.GP21933@tucnak> Reply-To: Jakub Jelinek References: <20161125193219.GP3541@tucnak.redhat.com> <20161213113648.GP3541@tucnak.redhat.com> <20161215164126.GJ21933@tucnak> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.7.1 (2016-10-04) X-IsSubscribed: yes X-SW-Source: 2016-12/txt/msg01474.txt.bz2 On Fri, Dec 16, 2016 at 01:50:54PM +0100, Richard Biener wrote: > > +static void > > +optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len) > > +{ > > + gimple *stmt = gsi_stmt (*gsip); > > + if (gimple_has_volatile_ops (stmt) > > + || TREE_THIS_VOLATILE (dest) > > + || TREE_THIS_VOLATILE (src)) > > The last two should be redundant and/or not necessary. I've been doing this if stmt is __builtin_memcpy and the actual objects are TREE_THIS_VOLATILE, then I wanted to punt. But I guess if we never transform __builtin_memcpy into anything other than __builtin_memset, then it isn't a problem. > > + return; > > + > > + tree vuse = gimple_vuse (stmt); > > + if (vuse == NULL) > > + return; > > + > > + gimple *defstmt = SSA_NAME_DEF_STMT (vuse); > > + tree src2 = NULL_TREE, len2 = NULL_TREE; > > + HOST_WIDE_INT offset, offset2; > > + tree val = integer_zero_node; > > + if (gimple_store_p (defstmt) > > + && gimple_assign_single_p (defstmt) > > + && TREE_CODE (gimple_assign_rhs1 (defstmt)) == CONSTRUCTOR > > + && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (defstmt)) == 0 > > Should be always true for stores from constructors. Ok, can I just gcc_checking_assert verify it or is that not worth it? > > + && !gimple_clobber_p (defstmt)) > > + src2 = gimple_assign_lhs (defstmt); > > + else if (gimple_call_builtin_p (defstmt, BUILT_IN_MEMSET) > > + && TREE_CODE (gimple_call_arg (defstmt, 0)) == ADDR_EXPR > > + && TREE_CODE (gimple_call_arg (defstmt, 1)) == INTEGER_CST) > > + { > > + src2 = TREE_OPERAND (gimple_call_arg (defstmt, 0), 0); > > + len2 = gimple_call_arg (defstmt, 2); > > + val = gimple_call_arg (defstmt, 1); > > + if (!integer_zerop (val) && is_gimple_assign (stmt)) > > Please add a comment here. I think below ... (*) > > > + src2 = NULL_TREE; > > + } > > + > > + if (src2 == NULL_TREE) > > + return; > > + > > + if (len == NULL_TREE) > > + len = (TREE_CODE (src) == COMPONENT_REF > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src, 1)) > > + : TYPE_SIZE_UNIT (TREE_TYPE (src))); > > + if (len2 == NULL_TREE) > > + len2 = (TREE_CODE (src2) == COMPONENT_REF > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src2, 1)) > > + : TYPE_SIZE_UNIT (TREE_TYPE (src2))); > > + if (len == NULL_TREE > > + || TREE_CODE (len) != INTEGER_CST > > + || len2 == NULL_TREE > > + || TREE_CODE (len2) != INTEGER_CST) > > + return; > > + > > + src = get_addr_base_and_unit_offset (src, &offset); > > + src2 = get_addr_base_and_unit_offset (src2, &offset2); > > + if (src == NULL_TREE > > + || src2 == NULL_TREE > > + || offset < offset2) > > + return; > > + > > + if (!operand_equal_p (src, src2, 0)) > > + return; > > + > > + /* [ src + offset2, src + offset2 + len2 - 1 ] is set to val. > > + Make sure that > > + [ src + offset, src + offset + len - 1 ] is a subset of that. */ > > + if (wi::to_widest (len) + (offset - offset2) > wi::to_widest (len2)) > > I think you can use ::to_offset which is more efficient. That is reasonable. > > + return; > > + > > + if (dump_file && (dump_flags & TDF_DETAILS)) > > + { > > + fprintf (dump_file, "Simplified\n "); > > + print_gimple_stmt (dump_file, stmt, 0, dump_flags); > > + fprintf (dump_file, "after previous\n "); > > + print_gimple_stmt (dump_file, defstmt, 0, dump_flags); > > + } > > + > > + if (is_gimple_assign (stmt)) > > (*) a better check would be if val is zero because even a memcpy > could be optimized to an assignment from {}. I suppose you're > doing it this way because of simplicity and because we're > late in the compilation and thus it doesn't matter in practice > for optimization? (the 2nd destination could become > non-TREE_ADDRESSABLE, enabling better alias disambiguation) Is memset and = {} equivalent even for aggregates with paddings? If val is not zero, then I can only handle it if stmt is memcpy, (or in theory if stmt is assignment, and dest is addressable it could be transformed into memset). If val is zero, and dest isn't volatile and the size matches the size of dest, then perhaps it could be transformed into = {} (if there are no paddings/bitfields or if it doesn't matter), I haven't done that for simplicity indeed. Jakub