Date: Thu, 13 Oct 2022 15:19:56 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/107247 - reduce SLP reduction accumulator
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Message-Id: <20221013131957.8C67013AAA@imap2.suse-dmz.suse.de>

The following makes sure to reduce a multi-vector SLP reduction
accumulator to a single vector using vector operations if
easily possible (if the number of lanes in the vector type is
a multiple of the number of scalar accumulators).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

	PR tree-optimization/107247
	* tree-vect-loop.cc (vect_create_epilog_for_reduction):
	Reduce multi vector SLP reduction accumulators.  Check
	the adjusted number of accumulator vectors against one
	for the re-use in the epilogue.
---
 gcc/tree-vect-loop.cc | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index b1442a93581..98a943d8a4b 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -5642,9 +5642,21 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
      we may end up with more than one vector result.  Here we reduce them
      to one vector.
 
+     The same is true for a SLP reduction, e.g.,
+       # a1 = phi
+       # b1 = phi
+       a2 = operation (a1)
+       b2 = operation (a2),
+
+     where we can end up with more than one vector as well.  We can
+     easily accumulate vectors when the number of vector elements is
+     a multiple of the SLP group size.
+
      The same is true if we couldn't use a single defuse cycle.  */
   if (REDUC_GROUP_FIRST_ELEMENT (stmt_info)
       || direct_slp_reduc
+      || (slp_reduc
+	  && constant_multiple_p (TYPE_VECTOR_SUBPARTS (vectype), group_size))
       || ncopies > 1)
     {
       gimple_seq stmts = NULL;
@@ -6233,7 +6245,7 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
 
   /* Record this operation if it could be reused by the epilogue loop.  */
   if (STMT_VINFO_REDUC_TYPE (reduc_info) == TREE_CODE_REDUCTION
-      && vec_num == 1)
+      && reduc_inputs.length () == 1)
     loop_vinfo->reusable_accumulators.put (scalar_results[0],
 					    { orig_reduc_input, reduc_info });
 
-- 
2.35.3
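
For readers who want to see why the "multiple of the SLP group size"
condition makes the accumulation easy, here is a minimal standalone
sketch.  It is not GCC code: Vec, can_accumulate_lanewise,
reduce_to_single_vector and scalar_results are hypothetical names, and
it assumes the usual interleaved SLP layout in which element e of the
concatenated accumulator vectors belongs to scalar accumulator
e % group_size.  Under that assumption, when the lane count is a
multiple of the group size, lane j of every vector maps to the same
scalar accumulator, so plain lane-wise vector additions are enough to
get down to one vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not GCC code: one "vector register" is a
// std::vector<int> with a fixed number of lanes.
using Vec = std::vector<int>;

// True when lane j of every accumulator vector belongs to the same
// scalar accumulator (j % group_size).  This loosely models the
// constant_multiple_p (TYPE_VECTOR_SUBPARTS (vectype), group_size) test
// for fixed-length vectors.
bool can_accumulate_lanewise(std::size_t lanes, std::size_t group_size)
{
  return group_size != 0 && lanes % group_size == 0;
}

// Combine all accumulator vectors into one using lane-wise additions
// only.  All vectors are assumed to have the same number of lanes.
Vec reduce_to_single_vector(const std::vector<Vec> &accs)
{
  assert(!accs.empty());
  Vec result = accs[0];
  for (std::size_t v = 1; v < accs.size(); ++v)
    for (std::size_t j = 0; j < result.size(); ++j)
      result[j] += accs[v][j];
  return result;
}

// Final step: sum the lanes that belong to each scalar accumulator.
std::vector<int> scalar_results(const Vec &acc, std::size_t group_size)
{
  assert(can_accumulate_lanewise(acc.size(), group_size));
  std::vector<int> out(group_size, 0);
  for (std::size_t j = 0; j < acc.size(); ++j)
    out[j % group_size] += acc[j];
  return out;
}
```

With two accumulators and four lanes, the vectors {1, 10, 2, 20} and
{3, 30, 4, 40} can be added lane-wise first and split into the two
scalar sums afterwards.  With a group size of three and four lanes the
lane-to-accumulator mapping would differ between vectors, which is the
case the new condition excludes.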