From mboxrd@z Thu Jan  1 00:00:00 1970
From: Liwei Xu
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, wilson@tuliptree.org, admin@levyhsu.com
Subject: [PATCH] Optimize nested permutation to single VEC_PERM_EXPR [PR54346]
Date: Mon, 26 Sep 2022 14:56:04 +0800
Message-Id: <20220926065604.783193-1-liwei.xu@intel.com>
X-Mailer: git-send-email 2.18.2

This patch implements the optimization requested in PR 54346, which merges

  c = VEC_PERM_EXPR <a, b, VCST0>;
  d = VEC_PERM_EXPR <c, c, VCST1>;
to
  d = VEC_PERM_EXPR <a, b, NEW_VCST>;

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
tree-ssa/forwprop-19.c fails, but I'm not sure whether it is OK to
remove that test.

gcc/ChangeLog:

	PR target/54346
	* match.pd: Merge the indices of the VCSTs, then generate the
	new vec_perm.

gcc/testsuite/ChangeLog:

	PR target/54346
	* gcc.dg/pr54346.c: New test.

Co-authored-by: liuhongt
---
 gcc/match.pd                   | 41 ++++++++++++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/pr54346.c | 13 +++++++++++
 2 files changed, 54 insertions(+)
 create mode 100755 gcc/testsuite/gcc.dg/pr54346.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 345bcb701a5..9219b0a10e1 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8086,6 +8086,47 @@ and,
 	(minus (mult (vec_perm @1 @1 @3) @2) @4)))
 
 
+/* (PR54346) Merge
+   c = VEC_PERM_EXPR <a, b, VCST0>;
+   d = VEC_PERM_EXPR <c, c, VCST1>;
+   to
+   d = VEC_PERM_EXPR <a, b, NEW_VCST>;  */
+
+(simplify
+ (vec_perm (vec_perm@0 @1 @2 VECTOR_CST@3) @0 VECTOR_CST@4)
+ (with
+  {
+    if(!TYPE_VECTOR_SUBPARTS (type).is_constant())
+      return NULL_TREE;
+
+    tree op0;
+    machine_mode result_mode = TYPE_MODE (type);
+    machine_mode op_mode = TYPE_MODE (TREE_TYPE (@1));
+    int nelts = TYPE_VECTOR_SUBPARTS (type).to_constant();
+    vec_perm_builder builder0;
+    vec_perm_builder builder1;
+    vec_perm_builder builder2 (nelts, nelts, 1);
+
+    if (!tree_to_vec_perm_builder (&builder0, @3)
+        || !tree_to_vec_perm_builder (&builder1, @4))
+      return NULL_TREE;
+
+    vec_perm_indices sel0 (builder0, 2, nelts);
+    vec_perm_indices sel1 (builder1, 1, nelts);
+
+    for (int i = 0; i < nelts; i++)
+      builder2.quick_push (sel0[sel1[i].to_constant()]);
+
+    vec_perm_indices sel2 (builder2, 2, nelts);
+
+    if (!can_vec_perm_const_p (result_mode, op_mode, sel2, false))
+      return NULL_TREE;
+
+    op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
+  }
+  (vec_perm @1 @2 { op0; })))
+
+
 /* Match count trailing zeroes for simplify_count_trailing_zeroes in fwprop.
    The canonical form is array[((x & -x) * C) >> SHIFT] where C is a magic
    constant which when multiplied by a power of 2 contains a unique value
diff --git a/gcc/testsuite/gcc.dg/pr54346.c b/gcc/testsuite/gcc.dg/pr54346.c
new file mode 100755
index 00000000000..d87dc3a79a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr54346.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-dse1" } */
+
+typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
+
+void fun (veci a, veci b, veci *i)
+{
+  veci c = __builtin_shuffle (a, b, __extension__ (veci) {1, 4, 2, 7});
+  *i = __builtin_shuffle (c, __extension__ (veci) { 7, 2, 1, 5 });
+}
+
+/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 3, 6, 0, 0 }" "dse1" } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "dse1" } } */
\ No newline at end of file
-- 
2.18.2
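
For anyone checking the index arithmetic by hand, below is a minimal
standalone sketch in plain C of how the two constant selectors compose.
It is not the match.pd code: compose_perm is a hypothetical helper name,
and it assumes a fixed element count known at compile time. The outer
selector is reduced modulo nelts because both of its inputs are the same
vector c; the inner selector is reduced modulo 2 * nelts because it picks
from the concatenation of a and b.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: merge two constant selectors.
   sel0 selects from the concatenation {a, b}, so its entries are taken
   modulo 2 * nelts.  sel1 selects from {c, c}, so its entries are taken
   modulo nelts.  out[i] is the index into {a, b} that lane i of the
   merged permutation reads.  */
static void
compose_perm (const unsigned *sel0, const unsigned *sel1,
              unsigned *out, unsigned nelts)
{
  assert (nelts > 0);  /* Guard the modulo operations below.  */

  for (unsigned i = 0; i < nelts; i++)
    {
      unsigned lane_in_c = sel1[i] % nelts;       /* Which lane of c.  */
      out[i] = sel0[lane_in_c] % (2 * nelts);     /* Where c got it.  */
    }
}
```

With the selectors from the new test, sel0 = {1, 4, 2, 7} and
sel1 = {7, 2, 1, 5}, this gives {7, 2, 4, 4} relative to (a, b). The
{ 3, 6, 0, 0 } that the test scans for appears to be the same selection
expressed with the operands swapped to (b, a).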