From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com
Subject: [PATCH] [x86] Refine maskstore patterns with UNSPEC_MASKMOV.
Date: Tue, 27 Jun 2023 13:38:06 +0800
Message-Id: <20230627053806.2880955-1-hongtao.liu@intel.com>

At the RTL level we cannot guarantee that a maskstore is not optimized
into some other full-memory access, because the current patterns are
indistinguishable from plain vector stores as far as the RTL is
concerned.  To close this hole, this patch refines the maskstore
patterns and the corresponding intrinsic patterns with an unspec.

One thing I'm not sure about is VCOND_EXPR: should VCOND_EXPR also
expect fault suppression for masked-out elements?  Currently we still
use vec_merge for both the AVX2 and AVX512 targets.

------------------------
Similar to r14-2070-gc79476da46728e:

If mem_addr points to a memory region with less than whole vector size
bytes of accessible memory and k is a mask that would prevent reading
the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent it
from being transformed into any other whole-memory-access instruction.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

	PR rtl-optimization/110237
	* config/i386/sse.md (<avx512>_store<mode>_mask): Refine with
	UNSPEC_MASKMOV.
	(maskstore<mode><avx512fmaskmodelower>): Ditto.
	(*<avx512>_store<mode>_mask): New define_insn, it's renamed
	from original <avx512>_store<mode>_mask.
---
 gcc/config/i386/sse.md | 69 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 57 insertions(+), 12 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3b50c7117f8..812cfca4b92 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1608,7 +1608,7 @@ (define_insn "<avx512>_blendm<mode>"
    (set_attr "prefix" "evex")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "<avx512>_store<mode>_mask"
+(define_insn "*<avx512>_store<mode>_mask"
   [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
 	(vec_merge:V48_AVX512VL
 	  (match_operand:V48_AVX512VL 1 "register_operand" "v")
@@ -1636,7 +1636,7 @@ (define_insn "<avx512>_store<mode>_mask"
    (set_attr "memory" "store")
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_insn "<avx512>_store<mode>_mask"
+(define_insn "*<avx512>_store<mode>_mask"
   [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
 	(vec_merge:VI12HFBF_AVX512VL
 	  (match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
@@ -27008,21 +27008,66 @@ (define_expand "maskstore<mode><sseintvecmodelower>"
   "TARGET_AVX")
 
 (define_expand "maskstore<mode><avx512fmaskmodelower>"
-  [(set (match_operand:V48H_AVX512VL 0 "memory_operand")
-	(vec_merge:V48H_AVX512VL
-	  (match_operand:V48H_AVX512VL 1 "register_operand")
-	  (match_dup 0)
-	  (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
+	(unspec:V48_AVX512VL
+	  [(match_operand:V48_AVX512VL 1 "register_operand")
+	   (match_dup 0)
+	   (match_operand:<avx512fmaskmode> 2 "register_operand")]
+	  UNSPEC_MASKMOV))]
   "TARGET_AVX512F")
 
 (define_expand "maskstore<mode><avx512fmaskmodelower>"
-  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
-	(vec_merge:VI12_AVX512VL
-	  (match_operand:VI12_AVX512VL 1 "register_operand")
-	  (match_dup 0)
-	  (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand")
+	(unspec:VI12HFBF_AVX512VL
+	  [(match_operand:VI12HFBF_AVX512VL 1 "register_operand")
+	   (match_dup 0)
+	   (match_operand:<avx512fmaskmode> 2 "register_operand")]
+	  UNSPEC_MASKMOV))]
   "TARGET_AVX512BW")
 
+(define_insn "<avx512>_store<mode>_mask"
+  [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
+	(unspec:V48_AVX512VL
+	  [(match_operand:V48_AVX512VL 1 "register_operand" "v")
+	   (match_dup 0)
+	   (match_operand:<avx512fmaskmode> 2 "register_operand" "Yk")]
+	  UNSPEC_MASKMOV))]
+  "TARGET_AVX512F"
+{
+  if (FLOAT_MODE_P (GET_MODE_INNER (<MODE>mode)))
+    {
+      if (misaligned_operand (operands[0], <MODE>mode))
+	return "vmovu<ssemodesuffix>\t{%1, %0%{%2%}|%0%{%2%}, %1}";
+      else
+	return "vmova<ssemodesuffix>\t{%1, %0%{%2%}|%0%{%2%}, %1}";
+    }
+  else
+    {
+      if (misaligned_operand (operands[0], <MODE>mode))
+	return "vmovdqu<ssescalarsize>\t{%1, %0%{%2%}|%0%{%2%}, %1}";
+      else
+	return "vmovdqa<ssescalarsize>\t{%1, %0%{%2%}|%0%{%2%}, %1}";
+    }
+}
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "memory" "store")
+   (set_attr "mode" "<sseinsnmode>")])
+
+(define_insn "<avx512>_store<mode>_mask"
+  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
+	(unspec:VI12HFBF_AVX512VL
+	  [(match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
+	   (match_dup 0)
+	   (match_operand:<avx512fmaskmode> 2 "register_operand" "Yk")]
+	  UNSPEC_MASKMOV))]
+  "TARGET_AVX512BW"
+  "vmovdqu<ssescalarsize>\t{%1, %0%{%2%}|%0%{%2%}, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "memory" "store")
+   (set_attr "mode" "<sseinsnmode>")])
+
 (define_expand "cbranch<mode>4"
   [(set (reg:CC FLAGS_REG)
 	(compare:CC (match_operand:VI48_AVX 1 "register_operand")
-- 
2.39.1.388.g2fc9e9ca3c