From: Levy Hsu
To: gcc-patches@gcc.gnu.org
Cc: admin@levyhsu.com, liwei.xu@intel.com, crazylht@gmail.com, ubizjak@gmail.com
Subject: [PATCH] x86: Emit cvtne2ps2bf16 for odd increasing perm in __builtin_shufflevector
Date: Fri, 14 Jun 2024 01:32:03 +0000
Message-ID: <20240614013211.274643-1-admin@levyhsu.com>

gcc/ChangeLog:

	* config/i386/i386-expand.cc (ix86_vectorize_vec_perm_const):
	Convert BF to HI using subreg.
	* config/i386/predicates.md (vcvtne2ps2bf_parallel): New
	predicate that matches an odd increasing permutation.
	* config/i386/sse.md (vpermt2_sepcial_bf16_shuffle_): New
	define_insn_and_split.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/vpermt2-special-bf16-shufflue.c: New test.
---
 gcc/config/i386/i386-expand.cc                |  4 +--
 gcc/config/i386/predicates.md                 | 11 ++++++
 gcc/config/i386/sse.md                        | 35 +++++++++++++++++++
 .../i386/vpermt2-special-bf16-shufflue.c      | 27 ++++++++++++++
 4 files changed, 75 insertions(+), 2 deletions(-)
 create mode 100755 gcc/testsuite/gcc.target/i386/vpermt2-special-bf16-shufflue.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 312329e550b..3d599c0651a 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -23657,8 +23657,8 @@ ix86_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
   if (GET_MODE_SIZE (vmode) == 64 && !TARGET_EVEX512)
     return false;
 
-  /* For HF mode vector, convert it to HI using subreg.  */
-  if (GET_MODE_INNER (vmode) == HFmode)
+  /* For HF and BF mode vector, convert it to HI using subreg.  */
+  if (GET_MODE_INNER (vmode) == HFmode || GET_MODE_INNER (vmode) == BFmode)
     {
       machine_mode orig_mode = vmode;
       vmode = mode_for_vector (HImode,
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 7afe3100cb7..1676c50de71 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -2322,3 +2322,14 @@
 
   return true;
 })
+
+;; Check that each element is odd and incrementally increasing from 1
+(define_predicate "vcvtne2ps2bf_parallel"
+  (and (match_code "const_vector")
+       (match_code "const_int" "a"))
+{
+  for (int i = 0; i < XVECLEN (op, 0); ++i)
+    if (INTVAL (XVECEXP (op, 0, i)) != (2 * i + 1))
+      return false;
+  return true;
+})
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 680a46a0b08..5ddd1c0a778 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -30698,3 +30698,38 @@
   "TARGET_AVXVNNIINT16"
   "vpdp\t{%3, %2, %0|%0, %2, %3}"
    [(set_attr "prefix" "vex")])
+
+(define_mode_attr hi_cvt_bf
+  [(V8HI "v8bf") (V16HI "v16bf") (V32HI "v32bf")])
+
+(define_mode_attr HI_CVT_BF
+  [(V8HI "V8BF") (V16HI "V16BF") (V32HI "V32BF")])
+
+(define_insn_and_split "vpermt2_sepcial_bf16_shuffle_"
+  [(set (match_operand:VI2_AVX512F 0 "register_operand")
+        (unspec:VI2_AVX512F
+          [(match_operand:VI2_AVX512F 1 "vcvtne2ps2bf_parallel")
+           (match_operand:VI2_AVX512F 2 "register_operand")
+           (match_operand:VI2_AVX512F 3 "nonimmediate_operand")]
+           UNSPEC_VPERMT2))]
+  "TARGET_AVX512VL && TARGET_AVX512BF16 && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx op0 = gen_reg_rtx (mode);
+  operands[2] = lowpart_subreg (mode,
+                                force_reg (mode, operands[2]),
+                                mode);
+  operands[3] = lowpart_subreg (mode,
+                                force_reg (mode, operands[3]),
+                                mode);
+
+  emit_insn (gen_avx512f_cvtne2ps2bf16_(op0,
+                                        operands[3],
+                                        operands[2]));
+  emit_move_insn (operands[0], lowpart_subreg (mode, op0,
+                                               mode));
+  DONE;
+}
+[(set_attr "mode" "")])
diff --git a/gcc/testsuite/gcc.target/i386/vpermt2-special-bf16-shufflue.c b/gcc/testsuite/gcc.target/i386/vpermt2-special-bf16-shufflue.c
new file mode 100755
index 00000000000..5c65f2a9884
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vpermt2-special-bf16-shufflue.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bf16 -mavx512vl" } */
+/* { dg-final { scan-assembler-not "vpermi2b" } } */
+/* { dg-final { scan-assembler-times "vcvtne2ps2bf16" 3 } } */
+
+typedef __bf16 v8bf __attribute__((vector_size(16)));
+typedef __bf16 v16bf __attribute__((vector_size(32)));
+typedef __bf16 v32bf __attribute__((vector_size(64)));
+
+v8bf foo0(v8bf a, v8bf b)
+{
+  return __builtin_shufflevector(a, b, 1, 3, 5, 7, 9, 11, 13, 15);
+}
+
+v16bf foo1(v16bf a, v16bf b)
+{
+  return __builtin_shufflevector(a, b, 1, 3, 5, 7, 9, 11, 13, 15,
+                                 17, 19, 21, 23, 25, 27, 29, 31);
+}
+
+v32bf foo2(v32bf a, v32bf b)
+{
+  return __builtin_shufflevector(a, b, 1, 3, 5, 7, 9, 11, 13, 15,
+                                 17, 19, 21, 23, 25, 27, 29, 31,
+                                 33, 35, 37, 39, 41, 43, 45, 47,
+                                 49, 51, 53, 55, 57, 59, 61, 63);
+}
-- 
2.31.1