From: "Kong, Lingling"
To: "Liu, Hongtao", "gcc-patches@gcc.gnu.org"
CC: "Kong, Lingling"
Subject: [PATCH] i386: Enable intrinsics that convert float and bf16 data to each other.
Date: Wed, 22 Dec 2021 03:28:06 +0000

Hi,

This patch enables intrinsics that convert between float and bf16 data.

Ok for master?

gcc/ChangeLog:

	* config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Add new intrinsic.
	(_mm512_cvtpbh_ps): Likewise.
	(_mm512_maskz_cvtpbh_ps): Likewise.
	(_mm512_mask_cvtpbh_ps): Likewise.
	* config/i386/avx512bf16vlintrin.h (_mm_cvtness_sbh): Likewise.
	(_mm_cvtpbh_ps): Likewise.
	(_mm256_cvtpbh_ps): Likewise.
	(_mm_maskz_cvtpbh_ps): Likewise.
	(_mm256_maskz_cvtpbh_ps): Likewise.
	(_mm_mask_cvtpbh_ps): Likewise.
	(_mm256_mask_cvtpbh_ps): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: New test.
	* gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c: Ditto.
	* gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c: Ditto.
	* gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c: Ditto.
---
 gcc/config/i386/avx512bf16intrin.h            | 36 +++++++++++
 gcc/config/i386/avx512bf16vlintrin.h          | 63 +++++++++++++++++++
 .../gcc.target/i386/avx512bf16-cvtsbh2ss-1.c  | 15 +++++
 .../gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c | 20 ++++++
 .../i386/avx512bf16vl-cvtness2sbh-1.c         | 14 +++++
 .../i386/avx512bf16vl-vcvtpbh2ps-1.c          | 29 +++++++++
 6 files changed, 177 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c

diff --git a/gcc/config/i386/avx512bf16intrin.h b/gcc/config/i386/avx512bf16intrin.h
index 9afc6bd7d2b..6b62dc3e398 100644
--- a/gcc/config/i386/avx512bf16intrin.h
+++ b/gcc/config/i386/avx512bf16intrin.h
@@ -41,6 +41,16 @@ typedef short __v32bh __attribute__ ((__vector_size__ (64)));
    vector types, and their scalar components.  */
 typedef short __m512bh __attribute__ ((__vector_size__ (64), __may_alias__));
 
+/* Convert One BF16 Data to One Single Float Data.  */
+extern __inline float
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtsbh_ss (__bfloat16 __A)
+{
+  union{ float a; unsigned int b;} __tmp;
+  __tmp.b = ((unsigned int)(__A)) << 16;
+  return __tmp.a;
+}
+
 /* vcvtne2ps2bf16 */
 
 extern __inline __m512bh
@@ -110,6 +120,32 @@ _mm512_maskz_dpbf16_ps (__mmask16 __A, __m512 __B, __m512bh __C, __m512bh __D)
   return (__m512)__builtin_ia32_dpbf16ps_v16sf_maskz(__B, __C, __D, __A);
 }
 
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_cvtpbh_ps (__m256bh __A)
+{
+  return (__m512)_mm512_castsi512_ps ((__m512i)_mm512_slli_epi32 (
+	 (__m512i)_mm512_cvtepi16_epi32 ((__m256i)__A), 16));
+}
+
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_cvtpbh_ps (__mmask16 __U, __m256bh __A)
+{
+  return (__m512)_mm512_castsi512_ps ((__m512i) _mm512_slli_epi32 (
+	 (__m512i)_mm512_maskz_cvtepi16_epi32 (
+	 (__mmask16)__U, (__m256i)__A), 16));
+}
+
+extern __inline __m512
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_cvtpbh_ps (__m512 __S, __mmask16 __U, __m256bh __A)
+{
+  return (__m512)_mm512_castsi512_ps ((__m512i)(_mm512_mask_slli_epi32 (
+	 (__m512i)__S, (__mmask16)__U,
+	 (__m512i)_mm512_cvtepi16_epi32 ((__m256i)__A), 16)));
+}
+
 #ifdef __DISABLE_AVX512BF16__
 #undef __DISABLE_AVX512BF16__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/avx512bf16vlintrin.h b/gcc/config/i386/avx512bf16vlintrin.h
index 6dd396d4008..5e6a6503aa6 100644
--- a/gcc/config/i386/avx512bf16vlintrin.h
+++ b/gcc/config/i386/avx512bf16vlintrin.h
@@ -43,6 +43,7 @@ typedef short __v8bh __attribute__ ((__vector_size__ (16)));
 typedef short __m256bh __attribute__ ((__vector_size__ (32), __may_alias__));
 typedef short __m128bh __attribute__ ((__vector_size__ (16), __may_alias__));
 
+typedef unsigned short __bfloat16;
 /* vcvtne2ps2bf16 */
 
 extern __inline __m256bh
@@ -175,6 +176,68 @@ _mm_maskz_dpbf16_ps (__mmask8 __A, __m128 __B, __m128bh __C, __m128bh __D)
   return (__m128)__builtin_ia32_dpbf16ps_v4sf_maskz(__B, __C, __D, __A);
 }
 
+extern __inline __bfloat16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtness_sbh (float __A)
+{
+  __v4sf __V = {__A, 0, 0, 0};
+  __v8hi __R = __builtin_ia32_cvtneps2bf16_v4sf_mask ((__v4sf)__V,
+	       (__v8hi)_mm_undefined_si128 (), (__mmask8)-1);
+  return __R[0];
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtpbh_ps (__m128bh __A)
+{
+  return (__m128)_mm_castsi128_ps ((__m128i)_mm_slli_epi32 (
+	 (__m128i)_mm_cvtepi16_epi32 ((__m128i)__A), 16));
+}
+
+extern __inline __m256
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_cvtpbh_ps (__m128bh __A)
+{
+  return (__m256)_mm256_castsi256_ps ((__m256i)_mm256_slli_epi32 (
+	 (__m256i)_mm256_cvtepi16_epi32 ((__m128i)__A), 16));
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_cvtpbh_ps (__mmask8 __U, __m128bh __A)
+{
+  return (__m128)_mm_castsi128_ps ((__m128i)_mm_slli_epi32 (
+	 (__m128i)_mm_maskz_cvtepi16_epi32 (
+	 (__mmask8)__U, (__m128i)__A), 16));
+}
+
+extern __inline __m256
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_cvtpbh_ps (__mmask8 __U, __m128bh __A)
+{
+  return (__m256)_mm256_castsi256_ps ((__m256i)_mm256_slli_epi32 (
+	 (__m256i)_mm256_maskz_cvtepi16_epi32 (
+	 (__mmask8)__U, (__m128i)__A), 16));
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_cvtpbh_ps (__m128 __S, __mmask8 __U, __m128bh __A)
+{
+  return (__m128)_mm_castsi128_ps ((__m128i)_mm_mask_slli_epi32 (
+	 (__m128i)__S, (__mmask8)__U, (__m128i)_mm_cvtepi16_epi32 (
+	 (__m128i)__A), 16));
+}
+
+extern __inline __m256
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_cvtpbh_ps (__m256 __S, __mmask8 __U, __m128bh __A)
+{
+  return (__m256)_mm256_castsi256_ps ((__m256i)_mm256_mask_slli_epi32 (
+	 (__m256i)__S, (__mmask8)__U, (__m256i)_mm256_cvtepi16_epi32 (
+	 (__m128i)__A), 16));
+}
+
 #ifdef __DISABLE_AVX512BF16VL__
 #undef __DISABLE_AVX512BF16VL__
 #pragma GCC pop_options
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
new file mode 100644
index 00000000000..bf29a69a5b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bf16 -O2" } */
+/* { dg-final { scan-assembler-times "sall\[ \\t\]+\[^\{\n\]*16" 1 } } */
+/* { dg-final { scan-assembler-times "movl" 1 } } */
+
+#include <immintrin.h>
+
+volatile __bfloat16 x1;
+volatile float res;
+
+void extern
+avx512bf16_test (void)
+{
+  res = _mm_cvtsbh_ss (x1);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c b/gcc/testsuite/gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c
new file mode 100644
index 00000000000..a2ae4bef455
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16-vcvtpbh2ps-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bf16 -O2" } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\n\]*%zmm\[0-9\](?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %zmm\[0-9]\+, %zmm\[0-9]\+(?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %zmm\[0-9]\+, %zmm\[0-9]\+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*%zmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __m256bh x1;
+volatile __m512 res;
+volatile __mmask16 m16;
+
+void extern
+avx512bf16_test (void)
+{
+  res = _mm512_cvtpbh_ps (x1);
+  res = _mm512_mask_cvtpbh_ps (res, m16, x1);
+  res = _mm512_maskz_cvtpbh_ps (m16, x1);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c b/gcc/testsuite/gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c
new file mode 100644
index 00000000000..8f21b1bfdae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bf16 -mavx512vl -O2" } */
+/* { dg-final { scan-assembler-times "vcvtneps2bf16\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\r]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __bfloat16 res;
+volatile float x1;
+
+void extern
+avx512bf16_test (void)
+{
+  res = _mm_cvtness_sbh (x1);
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c b/gcc/testsuite/gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c
new file mode 100644
index 00000000000..98f458b49f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16vl-vcvtpbh2ps-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bf16 -mavx512vl -O2" } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\n\]*%ymm\[0-9\](?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %ymm\[0-9]\+, %ymm\[0-9]\+(?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %ymm\[0-9]\+, %ymm\[0-9]\+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+(?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %xmm\[0-9]\+, %xmm\[0-9]\+(?:\n|\[ \\t\]+#)" 2 } } */
+/* { dg-final { scan-assembler-times "vpslld\[ \t]\+\\\$16, %xmm\[0-9]\+, %xmm\[0-9]\+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
+/* { dg-final { scan-assembler-times "vpmovsxwd\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
+
+#include <immintrin.h>
+
+volatile __m128bh x1;
+volatile __m128 res1;
+volatile __m256 res2;
+volatile __mmask8 m8;
+
+void extern
+avx512bf16_test (void)
+{
+  res2 = _mm256_cvtpbh_ps (x1);
+  res2 = _mm256_mask_cvtpbh_ps (res2, m8, x1);
+  res2 = _mm256_maskz_cvtpbh_ps (m8, x1);
+
+  res1 = _mm_cvtpbh_ps (x1);
+  res1 = _mm_mask_cvtpbh_ps (res1, m8, x1);
+  res1 = _mm_maskz_cvtpbh_ps (m8, x1);
+}
-- 
2.18.1
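
For reference, the widening intrinsics in this patch all rely on the same bit-level fact: a bf16 value is the upper 16 bits of an IEEE single, so shifting it left by 16 yields the float. The portable sketch below illustrates that, plus a software round-to-nearest-even narrowing for comparison. It is not part of the patch. The names bf16_bits, bf16_bits_to_float and float_to_bf16_bits are hypothetical, and the narrowing helper is only an approximation of what vcvtneps2bf16 does: it does not try to model the instruction's denormal or NaN handling exactly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only; none of this is in the patch.  */
typedef uint16_t bf16_bits;

static_assert (sizeof (float) == sizeof (uint32_t),
	       "sketch assumes a 32-bit IEEE single");

/* Widen: put the 16 bf16 bits into the upper half of a 32-bit float,
   the same shift-by-16 that the new *_cvtpbh_ps intrinsics perform.  */
static float
bf16_bits_to_float (bf16_bits h)
{
  uint32_t u = (uint32_t) h << 16;
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Narrow: drop the low 16 bits with round-to-nearest-even.
   Assumption: plain software rounding, not the exact hardware behavior.  */
static bf16_bits
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  if ((u & 0x7fffffffu) > 0x7f800000u)
    /* NaN: keep the top bits and force a quiet NaN.  */
    return (bf16_bits) ((u >> 16) | 0x0040u);
  /* Add 0x7fff plus the lowest kept bit so that ties go to even.  */
  u += 0x7fffu + ((u >> 16) & 1u);
  return (bf16_bits) (u >> 16);
}
```

With these helpers, bf16_bits_to_float (0x3f80) gives 1.0f, and float_to_bf16_bits (1.0f) gives 0x3f80 again, since 1.0f has no bits set in its low half.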