From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: Jakub Jelinek, Jeffrey Law, Jan Hubicka, Uros Bizjak
Subject: V2 [PATCH 0/6] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
Date: Sat, 29 Feb 2020 14:16:00 -0000
Message-Id: <20200229141608.88967-1-hjl.tools@gmail.com>

This patch set was originally submitted in Feb 2019:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01841.html

I broke it into 6 smaller patches for easy review.

On x86, when AVX and AVX512 are enabled, vector move instructions can
be encoded with either 2-byte/3-byte VEX (AVX) or 4-byte EVEX (AVX512):

   0:	c5 f9 6f d1          	vmovdqa %xmm1,%xmm2
   4:	62 f1 fd 08 6f d1    	vmovdqa64 %xmm1,%xmm2

We prefer VEX encoding over EVEX since VEX is shorter.  Also AVX512F
only supports 512-bit vector moves.  AVX512F + AVX512VL supports 128-bit
and 256-bit vector moves.  xmm16-xmm31 and ymm16-ymm31 are disallowed in
128-bit and 256-bit modes when AVX512VL is disabled.  Mode attributes on
x86 vector move patterns indicate target preferences of vector move
encoding.  For scalar register to register move, we can use 512-bit
vector move instructions to move 32-bit/64-bit scalar if AVX512VL isn't
available.  With AVX512F and AVX512VL, we should use VEX encoding for
128-bit/256-bit vector moves if upper 16 vector registers aren't used.
This patch adds a function, ix86_output_ssemov, to generate vector moves:

1. If zmm registers are used, use EVEX encoding.
2. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE or VEX encoding
   will be generated.
3. If xmm16-xmm31/ymm16-ymm31 registers are used:
   a. With AVX512VL, AVX512VL vector moves will be generated.
   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
      move will be done with zmm register move.

There is no need to set mode attribute to XImode explicitly since
ix86_output_ssemov can properly encode xmm16-xmm31/ymm16-ymm31 registers
with and without AVX512VL.
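
The selection rules above can be sketched as a small standalone C
function.  This is only an illustration under stated assumptions, not
the code in the patch: the names choose_move_encoding, vec_width,
move_encoding and is_ext_reg are hypothetical, registers are modeled as
plain numbers 0-31, and only register to register moves are covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration; these are not GCC internals.  */
enum vec_width { WIDTH_128 = 128, WIDTH_256 = 256, WIDTH_512 = 512 };

enum move_encoding
{
  ENC_SSE_OR_VEX,  /* Rule 2: legacy SSE or VEX encoding.  */
  ENC_EVEX_VL,     /* Rule 3a: EVEX move at 128/256 bits (AVX512VL).  */
  ENC_EVEX_ZMM     /* Rules 1 and 3b: EVEX 512-bit (zmm) move.  */
};

/* Return true if REGNO names one of the upper 16 vector registers
   (xmm16-xmm31/ymm16-ymm31).  */
static bool
is_ext_reg (int regno)
{
  assert (regno >= 0 && regno <= 31);
  return regno >= 16;
}

/* Pick the encoding for a register to register vector move of WIDTH
   bits from SRC_REGNO to DEST_REGNO.  HAVE_AVX512VL says whether
   AVX512VL is enabled.  */
static enum move_encoding
choose_move_encoding (enum vec_width width, int dest_regno,
                      int src_regno, bool have_avx512vl)
{
  /* Rule 1: zmm registers always use EVEX.  */
  if (width == WIDTH_512)
    return ENC_EVEX_ZMM;

  /* Rule 2: without upper 16 registers, prefer the shorter encoding.  */
  if (!is_ext_reg (dest_regno) && !is_ext_reg (src_regno))
    return ENC_SSE_OR_VEX;

  /* Rule 3: upper 16 registers need EVEX; without AVX512VL the move
     has to be done as a full zmm register move.  */
  return have_avx512vl ? ENC_EVEX_VL : ENC_EVEX_ZMM;
}
```

In this sketch, a 128-bit move between xmm16 and xmm31 without AVX512VL
falls into case 3b and is done as a zmm move, while the same move
between xmm1 and xmm2 stays SSE/VEX regardless of AVX512VL.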

Tested on AVX2 and AVX512 with and without --with-arch=native.

H.J. Lu (6):
  i386: Properly encode vector registers in vector move
  i386: Use ix86_output_ssemov for DImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SImode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for DFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for SFmode TYPE_SSEMOV
  i386: Use ix86_output_ssemov for MMX TYPE_SSEMOV

 gcc/config/i386/i386-protos.h              |   2 +
 gcc/config/i386/i386.c                     | 242 ++++++++++++++++++
 gcc/config/i386/i386.md                    | 212 +--------------
 gcc/config/i386/mmx.md                     |  29 +--
 gcc/config/i386/predicates.md              |   5 -
 gcc/config/i386/sse.md                     |  98 +------
 .../gcc.target/i386/avx512vl-vmovdqa64-1.c |   7 +-
 gcc/testsuite/gcc.target/i386/pr89229-2a.c |  15 ++
 gcc/testsuite/gcc.target/i386/pr89229-2b.c |  13 +
 gcc/testsuite/gcc.target/i386/pr89229-2c.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-3a.c |  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-3b.c |  12 +
 gcc/testsuite/gcc.target/i386/pr89229-3c.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-4a.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr89229-4b.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-4c.c |   7 +
 gcc/testsuite/gcc.target/i386/pr89229-5a.c |  17 ++
 gcc/testsuite/gcc.target/i386/pr89229-5b.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-5c.c |   7 +
 gcc/testsuite/gcc.target/i386/pr89229-6a.c |  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-6b.c |   7 +
 gcc/testsuite/gcc.target/i386/pr89229-6c.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-7a.c |  16 ++
 gcc/testsuite/gcc.target/i386/pr89229-7b.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89229-7c.c |   6 +
 gcc/testsuite/gcc.target/i386/pr89346.c    |  15 ++
 26 files changed, 465 insertions(+), 330 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr89346.c

-- 
2.24.1