From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 107544 invoked by alias); 17 Mar 2019 20:47:36 -0000 Mailing-List: contact binutils-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: binutils-owner@sourceware.org Received: (qmail 107529 invoked by uid 89); 17 Mar 2019 20:47:36 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=HX-Spam-Relays-External:209.85.210.195, H*RU:2607, HX-Spam-Relays-External:2607, H*r:2607 X-HELO: mail-pf1-f195.google.com Received: from mail-pf1-f195.google.com (HELO mail-pf1-f195.google.com) (209.85.210.195) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 17 Mar 2019 20:47:28 +0000 Received: by mail-pf1-f195.google.com with SMTP id d25so9801936pfn.8 for ; Sun, 17 Mar 2019 13:47:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=kOJbeIf+RX+omcbX4zkxVQe+U7pPt30XVcKygwtA8BU=; b=WXuIehzPIRTlu75JQaYO+PVuVKuO8JdEZcQkiA25R+BmbWZ2E76Q0MYTs3tndL8CYn Aus3LmRF1EIb1x8jQPC+RXZuQ5Em5p0G8uMlYptDpphUo3sjZoJ1zsUZu+4FKjxWolT2 VBwCwofuuNdRlTSMNpXPJwYYDJteYAOv/JVF6qsicPCCS7vgM/RQbmaYiVv/A43H0IJ/ O11G9kfzEjERlPh77gLkTqiA688z7rjVpZVoE6V5c31sywqsnAsWjlzNgG16MS2o6tjv qrlwUcNRcACo/cvc5HOaNvWTM6eB0aKdyualPg8LBLWJQWUa6DVfrO45gWM3RRS23EU1 AgHA== Return-Path: Received: from gnu-gram-1.localdomain ([2607:fb90:20a:1689:c99d:1526:2cde:2978]) by smtp.gmail.com with ESMTPSA id o5sm32244536pgc.16.2019.03.17.13.47.19 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 17 Mar 2019 13:47:24 -0700 (PDT) Received: by gnu-gram-1.localdomain (Postfix, from userid 1000) id 7A3D0E0031; Mon, 18 Mar 2019 04:47:12 +0800 (CST) Date: Sun, 17 Mar 2019 20:47:00 -0000 From: "H.J. Lu" To: binutils@sourceware.org Subject: V2 [PATCH] x86: Optimize EVEX vector load/store instructions Message-ID: <20190317204712.GA6721@gmail.com> References: <20190315235414.11609-1-hjl.tools@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190315235414.11609-1-hjl.tools@gmail.com> User-Agent: Mutt/1.11.3 (2019-02-01) X-IsSubscribed: yes X-SW-Source: 2019-03/txt/msg00093.txt.bz2 On Sat, Mar 16, 2019 at 07:54:14AM +0800, H.J. Lu wrote: > When there is no write mask, we can encode lower 16 128-bit/256-bit > vector register load and store instructions as VEX vector register > load and store instructions with -O2. > > gas/ > > PR gas/24348 > * config/tc-i386.c (optimize_encoding): Encode EVEX 128-bit and > 256-bit vector register load/store instructions as VEX vector > register load/store instructions for -O2. > (md_parse_option): Set optimize to INT_MAX for -Os. > * doc/c-i386.texi: Update -O2 documentation. > This is the patch I am checking in. H.J. ---- When there is no write mask, we can encode lower 16 128-bit/256-bit vector register load and store instructions as VEX vector register load and store instructions with -O1. gas/ PR gas/24348 * config/tc-i386.c (optimize_encoding): Encode EVEX 128-bit and 256-bit vector register load/store instructions as VEX vector register load/store instructions for -O1. * doc/c-i386.texi: Update -O1 documentation. * testsuite/gas/i386/i386.exp: Run PR gas/24348 tests. * testsuite/gas/i386/optimize-1.s: Add tests for EVEX vector load/store instructions. * testsuite/gas/i386/optimize-2.s: Likewise. * testsuite/gas/i386/optimize-3.s: Likewise. * testsuite/gas/i386/optimize-5.s: Likewise. * testsuite/gas/i386/x86-64-optimize-2.s: Likewise. * testsuite/gas/i386/x86-64-optimize-3.s: Likewise. * testsuite/gas/i386/x86-64-optimize-4.s: Likewise. * testsuite/gas/i386/x86-64-optimize-5.s: Likewise. * testsuite/gas/i386/x86-64-optimize-6.s: Likewise. * testsuite/gas/i386/optimize-1.d: Updated. * testsuite/gas/i386/optimize-2.d: Likewise. * testsuite/gas/i386/optimize-3.d: Likewise. * testsuite/gas/i386/optimize-4.d: Likewise. * testsuite/gas/i386/optimize-5.d: Likewise. * testsuite/gas/i386/x86-64-optimize-2.d: Likewise. * testsuite/gas/i386/x86-64-optimize-3.d: Likewise. * testsuite/gas/i386/x86-64-optimize-4.d: Likewise. * testsuite/gas/i386/x86-64-optimize-5.d: Likewise. * testsuite/gas/i386/x86-64-optimize-6.d: Likewise. * testsuite/gas/i386/optimize-7.d: New file. * testsuite/gas/i386/optimize-7.s: Likewise. * testsuite/gas/i386/x86-64-optimize-8.d: Likewise. * testsuite/gas/i386/x86-64-optimize-8.s: Likewise. opcodes/ PR gas/24348 * i386-opc.tbl: Add Optimize to vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16, vmovdqu32 and vmovdqu64. * i386-tbl.h: Regenerated. --- gas/config/tc-i386.c | 50 ++++++++++ gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/i386.exp | 2 + gas/testsuite/gas/i386/optimize-1.d | 36 +++++++ gas/testsuite/gas/i386/optimize-1.s | 42 ++++++++ gas/testsuite/gas/i386/optimize-1a.d | 36 +++++++ gas/testsuite/gas/i386/optimize-2.d | 72 ++++++++++++++ gas/testsuite/gas/i386/optimize-2.s | 84 ++++++++++++++++ gas/testsuite/gas/i386/optimize-3.d | 6 ++ gas/testsuite/gas/i386/optimize-3.s | 7 ++ gas/testsuite/gas/i386/optimize-4.d | 36 +++++++ gas/testsuite/gas/i386/optimize-5.d | 42 ++++++++ gas/testsuite/gas/i386/optimize-5.s | 7 ++ gas/testsuite/gas/i386/optimize-7.d | 12 +++ gas/testsuite/gas/i386/optimize-7.s | 6 ++ gas/testsuite/gas/i386/x86-64-optimize-2.d | 48 +++++++++ gas/testsuite/gas/i386/x86-64-optimize-2.s | 56 +++++++++++ gas/testsuite/gas/i386/x86-64-optimize-2a.d | 48 +++++++++ gas/testsuite/gas/i386/x86-64-optimize-3.d | 90 +++++++++++++++++ gas/testsuite/gas/i386/x86-64-optimize-3.s | 105 ++++++++++++++++++++ gas/testsuite/gas/i386/x86-64-optimize-4.d | 6 ++ gas/testsuite/gas/i386/x86-64-optimize-4.s | 7 ++ gas/testsuite/gas/i386/x86-64-optimize-5.d | 54 ++++++++++ gas/testsuite/gas/i386/x86-64-optimize-5.s | 7 ++ gas/testsuite/gas/i386/x86-64-optimize-6.d | 54 ++++++++++ gas/testsuite/gas/i386/x86-64-optimize-6.s | 7 ++ gas/testsuite/gas/i386/x86-64-optimize-8.d | 12 +++ gas/testsuite/gas/i386/x86-64-optimize-8.s | 6 ++ opcodes/i386-opc.tbl | 12 +-- opcodes/i386-tbl.h | 12 +-- 30 files changed, 953 insertions(+), 13 deletions(-) create mode 100644 gas/testsuite/gas/i386/optimize-7.d create mode 100644 gas/testsuite/gas/i386/optimize-7.s create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-8.d create mode 100644 gas/testsuite/gas/i386/x86-64-optimize-8.s diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 856c18d672..fa060759ae 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -4075,6 +4075,56 @@ optimize_encoding (void) i.types[j].bitfield.ymmword = 0; } } + else if ((cpu_arch_flags.bitfield.cpuavx + || cpu_arch_isa_flags.bitfield.cpuavx) + && i.vec_encoding != vex_encoding_evex + && !i.types[0].bitfield.zmmword + && !i.mask + && is_evex_encoding (&i.tm) + && (i.tm.base_opcode == 0x666f + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0x666f + || i.tm.base_opcode == 0xf36f + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf36f + || i.tm.base_opcode == 0xf26f + || (i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f) + && i.tm.extension_opcode == None) + { + /* Optimize: -O1: + VOP, one of vmovdqa32, vmovdqa64, vmovdqu8, vmovdqu16, + vmovdqu32 and vmovdqu64: + EVEX VOP %xmmM, %xmmN + -> VEX vmovdqa|vmovdqu %xmmM, %xmmN (M and N < 16) + EVEX VOP %ymmM, %ymmN + -> VEX vmovdqa|vmovdqu %ymmM, %ymmN (M and N < 16) + EVEX VOP %xmmM, mem + -> VEX vmovdqa|vmovdqu %xmmM, mem (M < 16) + EVEX VOP %ymmM, mem + -> VEX vmovdqa|vmovdqu %ymmM, mem (M < 16) + EVEX VOP mem, %xmmN + -> VEX mvmovdqa|vmovdquem, %xmmN (N < 16) + EVEX VOP mem, %ymmN + -> VEX vmovdqa|vmovdqu mem, %ymmN (N < 16) + */ + if (i.tm.base_opcode == 0xf26f) + i.tm.base_opcode = 0xf36f; + else if ((i.tm.base_opcode ^ Opcode_SIMD_IntD) == 0xf26f) + i.tm.base_opcode = 0xf36f ^ Opcode_SIMD_IntD; + i.tm.opcode_modifier.vex + = i.types[0].bitfield.ymmword ? VEX256 : VEX128; + i.tm.opcode_modifier.vexw = VEXW0; + i.tm.opcode_modifier.evex = 0; + i.tm.opcode_modifier.masking = 0; + i.tm.opcode_modifier.disp8memshift = 0; + i.memshift = 0; + for (j = 0; j < 2; j++) + if (operand_type_check (i.types[j], disp) + && i.op[j].disps->X_op == O_constant) + { + i.types[j].bitfield.disp8 + = fits_in_disp8 (i.op[j].disps->X_add_number); + break; + } + } } /* This is the guts of the machine-dependent assembler. LINE points to a diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index 7e5f5c257e..4acd5ff616 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -456,7 +456,9 @@ immediate as 32-bit register load instructions with 31-bit or 32-bits immediates, encode 64-bit register clearing instructions with 32-bit register clearing instructions and encode 256-bit/512-bit VEX/EVEX vector register clearing instructions with 128-bit VEX vector register -clearing instructions. @samp{-O2} includes @samp{-O1} optimization plus +clearing instructions as well as encode 128-bit/256-bit EVEX vector +register load/store instructions with VEX vector register load/store +instructions. @samp{-O2} includes @samp{-O1} optimization plus encodes 256-bit/512-bit EVEX vector register clearing instructions with 128-bit EVEX vector register clearing instructions. @samp{-Os} includes @samp{-O2} optimization plus encodes 16-bit, 32-bit diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index 798bfb564a..3067b4a1f1 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -476,6 +476,7 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_32_check]] run_dump_test "optimize-6a" run_dump_test "optimize-6b" run_dump_test "optimize-6c" + run_dump_test "optimize-7" # These tests require support for 8 and 16 bit relocs, # so we only run them for ELF and COFF targets. @@ -990,6 +991,7 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t run_dump_test "x86-64-optimize-7a" run_dump_test "x86-64-optimize-7b" run_dump_test "x86-64-optimize-7c" + run_dump_test "x86-64-optimize-8" if { ![istarget "*-*-aix*"] && ![istarget "*-*-beos*"] diff --git a/gas/testsuite/gas/i386/optimize-1.d b/gas/testsuite/gas/i386/optimize-1.d index 4358c19c21..70c802c002 100644 --- a/gas/testsuite/gas/i386/optimize-1.d +++ b/gas/testsuite/gas/i386/optimize-1.d @@ -62,4 +62,40 @@ Disassembly of section .text: +[a-f0-9]+: c5 f4 47 e9 kxorw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) #pass diff --git a/gas/testsuite/gas/i386/optimize-1.s b/gas/testsuite/gas/i386/optimize-1.s index f61a176de8..6dcfbc2799 100644 --- a/gas/testsuite/gas/i386/optimize-1.s +++ b/gas/testsuite/gas/i386/optimize-1.s @@ -72,3 +72,45 @@ _start: kandnd %k1, %k1, %k5 kandnq %k1, %k1, %k5 + + vmovdqa32 %xmm1, %xmm2 + vmovdqa64 %xmm1, %xmm2 + vmovdqu8 %xmm1, %xmm2 + vmovdqu16 %xmm1, %xmm2 + vmovdqu32 %xmm1, %xmm2 + vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 127(%eax), %xmm2 + vmovdqa64 127(%eax), %xmm2 + vmovdqu8 127(%eax), %xmm2 + vmovdqu16 127(%eax), %xmm2 + vmovdqu32 127(%eax), %xmm2 + vmovdqu64 127(%eax), %xmm2 + + vmovdqa32 %xmm1, 128(%eax) + vmovdqa64 %xmm1, 128(%eax) + vmovdqu8 %xmm1, 128(%eax) + vmovdqu16 %xmm1, 128(%eax) + vmovdqu32 %xmm1, 128(%eax) + vmovdqu64 %xmm1, 128(%eax) + + vmovdqa32 %ymm1, %ymm2 + vmovdqa64 %ymm1, %ymm2 + vmovdqu8 %ymm1, %ymm2 + vmovdqu16 %ymm1, %ymm2 + vmovdqu32 %ymm1, %ymm2 + vmovdqu64 %ymm1, %ymm2 + + vmovdqa32 127(%eax), %ymm2 + vmovdqa64 127(%eax), %ymm2 + vmovdqu8 127(%eax), %ymm2 + vmovdqu16 127(%eax), %ymm2 + vmovdqu32 127(%eax), %ymm2 + vmovdqu64 127(%eax), %ymm2 + + vmovdqa32 %ymm1, 128(%eax) + vmovdqa64 %ymm1, 128(%eax) + vmovdqu8 %ymm1, 128(%eax) + vmovdqu16 %ymm1, 128(%eax) + vmovdqu32 %ymm1, 128(%eax) + vmovdqu64 %ymm1, 128(%eax) diff --git a/gas/testsuite/gas/i386/optimize-1a.d b/gas/testsuite/gas/i386/optimize-1a.d index e6e6d81fe4..cee2383d84 100644 --- a/gas/testsuite/gas/i386/optimize-1a.d +++ b/gas/testsuite/gas/i386/optimize-1a.d @@ -63,4 +63,40 @@ Disassembly of section .text: +[a-f0-9]+: c5 f4 47 e9 kxorw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) #pass diff --git a/gas/testsuite/gas/i386/optimize-2.d b/gas/testsuite/gas/i386/optimize-2.d index e8a516997a..19467f5c01 100644 --- a/gas/testsuite/gas/i386/optimize-2.d +++ b/gas/testsuite/gas/i386/optimize-2.d @@ -17,4 +17,76 @@ Disassembly of section .text: +[a-f0-9]+: f7 c7 7f 00 00 00 test \$0x7f,%edi +[a-f0-9]+: 66 f7 c7 7f 00 test \$0x7f,%di +[a-f0-9]+: c5 f1 55 e9 vandnpd %xmm1,%xmm1,%xmm5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: 62 f1 7d 48 6f d1 vmovdqa32 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 fd 48 6f d1 vmovdqa64 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7f 48 6f d1 vmovdqu8 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 ff 48 6f d1 vmovdqu16 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7e 48 6f d1 vmovdqu32 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 fe 48 6f d1 vmovdqu64 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7d 29 6f d1 vmovdqa32 %ymm1,%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 6f d1 vmovdqa64 %ymm1,%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 6f d1 vmovdqu8 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 6f d1 vmovdqu16 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 6f d1 vmovdqu32 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 6f d1 vmovdqu64 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7d 29 6f 10 vmovdqa32 \(%eax\),%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 6f 10 vmovdqa64 \(%eax\),%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 6f 10 vmovdqu8 \(%eax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 6f 10 vmovdqu16 \(%eax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 6f 10 vmovdqu32 \(%eax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 6f 10 vmovdqu64 \(%eax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7d 29 7f 08 vmovdqa32 %ymm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 7f 08 vmovdqa64 %ymm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 7f 08 vmovdqu8 %xmm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 7f 08 vmovdqu16 %xmm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 7f 08 vmovdqu32 %xmm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 7f 08 vmovdqu64 %xmm1,\(%eax\)\{%k1\} + +[a-f0-9]+: 62 f1 7d 89 6f d1 vmovdqa32 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 fd 89 6f d1 vmovdqa64 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 7f 89 6f d1 vmovdqu8 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 ff 89 6f d1 vmovdqu16 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 7e 89 6f d1 vmovdqu32 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 fe 89 6f d1 vmovdqu64 %xmm1,%xmm2\{%k1\}\{z\} #pass diff --git a/gas/testsuite/gas/i386/optimize-2.s b/gas/testsuite/gas/i386/optimize-2.s index c9b57a8dd1..0a4fb23167 100644 --- a/gas/testsuite/gas/i386/optimize-2.s +++ b/gas/testsuite/gas/i386/optimize-2.s @@ -13,3 +13,87 @@ _start: test $0x7f, %di vandnpd %zmm1, %zmm1, %zmm5 + + vmovdqa32 %xmm1, %xmm2 + vmovdqa64 %xmm1, %xmm2 + vmovdqu8 %xmm1, %xmm2 + vmovdqu16 %xmm1, %xmm2 + vmovdqu32 %xmm1, %xmm2 + vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 127(%eax), %xmm2 + vmovdqa64 127(%eax), %xmm2 + vmovdqu8 127(%eax), %xmm2 + vmovdqu16 127(%eax), %xmm2 + vmovdqu32 127(%eax), %xmm2 + vmovdqu64 127(%eax), %xmm2 + + vmovdqa32 %xmm1, 128(%eax) + vmovdqa64 %xmm1, 128(%eax) + vmovdqu8 %xmm1, 128(%eax) + vmovdqu16 %xmm1, 128(%eax) + vmovdqu32 %xmm1, 128(%eax) + vmovdqu64 %xmm1, 128(%eax) + + vmovdqa32 %ymm1, %ymm2 + vmovdqa64 %ymm1, %ymm2 + vmovdqu8 %ymm1, %ymm2 + vmovdqu16 %ymm1, %ymm2 + vmovdqu32 %ymm1, %ymm2 + vmovdqu64 %ymm1, %ymm2 + + vmovdqa32 127(%eax), %ymm2 + vmovdqa64 127(%eax), %ymm2 + vmovdqu8 127(%eax), %ymm2 + vmovdqu16 127(%eax), %ymm2 + vmovdqu32 127(%eax), %ymm2 + vmovdqu64 127(%eax), %ymm2 + + vmovdqa32 %ymm1, 128(%eax) + vmovdqa64 %ymm1, 128(%eax) + vmovdqu8 %ymm1, 128(%eax) + vmovdqu16 %ymm1, 128(%eax) + vmovdqu32 %ymm1, 128(%eax) + vmovdqu64 %ymm1, 128(%eax) + + vmovdqa32 %zmm1, %zmm2 + vmovdqa64 %zmm1, %zmm2 + vmovdqu8 %zmm1, %zmm2 + vmovdqu16 %zmm1, %zmm2 + vmovdqu32 %zmm1, %zmm2 + vmovdqu64 %zmm1, %zmm2 + + {evex} vmovdqa32 %ymm1, %ymm2 + {evex} vmovdqa64 %ymm1, %ymm2 + {evex} vmovdqu8 %xmm1, %xmm2 + {evex} vmovdqu16 %xmm1, %xmm2 + {evex} vmovdqu32 %xmm1, %xmm2 + {evex} vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 %ymm1, %ymm2{%k1} + vmovdqa64 %ymm1, %ymm2{%k1} + vmovdqu8 %xmm1, %xmm2{%k1} + vmovdqu16 %xmm1, %xmm2{%k1} + vmovdqu32 %xmm1, %xmm2{%k1} + vmovdqu64 %xmm1, %xmm2{%k1} + + vmovdqa32 (%eax), %ymm2{%k1} + vmovdqa64 (%eax), %ymm2{%k1} + vmovdqu8 (%eax), %xmm2{%k1} + vmovdqu16 (%eax), %xmm2{%k1} + vmovdqu32 (%eax), %xmm2{%k1} + vmovdqu64 (%eax), %xmm2{%k1} + + vmovdqa32 %ymm1, (%eax){%k1} + vmovdqa64 %ymm1, (%eax){%k1} + vmovdqu8 %xmm1, (%eax){%k1} + vmovdqu16 %xmm1, (%eax){%k1} + vmovdqu32 %xmm1, (%eax){%k1} + vmovdqu64 %xmm1, (%eax){%k1} + + vmovdqa32 %xmm1, %xmm2{%k1}{z} + vmovdqa64 %xmm1, %xmm2{%k1}{z} + vmovdqu8 %xmm1, %xmm2{%k1}{z} + vmovdqu16 %xmm1, %xmm2{%k1}{z} + vmovdqu32 %xmm1, %xmm2{%k1}{z} + vmovdqu64 %xmm1, %xmm2{%k1}{z} diff --git a/gas/testsuite/gas/i386/optimize-3.d b/gas/testsuite/gas/i386/optimize-3.d index f251a3626d..cd43243b49 100644 --- a/gas/testsuite/gas/i386/optimize-3.d +++ b/gas/testsuite/gas/i386/optimize-3.d @@ -9,4 +9,10 @@ Disassembly of section .text: 0+ <_start>: +[a-f0-9]+: a9 7f 00 00 00 test \$0x7f,%eax + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 #pass diff --git a/gas/testsuite/gas/i386/optimize-3.s b/gas/testsuite/gas/i386/optimize-3.s index 536bf0cfb2..a70893c15d 100644 --- a/gas/testsuite/gas/i386/optimize-3.s +++ b/gas/testsuite/gas/i386/optimize-3.s @@ -4,3 +4,10 @@ .text _start: {nooptimize} testl $0x7f, %eax + + {nooptimize} vmovdqa32 %ymm1, %ymm2 + {nooptimize} vmovdqa64 %ymm1, %ymm2 + {nooptimize} vmovdqu8 %xmm1, %xmm2 + {nooptimize} vmovdqu16 %xmm1, %xmm2 + {nooptimize} vmovdqu32 %xmm1, %xmm2 + {nooptimize} vmovdqu64 %xmm1, %xmm2 diff --git a/gas/testsuite/gas/i386/optimize-4.d b/gas/testsuite/gas/i386/optimize-4.d index 9f99dadf34..2df84654d6 100644 --- a/gas/testsuite/gas/i386/optimize-4.d +++ b/gas/testsuite/gas/i386/optimize-4.d @@ -62,6 +62,42 @@ Disassembly of section .text: +[a-f0-9]+: c5 f4 47 e9 kxorw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 #pass diff --git a/gas/testsuite/gas/i386/optimize-5.d b/gas/testsuite/gas/i386/optimize-5.d index cfd0df04a4..ecc1ab139a 100644 --- a/gas/testsuite/gas/i386/optimize-5.d +++ b/gas/testsuite/gas/i386/optimize-5.d @@ -62,6 +62,48 @@ Disassembly of section .text: +[a-f0-9]+: c5 f4 47 e9 kxorw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 +[a-f0-9]+: c5 f4 42 e9 kandnw %k1,%k1,%k5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%eax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%eax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%eax\) +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 #pass diff --git a/gas/testsuite/gas/i386/optimize-5.s b/gas/testsuite/gas/i386/optimize-5.s index 66c762bd3b..77d60edb69 100644 --- a/gas/testsuite/gas/i386/optimize-5.s +++ b/gas/testsuite/gas/i386/optimize-5.s @@ -6,3 +6,10 @@ {evex} vandnpd %zmm1, %zmm1, %zmm5 {evex} vandnpd %ymm1, %ymm1, %ymm5 + + {evex} vmovdqa32 %ymm1, %ymm2 + {evex} vmovdqa64 %ymm1, %ymm2 + {evex} vmovdqu8 %xmm1, %xmm2 + {evex} vmovdqu16 %xmm1, %xmm2 + {evex} vmovdqu32 %xmm1, %xmm2 + {evex} vmovdqu64 %xmm1, %xmm2 diff --git a/gas/testsuite/gas/i386/optimize-7.d b/gas/testsuite/gas/i386/optimize-7.d new file mode 100644 index 0000000000..92ca7a6c75 --- /dev/null +++ b/gas/testsuite/gas/i386/optimize-7.d @@ -0,0 +1,12 @@ +#as: -O2 -march=+noavx +#objdump: -drw +#name: optimized encoding 7 with -O2 + +.*: +file format .* + + +Disassembly of section .text: + +0+ <_start>: + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 +#pass diff --git a/gas/testsuite/gas/i386/optimize-7.s b/gas/testsuite/gas/i386/optimize-7.s new file mode 100644 index 0000000000..261b4afa27 --- /dev/null +++ b/gas/testsuite/gas/i386/optimize-7.s @@ -0,0 +1,6 @@ +# Check instructions with optimized encoding + + .allow_index_reg + .text +_start: + vmovdqa32 %ymm1, %ymm2 diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2.d b/gas/testsuite/gas/i386/x86-64-optimize-2.d index fa031e8893..7d7340fae0 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-2.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-2.d @@ -106,4 +106,52 @@ Disassembly of section .text: +[a-f0-9]+: 62 e1 f5 08 fb c1 vpsubq %xmm1,%xmm1,%xmm16 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2.s b/gas/testsuite/gas/i386/x86-64-optimize-2.s index 10ce788ffb..1275610e55 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-2.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-2.s @@ -114,3 +114,59 @@ _start: vpsubq %ymm1, %ymm1, %ymm16 vpsubq %zmm17, %zmm17, %zmm1 vpsubq %ymm17, %ymm17, %ymm1 + + vmovdqa32 %xmm1, %xmm2 + vmovdqa64 %xmm1, %xmm2 + vmovdqu8 %xmm1, %xmm2 + vmovdqu16 %xmm1, %xmm2 + vmovdqu32 %xmm1, %xmm2 + vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 %xmm11, %xmm12 + vmovdqa64 %xmm11, %xmm12 + vmovdqu8 %xmm11, %xmm12 + vmovdqu16 %xmm11, %xmm12 + vmovdqu32 %xmm11, %xmm12 + vmovdqu64 %xmm11, %xmm12 + + vmovdqa32 127(%rax), %xmm2 + vmovdqa64 127(%rax), %xmm2 + vmovdqu8 127(%rax), %xmm2 + vmovdqu16 127(%rax), %xmm2 + vmovdqu32 127(%rax), %xmm2 + vmovdqu64 127(%rax), %xmm2 + + vmovdqa32 %xmm1, 128(%rax) + vmovdqa64 %xmm1, 128(%rax) + vmovdqu8 %xmm1, 128(%rax) + vmovdqu16 %xmm1, 128(%rax) + vmovdqu32 %xmm1, 128(%rax) + vmovdqu64 %xmm1, 128(%rax) + + vmovdqa32 %ymm1, %ymm2 + vmovdqa64 %ymm1, %ymm2 + vmovdqu8 %ymm1, %ymm2 + vmovdqu16 %ymm1, %ymm2 + vmovdqu32 %ymm1, %ymm2 + vmovdqu64 %ymm1, %ymm2 + + vmovdqa32 %ymm11, %ymm12 + vmovdqa64 %ymm11, %ymm12 + vmovdqu8 %ymm11, %ymm12 + vmovdqu16 %ymm11, %ymm12 + vmovdqu32 %ymm11, %ymm12 + vmovdqu64 %ymm11, %ymm12 + + vmovdqa32 127(%rax), %ymm2 + vmovdqa64 127(%rax), %ymm2 + vmovdqu8 127(%rax), %ymm2 + vmovdqu16 127(%rax), %ymm2 + vmovdqu32 127(%rax), %ymm2 + vmovdqu64 127(%rax), %ymm2 + + vmovdqa32 %ymm1, 128(%rax) + vmovdqa64 %ymm1, 128(%rax) + vmovdqu8 %ymm1, 128(%rax) + vmovdqu16 %ymm1, 128(%rax) + vmovdqu32 %ymm1, 128(%rax) + vmovdqu64 %ymm1, 128(%rax) diff --git a/gas/testsuite/gas/i386/x86-64-optimize-2a.d b/gas/testsuite/gas/i386/x86-64-optimize-2a.d index 9c6466d4ae..532a1458bc 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-2a.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-2a.d @@ -107,4 +107,52 @@ Disassembly of section .text: +[a-f0-9]+: 62 e1 f5 28 fb c1 vpsubq %ymm1,%ymm1,%ymm16 +[a-f0-9]+: 62 b1 f5 40 fb c9 vpsubq %zmm17,%zmm17,%zmm1 +[a-f0-9]+: 62 b1 f5 20 fb c9 vpsubq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-3.d b/gas/testsuite/gas/i386/x86-64-optimize-3.d index f85c0af05e..74336a4fe2 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-3.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-3.d @@ -25,4 +25,94 @@ Disassembly of section .text: +[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b +[a-f0-9]+: 41 f6 c1 7f test \$0x7f,%r9b +[a-f0-9]+: c5 f1 55 e9 vandnpd %xmm1,%xmm1,%xmm5 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 b1 7d 08 6f d5 vmovdqa32 %xmm21,%xmm2 + +[a-f0-9]+: 62 b1 fd 08 6f d5 vmovdqa64 %xmm21,%xmm2 + +[a-f0-9]+: 62 b1 7f 08 6f d5 vmovdqu8 %xmm21,%xmm2 + +[a-f0-9]+: 62 b1 ff 08 6f d5 vmovdqu16 %xmm21,%xmm2 + +[a-f0-9]+: 62 b1 7e 08 6f d5 vmovdqu32 %xmm21,%xmm2 + +[a-f0-9]+: 62 b1 fe 08 6f d5 vmovdqu64 %xmm21,%xmm2 + +[a-f0-9]+: 62 f1 7d 48 6f d1 vmovdqa32 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 fd 48 6f d1 vmovdqa64 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7f 48 6f d1 vmovdqu8 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 ff 48 6f d1 vmovdqu16 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7e 48 6f d1 vmovdqu32 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 fe 48 6f d1 vmovdqu64 %zmm1,%zmm2 + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7d 29 6f d1 vmovdqa32 %ymm1,%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 6f d1 vmovdqa64 %ymm1,%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 6f d1 vmovdqu8 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 6f d1 vmovdqu16 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 6f d1 vmovdqu32 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 6f d1 vmovdqu64 %xmm1,%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7d 29 6f 10 vmovdqa32 \(%rax\),%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 6f 10 vmovdqa64 \(%rax\),%ymm2\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 6f 10 vmovdqu8 \(%rax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 6f 10 vmovdqu16 \(%rax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 6f 10 vmovdqu32 \(%rax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 6f 10 vmovdqu64 \(%rax\),%xmm2\{%k1\} + +[a-f0-9]+: 62 f1 7d 29 7f 08 vmovdqa32 %ymm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 fd 29 7f 08 vmovdqa64 %ymm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 7f 09 7f 08 vmovdqu8 %xmm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 ff 09 7f 08 vmovdqu16 %xmm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 7e 09 7f 08 vmovdqu32 %xmm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 fe 09 7f 08 vmovdqu64 %xmm1,\(%rax\)\{%k1\} + +[a-f0-9]+: 62 f1 7d 89 6f d1 vmovdqa32 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 fd 89 6f d1 vmovdqa64 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 7f 89 6f d1 vmovdqu8 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 ff 89 6f d1 vmovdqu16 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 7e 89 6f d1 vmovdqu32 %xmm1,%xmm2\{%k1\}\{z\} + +[a-f0-9]+: 62 f1 fe 89 6f d1 vmovdqu64 %xmm1,%xmm2\{%k1\}\{z\} #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-3.s b/gas/testsuite/gas/i386/x86-64-optimize-3.s index 4a52a25ddd..d9c2eb86cb 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-3.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-3.s @@ -21,3 +21,108 @@ _start: test $0x7f, %r9b vandnpd %zmm1, %zmm1, %zmm5 + + vmovdqa32 %xmm1, %xmm2 + vmovdqa64 %xmm1, %xmm2 + vmovdqu8 %xmm1, %xmm2 + vmovdqu16 %xmm1, %xmm2 + vmovdqu32 %xmm1, %xmm2 + vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 %xmm11, %xmm12 + vmovdqa64 %xmm11, %xmm12 + vmovdqu8 %xmm11, %xmm12 + vmovdqu16 %xmm11, %xmm12 + vmovdqu32 %xmm11, %xmm12 + vmovdqu64 %xmm11, %xmm12 + + vmovdqa32 127(%rax), %xmm2 + vmovdqa64 127(%rax), %xmm2 + vmovdqu8 127(%rax), %xmm2 + vmovdqu16 127(%rax), %xmm2 + vmovdqu32 127(%rax), %xmm2 + vmovdqu64 127(%rax), %xmm2 + + vmovdqa32 %xmm1, 128(%rax) + vmovdqa64 %xmm1, 128(%rax) + vmovdqu8 %xmm1, 128(%rax) + vmovdqu16 %xmm1, 128(%rax) + vmovdqu32 %xmm1, 128(%rax) + vmovdqu64 %xmm1, 128(%rax) + + vmovdqa32 %ymm1, %ymm2 + vmovdqa64 %ymm1, %ymm2 + vmovdqu8 %ymm1, %ymm2 + vmovdqu16 %ymm1, %ymm2 + vmovdqu32 %ymm1, %ymm2 + vmovdqu64 %ymm1, %ymm2 + + vmovdqa32 %ymm11, %ymm12 + vmovdqa64 %ymm11, %ymm12 + vmovdqu8 %ymm11, %ymm12 + vmovdqu16 %ymm11, %ymm12 + vmovdqu32 %ymm11, %ymm12 + vmovdqu64 %ymm11, %ymm12 + + vmovdqa32 127(%rax), %ymm2 + vmovdqa64 127(%rax), %ymm2 + vmovdqu8 127(%rax), %ymm2 + vmovdqu16 127(%rax), %ymm2 + vmovdqu32 127(%rax), %ymm2 + vmovdqu64 127(%rax), %ymm2 + + vmovdqa32 %ymm1, 128(%rax) + vmovdqa64 %ymm1, 128(%rax) + vmovdqu8 %ymm1, 128(%rax) + vmovdqu16 %ymm1, 128(%rax) + vmovdqu32 %ymm1, 128(%rax) + vmovdqu64 %ymm1, 128(%rax) + + vmovdqa32 %xmm21, %xmm2 + vmovdqa64 %xmm21, %xmm2 + vmovdqu8 %xmm21, %xmm2 + vmovdqu16 %xmm21, %xmm2 + vmovdqu32 %xmm21, %xmm2 + vmovdqu64 %xmm21, %xmm2 + + vmovdqa32 %zmm1, %zmm2 + vmovdqa64 %zmm1, %zmm2 + vmovdqu8 %zmm1, %zmm2 + vmovdqu16 %zmm1, %zmm2 + vmovdqu32 %zmm1, %zmm2 + vmovdqu64 %zmm1, %zmm2 + + {evex} vmovdqa32 %ymm1, %ymm2 + {evex} vmovdqa64 %ymm1, %ymm2 + {evex} vmovdqu8 %xmm1, %xmm2 + {evex} vmovdqu16 %xmm1, %xmm2 + {evex} vmovdqu32 %xmm1, %xmm2 + {evex} vmovdqu64 %xmm1, %xmm2 + + vmovdqa32 %ymm1, %ymm2{%k1} + vmovdqa64 %ymm1, %ymm2{%k1} + vmovdqu8 %xmm1, %xmm2{%k1} + vmovdqu16 %xmm1, %xmm2{%k1} + vmovdqu32 %xmm1, %xmm2{%k1} + vmovdqu64 %xmm1, %xmm2{%k1} + + vmovdqa32 (%rax), %ymm2{%k1} + vmovdqa64 (%rax), %ymm2{%k1} + vmovdqu8 (%rax), %xmm2{%k1} + vmovdqu16 (%rax), %xmm2{%k1} + vmovdqu32 (%rax), %xmm2{%k1} + vmovdqu64 (%rax), %xmm2{%k1} + + vmovdqa32 %ymm1, (%rax){%k1} + vmovdqa64 %ymm1, (%rax){%k1} + vmovdqu8 %xmm1, (%rax){%k1} + vmovdqu16 %xmm1, (%rax){%k1} + vmovdqu32 %xmm1, (%rax){%k1} + vmovdqu64 %xmm1, (%rax){%k1} + + vmovdqa32 %xmm1, %xmm2{%k1}{z} + vmovdqa64 %xmm1, %xmm2{%k1}{z} + vmovdqu8 %xmm1, %xmm2{%k1}{z} + vmovdqu16 %xmm1, %xmm2{%k1}{z} + vmovdqu32 %xmm1, %xmm2{%k1}{z} + vmovdqu64 %xmm1, %xmm2{%k1}{z} diff --git a/gas/testsuite/gas/i386/x86-64-optimize-4.d b/gas/testsuite/gas/i386/x86-64-optimize-4.d index 10e7b02d3a..18fdeb1442 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-4.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-4.d @@ -9,4 +9,10 @@ Disassembly of section .text: 0+ <_start>: +[a-f0-9]+: a9 7f 00 00 00 test \$0x7f,%eax + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-4.s b/gas/testsuite/gas/i386/x86-64-optimize-4.s index 0c4fdcecc5..b6d872db2c 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-4.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-4.s @@ -4,3 +4,10 @@ .text _start: {nooptimize} testl $0x7f, %eax + + {nooptimize} vmovdqa32 %ymm1, %ymm2 + {nooptimize} vmovdqa64 %ymm1, %ymm2 + {nooptimize} vmovdqu8 %xmm1, %xmm2 + {nooptimize} vmovdqu16 %xmm1, %xmm2 + {nooptimize} vmovdqu32 %xmm1, %xmm2 + {nooptimize} vmovdqu64 %xmm1, %xmm2 diff --git a/gas/testsuite/gas/i386/x86-64-optimize-5.d b/gas/testsuite/gas/i386/x86-64-optimize-5.d index 085f7f29f2..012237df57 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-5.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-5.d @@ -106,6 +106,60 @@ Disassembly of section .text: +[a-f0-9]+: 62 e1 f5 08 fb c1 vpsubq %xmm1,%xmm1,%xmm16 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-5.s b/gas/testsuite/gas/i386/x86-64-optimize-5.s index 6b4ff103ab..9756ae815c 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-5.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-5.s @@ -4,3 +4,10 @@ {evex} vandnpd %zmm1, %zmm1, %zmm5 {evex} vandnpd %ymm1, %ymm1, %ymm5 + + {evex} vmovdqa32 %ymm1, %ymm2 + {evex} vmovdqa64 %ymm1, %ymm2 + {evex} vmovdqu8 %xmm1, %xmm2 + {evex} vmovdqu16 %xmm1, %xmm2 + {evex} vmovdqu32 %xmm1, %xmm2 + {evex} vmovdqu64 %xmm1, %xmm2 diff --git a/gas/testsuite/gas/i386/x86-64-optimize-6.d b/gas/testsuite/gas/i386/x86-64-optimize-6.d index 0d52c8fcbb..aca119e4f9 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-6.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-6.d @@ -106,6 +106,60 @@ Disassembly of section .text: +[a-f0-9]+: 62 e1 f5 08 fb c1 vpsubq %xmm1,%xmm1,%xmm16 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 +[a-f0-9]+: 62 b1 f5 00 fb c9 vpsubq %xmm17,%xmm17,%xmm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 f9 7f 88 80 00 00 00 vmovdqa %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fa 7f 88 80 00 00 00 vmovdqu %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 7f 88 80 00 00 00 vmovdqa %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) + +[a-f0-9]+: c5 fe 7f 88 80 00 00 00 vmovdqu %ymm1,0x80\(%rax\) +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 vandnpd %xmm1,%xmm1,%xmm5 + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 fd 28 6f d1 vmovdqa64 %ymm1,%ymm2 + +[a-f0-9]+: 62 f1 7f 08 6f d1 vmovdqu8 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 ff 08 6f d1 vmovdqu16 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 7e 08 6f d1 vmovdqu32 %xmm1,%xmm2 + +[a-f0-9]+: 62 f1 fe 08 6f d1 vmovdqu64 %xmm1,%xmm2 #pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-6.s b/gas/testsuite/gas/i386/x86-64-optimize-6.s index 70ccbc41be..7c403fcc86 100644 --- a/gas/testsuite/gas/i386/x86-64-optimize-6.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-6.s @@ -6,3 +6,10 @@ {evex} vandnpd %zmm1, %zmm1, %zmm5 {evex} vandnpd %ymm1, %ymm1, %ymm5 + + {evex} vmovdqa32 %ymm1, %ymm2 + {evex} vmovdqa64 %ymm1, %ymm2 + {evex} vmovdqu8 %xmm1, %xmm2 + {evex} vmovdqu16 %xmm1, %xmm2 + {evex} vmovdqu32 %xmm1, %xmm2 + {evex} vmovdqu64 %xmm1, %xmm2 diff --git a/gas/testsuite/gas/i386/x86-64-optimize-8.d b/gas/testsuite/gas/i386/x86-64-optimize-8.d new file mode 100644 index 0000000000..46efa5229d --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-optimize-8.d @@ -0,0 +1,12 @@ +#as: -O2 -march=+noavx +#objdump: -drw +#name: x86-64 optimized encoding 8 with -O2 + +.*: +file format .* + + +Disassembly of section .text: + +0+ <_start>: + +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-optimize-8.s b/gas/testsuite/gas/i386/x86-64-optimize-8.s new file mode 100644 index 0000000000..4b9865a91b --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-optimize-8.s @@ -0,0 +1,6 @@ +# Check 64bit instructions with optimized encoding + + .allow_index_reg + .text +_start: + vmovdqa32 %ymm1, %ymm2 diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index 1194dcd1c0..26a68d8cbe 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3709,11 +3709,11 @@ vmovd, 2, 0x666E, None, 1, CpuAVX512F, D|Modrm|EVex=2|VexOpcode=0|Disp8MemShift= vmovddup, 2, 0xF212, None, 1, CpuAVX512F, Modrm|Masking=3|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegYMM|RegZMM|Unspecified|BaseIndex, RegYMM|RegZMM } -vmovdqa64, 2, 0x666F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vmovdqa32, 2, 0x666F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqa64, 2, 0x666F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqa32, 2, 0x666F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vmovntdq, 2, 0x66E7, None, 1, CpuAVX512F, Modrm|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM, XMMword|YMMword|ZMMword|Unspecified|BaseIndex } -vmovdqu32, 2, 0xF36F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vmovdqu64, 2, 0xF36F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqu32, 2, 0xF36F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqu64, 2, 0xF36F, None, 1, CpuAVX512F, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vmovhlps, 3, 0x12, None, 1, CpuAVX512F, Modrm|EVex=4|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM } vmovlhps, 3, 0x16, None, 1, CpuAVX512F, Modrm|EVex=4|VexOpcode=0|VexVVVV=1|VexW=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM, RegXMM, RegXMM } @@ -4190,8 +4190,8 @@ kshiftrq, 3, 0x6631, None, 1, CpuAVX512BW, Modrm|Vex=1|VexOpcode=2|VexW=2|No_bSu vdbpsadbw, 4, 0x6642, None, 1, CpuAVX512BW, Modrm|Masking=3|VexOpcode=2|VexVVVV=1|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vmovdqu8, 2, 0xF26F, None, 1, CpuAVX512BW, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vmovdqu16, 2, 0xF26F, None, 1, CpuAVX512BW, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqu8, 2, 0xF26F, None, 1, CpuAVX512BW, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vmovdqu16, 2, 0xF26F, None, 1, CpuAVX512BW, D|Modrm|MaskingMorZ|VexOpcode=0|VexW=2|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vpabsb, 2, 0x661C, None, 1, CpuAVX512BW, Modrm|Masking=3|VexOpcode=1|VexWIG|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vpmaxsb, 3, 0x663C, None, 1, CpuAVX512BW, Modrm|Masking=3|VexOpcode=1|VexWIG|VexVVVV=1|Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } diff --git a/opcodes/i386-tbl.h b/opcodes/i386-tbl.h index 81575df3f2..bd33eb5ce5 100644 --- a/opcodes/i386-tbl.h +++ b/opcodes/i386-tbl.h @@ -60123,7 +60123,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, @@ -60139,7 +60139,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, @@ -60155,7 +60155,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, @@ -60171,7 +60171,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, @@ -63555,7 +63555,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, @@ -63571,7 +63571,7 @@ const insn_template i386_optab[] = 0, 0, 0, 0, 0, 0 } }, { 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, - 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0 }, + 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0, 0, 0 }, { { { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0 } }, -- 2.20.1