From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 21534 invoked by alias); 26 May 2016 17:06:12 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 21523 invoked by uid 89); 26 May 2016 17:06:12 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-3.3 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=allowing, ia32, 0x, xx
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS;
 Thu, 26 May 2016 17:05:52 +0000
Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 98D3B8AE73;
 Thu, 26 May 2016 17:05:51 +0000 (UTC)
Received: from tucnak.zalov.cz (ovpn-116-88.ams2.redhat.com [10.36.116.88])
 by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u4QH5nx8019978
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Thu, 26 May 2016 13:05:51 -0400
Received: from tucnak.zalov.cz (localhost [127.0.0.1])
 by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id u4QH5mXW016819;
 Thu, 26 May 2016 19:05:48 +0200
Received: (from jakub@localhost)
 by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id u4QH5knb016818;
 Thu, 26 May 2016 19:05:46 +0200
Date: Thu, 26 May 2016 18:00:00 -0000
From: Jakub Jelinek
To: Uros Bizjak, Kirill Yukhin
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Improve *vec_concatv2si_sse4_1
Message-ID: <20160526170545.GZ28550@tucnak.redhat.com>
Reply-To: Jakub Jelinek
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.24 (2015-08-30)
X-IsSubscribed: yes
X-SW-Source: 2016-05/txt/msg02126.txt.bz2

Hi!

This patch adds an avx512dq alternative (EVEX vpinsrd requires that)
and enables EVEX vmovd and vpunpckldq.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-05-26  Jakub Jelinek

	* config/i386/sse.md (*vec_concatv2si_sse4_1): Add avx512dq v=Yv,rm
	alternative.  Change x=x,x alternative to v=Yv,Yv and x=rm,C
	alternative to v=rm,C.

	* gcc.target/i386/avx512dq-concatv2si-1.c: New test.
	* gcc.target/i386/avx512vl-concatv2si-1.c: New test.

--- gcc/config/i386/sse.md.jj	2016-05-26 10:44:25.000000000 +0200
+++ gcc/config/i386/sse.md	2016-05-26 14:22:26.819313220 +0200
@@ -13339,29 +13339,30 @@ (define_split
 
 (define_insn "*vec_concatv2si_sse4_1"
   [(set (match_operand:V2SI 0 "register_operand"
-          "=Yr,*x,x, Yr,*x,x, x, *y,*y")
+          "=Yr,*x, x, v,Yr,*x, v, v, *y,*y")
         (vec_concat:V2SI
           (match_operand:SI 1 "nonimmediate_operand"
-          " 0, 0,x, 0,0, x,rm, 0,rm")
+          " 0, 0, x,Yv, 0, 0,Yv,rm, 0,rm")
           (match_operand:SI 2 "vector_move_operand"
-          " rm,rm,rm,Yr,*x,x, C,*ym, C")))]
+          " rm,rm,rm,rm,Yr,*x,Yv, C,*ym, C")))]
   "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "@
    pinsrd\t{$1, %2, %0|%0, %2, 1}
    pinsrd\t{$1, %2, %0|%0, %2, 1}
    vpinsrd\t{$1, %2, %1, %0|%0, %1, %2, 1}
+   vpinsrd\t{$1, %2, %1, %0|%0, %1, %2, 1}
    punpckldq\t{%2, %0|%0, %2}
    punpckldq\t{%2, %0|%0, %2}
    vpunpckldq\t{%2, %1, %0|%0, %1, %2}
    %vmovd\t{%1, %0|%0, %1}
    punpckldq\t{%2, %0|%0, %2}
    movd\t{%1, %0|%0, %1}"
-  [(set_attr "isa" "noavx,noavx,avx,noavx,noavx,avx,*,*,*")
-   (set_attr "type" "sselog,sselog,sselog,sselog,sselog,sselog,ssemov,mmxcvt,mmxmov")
-   (set_attr "prefix_extra" "1,1,1,*,*,*,*,*,*")
-   (set_attr "length_immediate" "1,1,1,*,*,*,*,*,*")
-   (set_attr "prefix" "orig,orig,vex,orig,orig,vex,maybe_vex,orig,orig")
-   (set_attr "mode" "TI,TI,TI,TI,TI,TI,TI,DI,DI")])
+  [(set_attr "isa" "noavx,noavx,avx,avx512dq,noavx,noavx,avx,*,*,*")
+   (set_attr "type" "sselog,sselog,sselog,sselog,sselog,sselog,sselog,ssemov,mmxcvt,mmxmov")
+   (set_attr "prefix_extra" "1,1,1,1,*,*,*,*,*,*")
+   (set_attr "length_immediate" "1,1,1,1,*,*,*,*,*,*")
+   (set_attr "prefix" "orig,orig,vex,evex,orig,orig,maybe_evex,maybe_vex,orig,orig")
+   (set_attr "mode" "TI,TI,TI,TI,TI,TI,TI,TI,DI,DI")])
 
 ;; ??? In theory we can match memory for the MMX alternative, but allowing
 ;; nonimmediate_operand for operand 2 and *not* allowing memory for the SSE
--- gcc/testsuite/gcc.target/i386/avx512dq-concatv2si-1.c.jj	2016-05-26 15:14:55.853786550 +0200
+++ gcc/testsuite/gcc.target/i386/avx512dq-concatv2si-1.c	2016-05-26 15:13:57.000000000 +0200
@@ -0,0 +1,43 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mavx512dq -masm=att" } */
+
+typedef int V __attribute__((vector_size (8)));
+
+void
+f1 (int x, int y)
+{
+  register int a __asm ("xmm16");
+  register int b __asm ("xmm17");
+  register V c __asm ("xmm3");
+  a = x;
+  b = y;
+  asm volatile ("" : "+v" (a), "+v" (b));
+  c = (V) { a, b };
+  asm volatile ("" : "+v" (c));
+}
+
+/* { dg-final { scan-assembler "vpunpckldq\[^\n\r]*%xmm17\[^\n\r]*%xmm16\[^\n\r]*%xmm3" } } */
+
+void
+f2 (int x, int y)
+{
+  register int a __asm ("xmm16");
+  register V c __asm ("xmm3");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  c = (V) { a, y };
+  asm volatile ("" : "+v" (c));
+}
+
+void
+f3 (int x, int *y)
+{
+  register int a __asm ("xmm16");
+  register V c __asm ("xmm3");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  c = (V) { a, *y };
+  asm volatile ("" : "+v" (c));
+}
+
+/* { dg-final { scan-assembler-times "vpinsrd\[^\n\r]*\\\$1\[^\n\r]*%xmm16\[^\n\r]*%xmm3" 2 } } */
--- gcc/testsuite/gcc.target/i386/avx512vl-concatv2si-1.c.jj	2016-05-26 15:15:11.921574803 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-concatv2si-1.c	2016-05-26 15:16:24.936612585 +0200
@@ -0,0 +1,43 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512vl -mno-avx512dq -masm=att" } */
+
+typedef int V __attribute__((vector_size (8)));
+
+void
+f1 (int x, int y)
+{
+  register int a __asm ("xmm16");
+  register int b __asm ("xmm17");
+  register V c __asm ("xmm3");
+  a = x;
+  b = y;
+  asm volatile ("" : "+v" (a), "+v" (b));
+  c = (V) { a, b };
+  asm volatile ("" : "+v" (c));
+}
+
+/* { dg-final { scan-assembler "vpunpckldq\[^\n\r]*%xmm17\[^\n\r]*%xmm16\[^\n\r]*%xmm3" } } */
+
+void
+f2 (int x, int y)
+{
+  register int a __asm ("xmm16");
+  register V c __asm ("xmm3");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  c = (V) { a, y };
+  asm volatile ("" : "+v" (c));
+}
+
+void
+f3 (int x, int *y)
+{
+  register int a __asm ("xmm16");
+  register V c __asm ("xmm3");
+  a = x;
+  asm volatile ("" : "+v" (a));
+  c = (V) { a, *y };
+  asm volatile ("" : "+v" (c));
+}
+
+/* { dg-final { scan-assembler-not "vpinsrd\[^\n\r]*\\\$1\[^\n\r]*%xmm16\[^\n\r]*%xmm3" } } */

	Jakub