From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 5604 invoked by alias); 13 Jan 2014 18:40:21 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 5534 invoked by uid 89); 13 Jan 2014 18:40:20 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-oa0-f43.google.com
Received: from mail-oa0-f43.google.com (HELO mail-oa0-f43.google.com) (209.85.219.43)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS;
 Mon, 13 Jan 2014 18:40:18 +0000
Received: by mail-oa0-f43.google.com with SMTP id m1so8445349oag.16
 for ; Mon, 13 Jan 2014 10:40:16 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.60.61.14 with SMTP id l14mr21814149oer.18.1389638416392;
 Mon, 13 Jan 2014 10:40:16 -0800 (PST)
Received: by 10.182.137.136 with HTTP; Mon, 13 Jan 2014 10:40:16 -0800 (PST)
In-Reply-To: <20140113182613.GD24431@msticlxl57.ims.intel.com>
References: <20140113080711.GS892@tucnak.redhat.com>
 <20140113083501.GT892@tucnak.redhat.com>
 <20140113182613.GD24431@msticlxl57.ims.intel.com>
Date: Mon, 13 Jan 2014 18:40:00 -0000
Message-ID:
Subject: Re: Patch ping
From: Uros Bizjak
To: Kirill Yukhin
Cc: Jakub Jelinek, Richard Biener, "gcc-patches@gcc.gnu.org"
Content-Type: text/plain; charset=ISO-8859-1
X-SW-Source: 2014-01/txt/msg00744.txt.bz2

On Mon, Jan 13, 2014 at 7:26 PM, Kirill Yukhin wrote:

>> > Kirill, is it possible for you to test the patch in the simulator? Do
>> > we have a testcase in gcc's testsuite that can be used to check this
>> > patch?
>>
>> E.g. gcc.target/i386/avx2-gather* and avx512f-gather*.
> This tests are for built-in generation. The issue is connected to
> auto code gen.
>
> It seems to be working, we have for hss2a.fppized.f:
> .L402:
>         vmovdqu64       (%rdi,%rax), %zmm1
>         kmovw   %k1, %k3
>         kmovw   %k1, %k2
>         kmovw   %k1, %k4
>         kmovw   %k1, %k5
>         addl    $1, %esi
>         vpgatherdd      npwrx.4971-4(,%zmm1,4), %zmm0{%k3}
>         vpgatherdd      (%r10,%zmm1,4), %zmm2{%k2}
>         vpmulld %zmm3, %zmm0, %zmm0
>         vpaddd  %zmm7, %zmm0, %zmm0
>         vmovdqu32       %zmm0, (%r11,%rax)
>         vpgatherdd      npwry.4973-4(,%zmm1,4), %zmm0{%k4}
>         vpmulld %zmm3, %zmm0, %zmm0
>         vpaddd  %zmm6, %zmm0, %zmm0
>         vmovdqu32       %zmm0, (%r9,%rax)
>         vpgatherdd      npwrz.4975-4(,%zmm1,4), %zmm0{%k5}
>         vpmulld %zmm3, %zmm0, %zmm0
>         vpaddd  %zmm5, %zmm0, %zmm0
>         vmovdqu32       %zmm0, (%r14,%rax)
>         vpaddd  %zmm2, %zmm4, %zmm0
>         vmovdqa64       %zmm0, (%r15,%rax)
>         addq    $64, %rax
>         cmpl    %esi, %edx
>         ja      .L402

An unrelated observation: gcc should figure out that %k1 mask register
can be used in all gather insns and avoid unnecessary copies at the
beginning of the loop.

Uros.
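
For readers without the Fortran source at hand, the quoted loop has roughly the shape of the C sketch below: a vector of indices is loaded, several tables are read through those indices, and each result is scaled, offset and stored. This is only an illustration, not the code from hss2a.fppized.f. The function name and all parameter names are hypothetical, and the "- 1" assumes the "-4" byte displacement in the gather operands comes from Fortran's 1-based indexing. Whether gcc actually emits vpgatherdd for such a loop (for example with -O3 -mavx512f) depends on the compiler version and the vectorizer's cost model.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kind of loop discussed above.
   None of these names come from hss2a.fppized.f. */
void gather_scale(int32_t *restrict out_x, int32_t *restrict out_y,
                  const int32_t *restrict table_x,
                  const int32_t *restrict table_y,
                  const int32_t *restrict idx, size_t n,
                  int32_t scale, int32_t off_x, int32_t off_y)
{
    for (size_t i = 0; i < n; i++) {
        /* Assumption: indices are 1-based, as in Fortran, so subtract 1
           before indexing the 0-based C arrays. */
        int32_t k = idx[i] - 1;

        /* Indexed loads like these are candidates for gather instructions
           when the loop is auto-vectorized. */
        out_x[i] = table_x[k] * scale + off_x;
        out_y[i] = table_y[k] * scale + off_y;
    }
}
```

Both indexed loads use the same index vector, which matches the quoted listing, where every vpgatherdd takes its indices from %zmm1.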