From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31654 invoked by alias); 9 Jul 2013 13:49:19 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 31605 invoked by uid 48); 9 Jul 2013 13:49:13 -0000
From: "vincenzo.innocente at cern dot ch"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/57858] AVX2: ymm used for div, not for sqrt
Date: Tue, 09 Jul 2013 13:49:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 4.9.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: vincenzo.innocente at cern dot ch
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-07/txt/msg00528.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858

--- Comment #2 from vincenzo Innocente ---
Actually, the code generated for div and sqr already differs with standard SSE:

c++ -std=c++11 -Ofast -S avx2sqrt.cc -ftree-vectorizer-verbose=1 -Wall ; cat avx2sqrt.s

.L2:
        movdqa  %xmm0, %xmm1
        addl    $1, %eax
        movdqa  %xmm0, %xmm4
        cmpl    $256, %eax
        paddd   %xmm5, %xmm1
        pshufd  $238, %xmm1, %xmm0
        cvtdq2pd        %xmm1, %xmm1
        movapd  %xmm3, %xmm7
        paddd   %xmm6, %xmm4
        cvtdq2pd        %xmm0, %xmm0
        divpd   %xmm0, %xmm7
        movapd  %xmm7, %xmm0
        movapd  %xmm3, %xmm7
        divpd   %xmm1, %xmm7
        addpd   %xmm7, %xmm0
        addpd   %xmm0, %xmm2
        jne     .L3
        movapd  %xmm2, -24(%rsp)
        movsd   -16(%rsp), %xmm0
        addsd   %xmm2, %xmm0
        ret
        .cfi_endproc
.LFE3:
        .size   _Z3divv, .-_Z3divv
        .p2align 4,,15
        .globl  _Z3sqrv
        .type   _Z3sqrv, @function
_Z3sqrv:
.LFB4:
        .cfi_startproc
        movl    $1, %eax
        movsd   .LC4(%rip), %xmm1
        xorpd   %xmm0, %xmm0
        jmp     .L6
        .p2align 4,,10
        .p2align 3
.L7:
        cvtsi2sd        %eax, %xmm1
        sqrtsd  %xmm1, %xmm1
.L6:
        addl    $1, %eax
        addsd   %xmm1, %xmm0
        cmpl    $1025, %eax
        jne     .L7
        rep; ret
        .cfi_endproc
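
For readers without the attachment: avx2sqrt.cc itself is not quoted in this comment, so the following is only a guessed reconstruction of the kind of source that could produce the listing above. The function names come from the mangled symbols `_Z3divv` and `_Z3sqrv`; the loop range 1..1024 is inferred from `cmpl $1025, %eax` in the scalar loop and from 256 iterations of 4 elements in the packed loop; the numerator 1.0 is an assumption, since the constant held in %xmm3 is not visible.

```cpp
#include <cmath>

// Hypothetical reconstruction of avx2sqrt.cc; the real file is not quoted
// in this comment. Assumed: both loops run i = 1..1024 and the numerator
// in div() is 1.0.

// Sum of reciprocals. In the listing this loop is vectorized:
// packed int->double conversion (cvtdq2pd) followed by divpd.
double div() {
  double s = 0.0;
  for (int i = 1; i != 1025; ++i)
    s += 1.0 / i;
  return s;
}

// Sum of square roots. In the listing this loop stays scalar:
// cvtsi2sd followed by sqrtsd, one element per iteration.
double sqr() {
  double s = 0.0;
  for (int i = 1; i != 1025; ++i)
    s += std::sqrt(static_cast<double>(i));
  return s;
}
```

Compiling a file like this with the command line quoted above should allow the two loops to be compared side by side; whether a given GCC release reproduces the same code is not guaranteed. Newer GCC releases report vectorizer decisions through -fopt-info-vec.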