From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 11759 invoked by alias); 14 Oct 2010 08:54:18 -0000
Received: (qmail 11749 invoked by uid 22791); 14 Oct 2010 08:54:17 -0000
X-SWARE-Spam-Status: No, hits=-2.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,TW_DD,TW_DQ,TW_TD,TW_VC,TW_VD,TW_VT
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO gcc.gnu.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 14 Oct 2010 08:54:04 +0000
From: "hjl.tools at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/46012] New: 256bit vectorizer failed on int->double
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: hjl.tools at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Date: Thu, 14 Oct 2010 08:54:00 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2010-10/txt/msg01157.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46012

           Summary: 256bit vectorizer failed on int->double
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: hjl.tools@gmail.com
                CC: rguenth@gcc.gnu.org


For

---
double a[1024];
float b[1024];
int c[1024];

void dependence_distance_4_mixed_0 (void)
{
  int i;
  for (i = 0; i < 1020; ++i)
    a[i + 4] = a[i] + a[i + 4] + c[i];
}
---

with -O3 -ffast-math -mavx, vect256 branch generates:

.L2:
    vmovapd a(%rax,%rax), %ymm0
    vcvtdq2pd c(%rax), %ymm1
    vaddpd a+32(%rax,%rax), %ymm0, %ymm0
    vaddpd %ymm1, %ymm0, %ymm0
    vmovapd %ymm0, a+32(%rax,%rax)
    addq $16, %rax
    cmpq $4080, %rax
    jne .L2

Trunk at revision 165455 generates

.L2:
    vmovapd 16(%rax), %xmm2
    vaddpd -16(%rax), %xmm2, %xmm2
    vmovdqa (%rdx), %xmm0
    addq $16, %rdx
    vpshufd $238, %xmm0, %xmm1
    vcvtdq2pd %xmm0, %xmm0
    vcvtdq2pd %xmm1, %xmm1
    vaddpd %xmm1, %xmm2, %xmm1
    vmovapd (%rax), %xmm2
    vaddpd -32(%rax), %xmm2, %xmm2
    vmovapd %xmm1, 16(%rax)
    vaddpd %xmm0, %xmm2, %xmm0
    vmovapd %xmm0, (%rax)
    addq $32, %rax
    cmpq %rax, %rcx
    jne .L2
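
As a side note on why four doubles per iteration is legal for this loop: the store to a[i + 4] is only read again four iterations later, so a group of four iterations can do all of its loads before any of its stores. The following is a minimal sketch of that argument in plain C, not compiler output. The names scalar_loop, blocked_loop, count_mismatches and the test data are hypothetical and are not part of the report.

```c
#define N 1024

/* Hypothetical test arrays, same sizes as a[] and c[] in the report. */
double a_ref[N], a_blk[N];
int c_in[N];

/* Scalar reference: the loop from the report, one element at a time. */
void scalar_loop(double *a, const int *c)
{
    for (int i = 0; i < 1020; ++i)
        a[i + 4] = a[i] + a[i + 4] + c[i];
}

/* Blocked form: four iterations at a time, all loads before any store.
   This mimics what one 4 x double (256-bit) vector iteration does.
   It relies on 1020 being a multiple of 4 and on the dependence
   distance being exactly 4. */
void blocked_loop(double *a, const int *c)
{
    for (int i = 0; i < 1020; i += 4) {
        double t[4];
        for (int k = 0; k < 4; ++k)      /* load and compute phase */
            t[k] = a[i + k] + a[i + 4 + k] + c[i + k];
        for (int k = 0; k < 4; ++k)      /* store phase */
            a[i + 4 + k] = t[k];
    }
}

/* Fill both copies with the same small integers, run both loops,
   and count elements where the results differ. */
int count_mismatches(void)
{
    for (int i = 0; i < N; ++i) {
        a_ref[i] = a_blk[i] = (double)(i % 7);
        c_in[i] = i % 5;
    }
    scalar_loop(a_ref, c_in);
    blocked_loop(a_blk, c_in);

    int bad = 0;
    for (int i = 0; i < N; ++i)
        if (a_ref[i] != a_blk[i])
            ++bad;
    return bad;
}
```

Calling count_mismatches() from a small main should report no differences, since within each group of four the elements read (a[i..i+3] and a[i+4..i+7]) are never written by another iteration of the same group.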