From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 22244 invoked by alias); 5 Apr 2007 11:29:55 -0000
Received: (qmail 21860 invoked by uid 48); 5 Apr 2007 11:29:40 -0000
Date: Thu, 05 Apr 2007 11:29:00 -0000
Subject: [Bug rtl-optimization/31485] New: C complex numbers, amd64 SSE, missed optimization opportunity
X-Bugzilla-Reason: CC
Message-ID:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "bisqwit at iki dot fi"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2007-04/txt/msg00365.txt.bz2

Considering that "complex" effectively turns any basic floating-point type
into a two-element vector type, complex number addition and subtraction could
use SSE instructions to operate on the real and imaginary parts
simultaneously. (This applies only to addition and subtraction.)

Code:

    #include <complex.h>

    typedef float complex ss1;
    typedef float ss2 __attribute__((vector_size(sizeof(ss1))));

    ss1 add1(ss1 a, ss1 b) { return a + b; }
    ss2 add2(ss2 a, ss2 b) { return a + b; }

Produces:

    add1:
            movq    %xmm0, -8(%rsp)
            movq    %xmm1, -16(%rsp)
            movss   -4(%rsp), %xmm0
            movss   -8(%rsp), %xmm1
            addss   -12(%rsp), %xmm0
            addss   -16(%rsp), %xmm1
            movss   %xmm0, -20(%rsp)
            movss   %xmm1, -24(%rsp)
            movq    -24(%rsp), %xmm0
            ret
    add2:
            movlps  %xmm0, -16(%rsp)
            movlps  %xmm1, -24(%rsp)
            movaps  -24(%rsp), %xmm0
            addps   -16(%rsp), %xmm0
            movaps  %xmm0, -56(%rsp)
            movlps  -56(%rsp), %xmm0
            ret

Command line:

    gcc -msse -O3 -S test2.c

(Results are the same with -ffast-math.)

Architecture:

    CPU=AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
    CPU features=fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy

GCC is:

    Target: x86_64-linux-gnu
    Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
    Thread model: posix
    gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

--
           Summary: C complex numbers, amd64 SSE, missed optimization opportunity
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485
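
The claim in the report, that adding two complex floats is the same operation as a two-lane vector add, can be checked with a small sketch. It is not part of the original report. The names `cf`, `v2f`, `add_complex`, `add_vector`, `to_vec`, `to_cplx`, and `sums_agree` are hypothetical, and the code assumes a compiler with the GCC `vector_size` extension (GCC or Clang). It relies on the C99 guarantee that a `float complex` has the same layout as `float[2]`, with the real part first.

```c
#include <complex.h>
#include <string.h>

/* Hypothetical names, used for illustration only. */
typedef float complex cf;
typedef float v2f __attribute__((vector_size(sizeof(cf))));

/* Scalar path: ordinary C99 complex addition. */
static cf add_complex(cf a, cf b) { return a + b; }

/* Vector path: element-wise addition via the GCC vector extension. */
static v2f add_vector(v2f a, v2f b) { return a + b; }

/* Reinterpret the bits with memcpy to avoid aliasing problems. */
static v2f to_vec(cf z)   { v2f v; memcpy(&v, &z, sizeof v); return v; }
static cf  to_cplx(v2f v) { cf z;  memcpy(&z, &v, sizeof z); return z; }

/* Returns 1 when both paths give the same sum, 0 otherwise. */
int sums_agree(cf a, cf b)
{
    cf scalar_sum = add_complex(a, b);
    cf vector_sum = to_cplx(add_vector(to_vec(a), to_vec(b)));
    return scalar_sum == vector_sum;
}
```

For example, `sums_agree(1.0f + 2.0f * I, 3.0f - 4.0f * I)` should return 1, since both paths add the real parts and the imaginary parts independently. The same reasoning holds for subtraction but not for multiplication or division, which mix the two components.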