From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 31290 invoked by alias); 10 Aug 2014 11:30:29 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 31256 invoked by uid 48); 10 Aug 2014 11:30:25 -0000
From: "glisse at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/62080] Suboptimal code generation with eigen library
Date: Sun, 10 Aug 2014 11:30:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 4.8.3
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: glisse at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-08/txt/msg00604.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62080

--- Comment #2 from Marc Glisse ---
(note that a minimal, self-contained testcase would be much better and
shouldn't be hard to produce)

We write to memory with:

(insn 10 8 11 2 (set (mem:V2DI (reg/v/f:DI 97 [ vec ]) [0 MEM[(__m128i * {ref-all})vec_4(D)]+0 S16 A128])
        (subreg:V2DI (reg:V4SI 98) 0)) /usr/lib/gcc-snapshot/lib/gcc/x86_64-linux-gnu/4.10.0/include/emmintrin.h:706 1147 {*movv2di_internal}
     (expr_list:REG_DEAD (reg:V4SI 98)
        (nil)))

and then read back with:

(insn 15 12 17 2 (set (reg:V2DF 100)
        (vec_concat:V2DF (mem:DF (reg/v/f:DI 97 [ vec ]) [5 MEM[(const double *)vec_4(D)]+0 S8 A64])
            (mem:DF (plus:DI (reg/v/f:DI 97 [ vec ])
                    (const_int 8 [0x8])) [0 S8 A8]))) /usr/lib/gcc-snapshot/lib/gcc/x86_64-linux-gnu/4.10.0/include/emmintrin.h:925 2016 {*vec_concatv2df}
     (nil))

The vec_concat of the 2 adjacent memory locations is not merged into a single
memory read, although from the previous insn it looks like it is suitably
aligned.
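
For illustration only, a reduced testcase along the lines requested above might look like the sketch below. It is not taken from the bug report: the function name store_then_reload is hypothetical, and the choice of _mm_store_si128, _mm_load_sd and _mm_loadh_pd is an assumption about which intrinsics could lead to an aligned 16-byte integer store followed by two 8-byte double loads from the same address. Whether it reproduces the exact insns quoted above has not been checked.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstdint>

// Hypothetical reduction (an assumption, not the reporter's code):
// write 16 bytes through an __m128i pointer, then read the same
// 16 bytes back as two adjacent doubles.
__m128d store_then_reload(double* vec, __m128i bits) {
    // The aligned store below requires a 16-byte aligned pointer.
    assert(reinterpret_cast<std::uintptr_t>(vec) % 16 == 0);

    // Aligned 16-byte integer store to vec.
    _mm_store_si128(reinterpret_cast<__m128i*>(vec), bits);

    // Load the low double from vec[0]; the high half is zeroed.
    __m128d lo = _mm_load_sd(vec);

    // Replace the high half with the double at vec[1] (vec + 8 bytes).
    return _mm_loadh_pd(lo, vec + 1);
}
```

Compiling this with g++ -O2 -S and looking at the loads after the store would show whether the two 8-byte reads are merged into one 16-byte read, or forwarded from the stored register.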