From: "siavashserver at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/58095] New: SIMD code requiring auxiliary array for best optimization
Date: Tue, 06 Aug 2013 16:03:00 -0000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58095

            Bug ID: 58095
           Summary: SIMD code requiring auxiliary array for best optimization
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: siavashserver at gmail dot com

Created attachment 30621
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30621&action=edit
Source code and its generated asm code.

Hello.

I have noticed strange behavior while writing SIMD code with the SSE intrinsics. GCC does not seem able to generate the same optimized code for function (foo) that it generates for function (bar).
I was wondering how I can get the same generated code for function (foo) without the trouble of defining and using an auxiliary array, as function (bar) does. I tried adding the __restrict__ keyword to the input data (foo2), but GCC still generates the same code as for function (foo). ICC and Clang also generate the same code and fail to optimize it.

Something strange I noticed is that GCC 4.4.7 generates the desired code for function (foo), but fails to do so for functions (foo2) and (bar). Newer versions generate exactly the same code for functions (foo) and (foo2), and the desired code for function (bar).

The attached output was generated by GCC 4.8.1 at the -O2 optimization level. I used the online GCC compiler at: http://gcc.godbolt.org/
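
The attached source is not reproduced in this message. Purely as an illustration of the kind of contrast described above, here is a minimal sketch of three ways to write the same SSE loop: loading vectors directly where they are used, the same with __restrict__ on the pointers, and first copying the loads into a local auxiliary array. The function names (sum_blocks_direct, sum_blocks_restrict, sum_blocks_aux) and the computation itself are hypothetical and are not taken from the attachment; whether a given compiler version produces different code for them is not claimed here.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#include <cstddef>

// Hypothetical example, NOT the code from attachment 30621.
// For every block of 16 input floats, write 4 output floats holding the
// lane-wise sum of the block's four 4-float vectors.

// Variant 1: load each vector directly at the point of use.
void sum_blocks_direct(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i + 16 <= n; i += 16) {
        __m128 acc = _mm_loadu_ps(in + i);
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 4));
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 8));
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 12));
        _mm_storeu_ps(out + i / 4, acc);
    }
}

// Variant 2: same as variant 1, but the pointers are marked with the
// GCC/Clang __restrict__ extension to promise that they do not alias.
void sum_blocks_restrict(const float* __restrict__ in,
                         float* __restrict__ out, std::size_t n) {
    for (std::size_t i = 0; i + 16 <= n; i += 16) {
        __m128 acc = _mm_loadu_ps(in + i);
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 4));
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 8));
        acc = _mm_add_ps(acc, _mm_loadu_ps(in + i + 12));
        _mm_storeu_ps(out + i / 4, acc);
    }
}

// Variant 3: copy all loads into a local auxiliary array first, then
// compute from the array only (same addition order as the other variants).
void sum_blocks_aux(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i + 16 <= n; i += 16) {
        __m128 aux[4];
        for (int k = 0; k < 4; ++k)
            aux[k] = _mm_loadu_ps(in + i + 4 * k);
        __m128 acc = aux[0];
        acc = _mm_add_ps(acc, aux[1]);
        acc = _mm_add_ps(acc, aux[2]);
        acc = _mm_add_ps(acc, aux[3]);
        _mm_storeu_ps(out + i / 4, acc);
    }
}
```

All three variants compute the same values, so any difference would show up only in the generated assembly, for example when compiled with -O2 and compared side by side.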