public inbox for gcc-bugs@sourceware.org
* [Bug target/60408] New: ARM: inefficient code for vget_lane_f32 intrinsic
@ 2014-03-04 10:57 mans at mansr dot com
  2014-03-04 11:37 ` [Bug target/60408] " ktkachov at gcc dot gnu.org
  2015-03-23 17:04 ` wilson at tuliptree dot org
From: mans at mansr dot com @ 2014-03-04 10:57 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60408

            Bug ID: 60408
           Summary: ARM: inefficient code for vget_lane_f32 intrinsic
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mans at mansr dot com

Consider this trivial function:

#include <arm_neon.h>
float foo(float32x2_t v)
{
    return vget_lane_f32(v, 0) + vget_lane_f32(v, 1);
}

Compiling with gcc 4.9 trunk from 2014-03-02 yields this (non-code output
removed):

$ gcc -O3 -march=armv7-a -mfpu=neon -S -o - test.c
foo:
        vmov.32 r3, d0[0]
        vmov.32 r2, d0[1]
        fmsr    s15, r3
        fmsr    s0, r2
        fadds   s0, s0, s15
        bx      lr

A single "fadds s0, s0, s1" is what one would expect from code like this,
since the argument arrives in d0, whose two halves are s0 and s1.
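
For readers without an ARM toolchain at hand, the computation that foo
performs can be sketched portably with the GCC/Clang generic vector
extension. This is only an illustration of the semantics (sum of lane 0 and
lane 1); the names f32x2 and foo_generic are hypothetical and not part of
arm_neon.h, and nothing here is a claim about the code any compiler emits
for it.

```c
/* Portable sketch of what foo() computes, using GCC/Clang generic vectors
   instead of arm_neon.h. f32x2 and foo_generic are illustrative names only. */
typedef float f32x2 __attribute__((vector_size(8)));  /* two 32-bit float lanes */

float foo_generic(f32x2 v)
{
    /* Lane subscripting is a GCC/Clang extension; v[0] and v[1] stand in
       for vget_lane_f32(v, 0) and vget_lane_f32(v, 1). */
    return v[0] + v[1];
}
```

Comparing the assembly for such a variant against the intrinsic version
(same flags, with -S) is one way to check whether the extra moves come from
the intrinsic expansion specifically.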

