From: "rearnsha at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/43725] Poor instructions selection, scheduling and registers allocation for ARM NEON intrinsics
Date: Wed, 29 Sep 2010 20:50:00 -0000
Message-ID: <20100929205000.zppwamQFPcCoKlluUTd-Txq_FosjEDdoKqNmfpsuARI@z>
In-Reply-To: <bug-43725-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43725

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2010-05-11 07:35:23         |2010-09-29 7:35:23
               date|                            |
                 CC|                            |rearnsha at gcc dot gnu.org

--- Comment #1 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2010-09-29 16:28:17 UTC ---
So the compiler is correct not to use vld1 for this code.  The memory
format of int32x4_t is defined to be the format of a NEON register that has
been filled from an array of int32 values and then stored to memory using VSTM
(or an equivalent sequence).  The implication is that int32x4_t does not
(necessarily) have the same memory layout as int32_t[4].


arm_neon.h provides intrinsics for filling NEON registers from arrays in
memory, and in this case I think you should be using these directly.  That is,
your macro should be modified to contain:

#define X(n) { int32x4_t v = vld1q_s32((const int32_t *)&p[n]); \
               v = vaddq_s32(v, a); v = vorrq_s32(v, b);        \
               vst1q_s32((int32_t *)&p[n], v); }
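For illustration, here is a minimal sketch of the same load/add/or/store
sequence written as a function over a plain int32_t array, which is the case
the vld1q/vst1q intrinsics are meant for.  The function name add_or_block, its
parameters, and the scalar fallback for non-NEON builds are my assumptions and
are not taken from the original testcase.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper (not from the bug report): for each group of four
 * int32_t elements in p, compute (p + a) | b lane by lane.
 * p holds nvec * 4 elements laid out as an ordinary int32_t array. */
void add_or_block(int32_t *p, const int32_t a[4], const int32_t b[4], int nvec)
{
#if HAVE_NEON
    /* Fill registers from arrays in memory with the explicit intrinsics. */
    int32x4_t va = vld1q_s32(a);
    int32x4_t vb = vld1q_s32(b);
    for (int n = 0; n < nvec; n++) {
        int32x4_t v = vld1q_s32(p + 4 * n);
        v = vaddq_s32(v, va);
        v = vorrq_s32(v, vb);
        vst1q_s32(p + 4 * n, v);
    }
#else
    /* Scalar fallback so the sketch also builds without NEON. */
    for (int n = 0; n < nvec; n++) {
        for (int lane = 0; lane < 4; lane++) {
            /* Unsigned arithmetic mirrors the wrap-around of vaddq_s32. */
            uint32_t sum = (uint32_t)p[4 * n + lane] + (uint32_t)a[lane];
            p[4 * n + lane] = (int32_t)(sum | (uint32_t)b[lane]);
        }
    }
#endif
}
```

Because the data is declared as int32_t rather than int32x4_t, no pointer
casts are needed here, and the in-memory layout is simply that of the array.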


There are still problems after doing this, however.  In particular, the
compiler is not correctly tracking alias information for the load/store
intrinsics, which means it is unable to move stores past loads to reduce
stalls in the pipeline.

The stack wastage appears to be fixed in trunk gcc; at least I don't see any
stack allocation for your testcase.

I haven't looked into the scheduling issues at this time.


