public inbox for gcc-help@gcc.gnu.org
From: David Brown <david@westcontrol.com>
To: Cody Rigney <codyrigney92@gmail.com>, <gcc-help@gcc.gnu.org>
Subject: Re: Compiler optimizing variables in inline assembly
Date: Thu, 20 Feb 2014 09:54:00 -0000
Message-ID: <5305D0D4.6080105@westcontrol.com>
In-Reply-To: <CA+1=iYaWg6OyzNjM9K2Qb1fn40ei0Ls+3AhVyXcg-h2Pm3xQaw@mail.gmail.com>

Hi,

I haven't read through the code at all, but I will give you a little
general advice.

Try to cut the code to the absolute minimum that shows the problem.  It
makes it easier for you to work with and check, and it makes it easier
for other people to examine.  Also make sure that the code has no other
dependencies such as extra headers - ideally people should be able to
compile the code themselves and test it (I realise this is difficult for
those who don't have an ARM handy).

If code works without optimisation but fails with optimisation, or only
works when you make a variable volatile, there is always a bug somewhere.
Occasionally it is a bug in the compiler - but most often it is a bug
in the code.  Either way, it is important to figure out the root cause,
and not try to hide it by making things volatile (though that might be a
good temporary fix for a compiler bug).
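
One common root cause of this kind, sketched here as an assumption and
not taken from the code in this thread: the asm reads a variable through
a pointer, but the compiler is only told about the pointer.  An
optimising build may then delay or drop the store to that variable,
and making it volatile merely hides the missing declaration.  The
helper name load_u32 is hypothetical, and the x86 and plain C branches
are there only so the sketch builds on non-ARM machines.

```c
#include <stdint.h>

/* Hypothetical helper: loads *p using inline assembly.
   The "m" (*p) input tells the compiler that the asm reads the memory
   behind p, so earlier stores to *p have to be done before the asm
   and cannot be discarded as dead. */
static inline uint32_t load_u32(const uint32_t *p)
{
    uint32_t value;
#if defined(__arm__)
    __asm__ ("ldr %0, [%1]" : "=r" (value) : "r" (p), "m" (*p));
#elif defined(__aarch64__)
    __asm__ ("ldr %w0, [%1]" : "=r" (value) : "r" (p), "m" (*p));
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl (%1), %0" : "=r" (value) : "r" (p), "m" (*p));
#else
    value = *p;   /* plain C fallback for other targets */
#endif
    return value;
}
```

With the memory input declared, no volatile is needed on the variable
that p points to.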

I am not familiar with Neon (and not as good as I should be at ARM
assembly in general), but it looks to me that you have used specific
registers in your inline assembly, and assumed that the compiler keeps
certain variables in specific registers.  Don't do that.  When you have
turned off all optimisation, the compiler is consistent about which
registers it uses for different purposes - when optimising, register
usage changes in ways you cannot predict.  You must be explicit - all
data going into your assembly must be declared as inputs, and all data
coming out of the assembly must be declared as outputs.  And if you use
specific registers, you need to tell the compiler about them (as
"clobbers") - and be aware that the compiler might be using those
registers for the input or output values.
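
A minimal sketch of what "being explicit" can look like, assuming GCC
extended asm syntax.  The function add_via_scratch and the choice of
scratch register (r3 on ARM, ecx on x86) are hypothetical and not from
the code in this thread; the x86 and plain C branches exist only so the
example can be built on machines without an ARM toolchain.

```c
#include <stdint.h>

/* Hypothetical example: computes a + b through a fixed scratch register.
   a and b are declared as inputs, result as an output, and the scratch
   register is listed as a clobber so the compiler keeps its own values
   (and the operands) out of it. */
static inline uint32_t add_via_scratch(uint32_t a, uint32_t b)
{
    uint32_t result;
#if defined(__arm__)
    __asm__ ("mov r3, %1\n\t"
             "add %0, r3, %2"
             : "=r" (result)          /* output */
             : "r" (a), "r" (b)       /* inputs */
             : "r3");                 /* clobbered scratch register */
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %1, %%ecx\n\t"
             "addl %2, %%ecx\n\t"
             "movl %%ecx, %0"
             : "=r" (result)          /* output */
             : "r" (a), "r" (b)       /* inputs */
             : "ecx", "cc");          /* scratch register and flags */
#else
    result = a + b;                   /* plain C fallback */
#endif
    return result;
}
```

The asm text never names a register for a, b or result; the compiler
picks them and substitutes them for %0, %1 and %2, so the code keeps
working when optimisation changes the register allocation.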

Getting inline assembly right is not easy, and it is often best to work
with several small assembly statements rather than large ones - I
usually make a "static inline" function around a line or two of inline
assembly and then use that function in the code as needed.  It can make
the result a lot clearer, and makes it easier to mix the C and assembly
- the end result is often better than I would make in pure assembly.
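
As a sketch of that style, assuming a byte swap as the operation (the
names byte_swap32 and swap_all are hypothetical, and the ARM branch
assumes ARMv6 or later for the rev instruction): one instruction is
wrapped in a "static inline" function, and the loop around it stays in
ordinary C so the compiler can schedule and unroll it as it sees fit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper around a single instruction. */
static inline uint32_t byte_swap32(uint32_t x)
{
#if defined(__arm__)
    __asm__ ("rev %0, %1" : "=r" (x) : "r" (x));
#elif defined(__aarch64__)
    __asm__ ("rev %w0, %w1" : "=r" (x) : "r" (x));
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ ("bswap %0" : "+r" (x));
#else
    /* plain C fallback for other targets */
    x = (x >> 24) | ((x >> 8) & 0x0000ff00u)
      | ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
    return x;
}

/* The surrounding logic is plain C that calls the wrapper. */
static void swap_all(uint32_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] = byte_swap32(data[i]);
}
```

Each wrapper has only one input and one output to get right, which is
much easier to check than a large asm block with many operands.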

Finally, is there a good reason why you need inline assembly rather than
the neon intrinsics provided by gcc?

<http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html>

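For comparison, a small sketch of the intrinsics approach, assuming a
simple element-wise float addition (the function add_floats is
hypothetical and unrelated to the OpenCV code).  vld1q_f32, vaddq_f32
and vst1q_f32 come from <arm_neon.h>; when NEON is not enabled, the
scalar loop does all the work so the example still builds elsewhere.

```c
#include <stddef.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical example: dst[i] = a[i] + b[i] for n floats. */
static void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Four floats per iteration; the compiler allocates the q registers. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop on targets without NEON. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

There are no register names, operand lists or clobbers to get wrong
here, and the optimiser can see through the intrinsics in a way it
cannot with an asm block.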

mvh.,

David




On 19/02/14 20:04, Cody Rigney wrote:
> Hi,
> 
> I'm trying to add NEON optimizations to OpenCV's LK optical flow.  See
> link below.
> https://github.com/Itseez/opencv/blob/2.4/modules/video/src/lkpyramid.cpp
> 
> The gcc version could vary since this is an open source project, but
> the one I'm currently using is 4.8.1. The target architecture is ARMv7
> w/ NEON. The processor I'm testing on is an ARM
> Cortex-A15 (big.LITTLE).
> 
> The problem is, in release mode (where optimizations are set) it does
> not work properly. However, in debug mode, it works fine. I tracked
> down a specific variable (FLT_SCALE) that was being optimized out and
> made it volatile and that part worked fine after that. However, I'm
> still having incorrect behavior from some other optimization.  I'm new
> to inline assembly, so I thought maybe I'm doing something wrong
> that's not telling the compiler that I'm using a certain variable.
> 
> Below is the code at its current state. Ignore all the comments and
> volatiles (for testing this problem) everywhere. It's WIP. I removed
> unnecessary functions and code so it would be easier to see. I think
> the problem is in the bottom-most asm block because if I do if(false)
> to skip it, I don't run into the problem. Thanks.
> 

<snip>

