[Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code
@ 2020-09-13 17:14 ddiculoiu at dspace dot de
  2022-02-06 18:20 ` [Bug target/97040] " law at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: ddiculoiu at dspace dot de @ 2020-09-13 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97040

            Bug ID: 97040
           Summary: incorrect fused multiply add/subtract instruction
                    generated from C code
           Product: gcc
           Version: 8.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ddiculoiu at dspace dot de
  Target Milestone: ---
            Target: v850-elf

Consider the following simple code:

__attribute__ ((noinline)) float func_a(float x, float y, float z)
{
  return (x-y*z);
}

__attribute__ ((noinline)) float func_b(float x, float y, float z)
{
  return (-x-y*z);
}

volatile float x = 5.0,y = 2.0,z = 1.0;

int main()
{
  volatile float c = func_a(x,y,z);
  volatile float d = func_b(x,y,z);

  return 0;
}
compiled with:
v850-elf-gcc test.c  -mrh850-abi -mv850e3v5 -nostdlib --entry=0 -O2

The gcc generates the following assembly code for the 'func_a' and 'func_b':

00100000 <_func_a>:
  100000:       06 50           mov     r6, r10
  100002:       e7 47 e4 54     fnmaf.s r7, r8, r10
  100006:       7f 00           jmp     [lp]

00100008 <_func_b>:
  100008:       06 50           mov     r6, r10
  10000a:       e7 47 e6 54     fnmsf.s r7, r8, r10
  10000e:       7f 00           jmp     [lp]

In my opinion the 2 instructions 'fnmaf.s' and 'fnmsf.s' are exchanged in the 2
functions. The funtion 'func_a' must contain the 'fnmsf.s' and the 'func_b' the
'fnmaf.s' instruction.

Can someone confirm my observations?
Thanks.

P.S. I did a test with older (gcc 4.9.0) and latest (gcc 10.2.0) gcc. 
Both have the same problem. I think the bug is from the first release where the
fused multiply and add/subtraction were added and nobody detected it yet.

When I compile the same code with Green Hills Compiler I got the correct
assembly instruction for 'func_a' and the 'func_b' uses negated value for x and
the 'fnmsf.s' instruction instead of the 'fnmaf.s' one, but the result is
correct:

00001000 <_func_a>:
    1000:       06 50           mov     r6, r10
    1002:       e7 47 e6 54     fnmsf.s r7, r8, r10
    1006:       7f 00           jmp     [lp]

00001010 <_func_b>:
    1010:       e1 37 48 54     negf.s  r6, r10
    1014:       e7 47 e6 54     fnmsf.s r7, r8, r10
    1018:       7f 00           jmp     [lp]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug target/97040] incorrect fused multiply add/subtract instruction generated from C code
  2020-09-13 17:14 [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code ddiculoiu at dspace dot de
@ 2022-02-06 18:20 ` law at gcc dot gnu.org
  2022-02-09 19:06 ` law at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: law at gcc dot gnu.org @ 2022-02-06 18:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97040

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |law at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-02-06

--- Comment #1 from Jeffrey A. Law <law at gcc dot gnu.org> ---
I think I know what's going on here...

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug target/97040] incorrect fused multiply add/subtract instruction generated from C code
  2020-09-13 17:14 [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code ddiculoiu at dspace dot de
  2022-02-06 18:20 ` [Bug target/97040] " law at gcc dot gnu.org
@ 2022-02-09 19:06 ` law at gcc dot gnu.org
  2022-02-09 19:13 ` cvs-commit at gcc dot gnu.org
  2022-02-11 16:58 ` law at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: law at gcc dot gnu.org @ 2022-02-09 19:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97040

--- Comment #2 from Jeffrey A. Law <law at gcc dot gnu.org> ---
So a bit more background.  I was doing some maintenance work on the v850 a few
years back and noticed an issue with e3v5 testing and FMAs.  I poked around a
bit and had largely ruled out the generic bits of FMA generation as the
problem.  But I was very short of time and didn't have reasonable docs on the
implementation of FMAs for the newer versions of the v850 (e3v5 was added long
after I did the original port).  Renesas's site isn't great for finding the
relevant docs as you have to map from something like e3v5 to a part within the
v850 family:(

SO I set it aside.  You report caused me to dig a bit further.  I finally found
a mapping from e3v5 to a specific part on Renesas's site (RH850G3KH) and was
able to review the specification of these instructions.

Again I managed to convince myself that our gimple FMA generation was correct. 
I also convinced myself that the RTL patterns for the v850 match the
documentation from Renesas.   Then it finally hit me.

The v850 patterns are using pattern names that require specific semantics.  ie,
if the expanders see a pattern fnma<mode> they expect that pattern will
implement very specific semantics.  In particular they expect that the negation
will apply to one of the arguments to the multiplication.   But the instruction
emitted in those patterns negates the final result.  ie, GCC expects this
semantic from the fnma<mode> pattern

  (fma (neg op0) (op1) (op2))

But the v850 chip implements

  (neg (fma (op0) (op1) (op2)))

So the bug is using the "fnma" and "fnms" names for those patterns.  The RTL is
fine, it's strictly a pattern naming issue.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug target/97040] incorrect fused multiply add/subtract instruction generated from C code
  2020-09-13 17:14 [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code ddiculoiu at dspace dot de
  2022-02-06 18:20 ` [Bug target/97040] " law at gcc dot gnu.org
  2022-02-09 19:06 ` law at gcc dot gnu.org
@ 2022-02-09 19:13 ` cvs-commit at gcc dot gnu.org
  2022-02-11 16:58 ` law at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-09 19:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97040

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jeff Law <law@gcc.gnu.org>:

https://gcc.gnu.org/g:eefec38c992e3622a69de9667e91f0cafbff03cc

commit r12-7145-geefec38c992e3622a69de9667e91f0cafbff03cc
Author: Jeff Law <jeffreyalaw@gmail.com>
Date:   Wed Feb 9 14:10:53 2022 -0500

    Avoid using predefined insn name for instruction with different semantics

    This isn't technically a regression, but it only impacts the v850 target
and
    fixes a long standing code correctness issue.

    As outlined in slightly more detail in the PR, the v850 is using the
pattern
    name "fnmasf4" and "fnmssf4" to generate fnmaf.s and fnmsf.s instructions
     respectively.

    Unfortunately fnmasf4 is expected to produce (-a * b) + c and
    fnmssf4 (-a * b) - c.  Those v850 instructions actually negate the entire
    result.

    The fix is trivial.  Use a different pattern name so that the combiner can
    still generate those instructions, but prevent those instructions from
being
    used to implement GCC's notion of what fnmas and fnmss should be.

    This fixes pr97040 as well as a handful of testsuite failures for the v3e5
    multilib.

    gcc/
            PR target/97040
            * config/v850/v850.md (*v850_fnmasf4): Renamed from fnmasf4.
            (*v850_fnmssf4): Renamed from fnmssf4

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug target/97040] incorrect fused multiply add/subtract instruction generated from C code
  2020-09-13 17:14 [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code ddiculoiu at dspace dot de
                   ` (2 preceding siblings ...)
  2022-02-09 19:13 ` cvs-commit at gcc dot gnu.org
@ 2022-02-11 16:58 ` law at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: law at gcc dot gnu.org @ 2022-02-11 16:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97040

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from Jeffrey A. Law <law at gcc dot gnu.org> ---
Fixed on the trunk.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-02-11 16:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-13 17:14 [Bug target/97040] New: incorrect fused multiply add/subtract instruction generated from C code ddiculoiu at dspace dot de
2022-02-06 18:20 ` [Bug target/97040] " law at gcc dot gnu.org
2022-02-09 19:06 ` law at gcc dot gnu.org
2022-02-09 19:13 ` cvs-commit at gcc dot gnu.org
2022-02-11 16:58 ` law at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).