* [Bug target/55073] New: Wrong Neon code generation at -O2 caused by -fschedule-insns
@ 2012-10-25 12:54 eric.batut at allegorithmic dot com
  2012-10-25 23:02 ` [Bug target/55073] " steven at gcc dot gnu.org
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: eric.batut at allegorithmic dot com @ 2012-10-25 12:54 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55073

             Bug #: 55073
           Summary: Wrong Neon code generation at -O2 caused by
                    -fschedule-insns
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: eric.batut@allegorithmic.com


Created attachment 28528
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28528
Zipfile with repro case, build script, disassembly listings and register flow
analysis

Using gcc trunk at rev 192800, compiled with the Android NDK's build-gcc.sh
script (arm-linux-androideabi target).

Compiling the attached repro case at -O2 yields incorrect results. Correct
results are generated for -O2 -fno-schedule-insns.

The command line to build an incorrect program is:
arm-linux-androideabi-g++ -mandroid -march=armv7-a -mfloat-abi=softfp -mfpu=vfp
-mfpu=neon -fpic -marm -O2 -fno-strict-aliasing -Wall -o repro_ko repro.cpp

The command line to build a correct program is:
arm-linux-androideabi-g++ -mandroid -march=armv7-a -mfloat-abi=softfp -mfpu=vfp
-mfpu=neon -fpic -marm -O2 -fno-schedule-insns -fno-strict-aliasing -Wall -o
repro_ok repro.cpp

I am aware that the test case is quite convoluted, but this is because we use
a kind of "universal" 128-bit vector type that autoconverts to and from other
Neon types (not all ARM compilers have -flax-vector-conversions). Still, both
programs should output the same results.
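
For readers without the attachment, here is a minimal sketch of what such an
autoconverting type might look like. It is not the code from repro.cpp: the
names Vec128, s8x16, s16x8 and sub16 are hypothetical, and GCC/Clang vector
extensions stand in for the real Neon types so that the sketch builds on any
target.

```cpp
#include <cstdint>
#include <cstring>

// Stand-ins for two Neon vector types (assumption: GCC/Clang vector
// extensions are used instead of the real arm_neon.h types).
typedef std::int8_t  s8x16 __attribute__((vector_size(16)));
typedef std::int16_t s16x8 __attribute__((vector_size(16)));

// Hypothetical "universal" 128-bit value: it stores raw bytes and converts
// implicitly to and from any 16-byte vector type by copying the bits.
struct Vec128 {
    alignas(16) unsigned char raw[16];

    template <typename V>
    Vec128(const V& v) {
        static_assert(sizeof(V) == 16, "expects a 128-bit vector type");
        std::memcpy(raw, &v, 16);
    }

    template <typename V>
    operator V() const {
        static_assert(sizeof(V) == 16, "expects a 128-bit vector type");
        V v;
        std::memcpy(&v, raw, 16);
        return v;
    }
};

// A helper written against one concrete vector type; thanks to the
// conversion operator it also accepts Vec128 arguments.
inline s16x8 sub16(s16x8 a, s16x8 b) { return a - b; }
```

A wrapper like this reinterprets bits on every conversion, which is why the
compiler ends up emitting many register moves around the actual vector
operations.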

The body of the failing function is pasted below (prolog and epilog omitted):
Correct code (-O2 -fno-schedule-insns):
    vmov    d19, d20  @ v8qi
    vmov    d21, d18  @ v8qi
    vmov    d20, d19  @ v8qi
    vzip.8    d19, d18
    vzip.8    d21, d20
    vswp    d18, d19
    vswp    d20, d21
    vmov    d21, d19  @ v8qi
    vmov    d19, d20  @ v8qi
    vzip.8    d21, d20
    vzip.8    d19, d18
    vswp    d20, d21
    vswp    d18, d19
    vmovl.s8    q10, d21
    vmovl.s8    q9, d19
    vsub.i16    q9, q9, q8
    vsub.i16    q8, q10, q8
    vadd.i16    q8, q9, q8
    vst1.64    {d16-d17}, [r0:128]

Incorrect code (-O2):
    vmov    d19, d20  @ v8qi
    vmov    d22, d18  @ v8qi
    vmov    d21, d20  @ v8qi
    vzip.8    d19, d18
    vzip.8    d22, d21
    vswp    d18, d19
    vmov    d20, d22  @ v8qi
    vmov    d21, d18  @ v8qi
    vzip.8    d22, d19
    vzip.8    d21, d20
    vmovl.s8    q9, d22
    vswp    d20, d21
    vsub.i16    q9, q9, q8
    vmovl.s8    q10, d21
    vsub.i16    q8, q10, q8
    vadd.i16    q8, q9, q8
    vst1.64    {d16-d17}, [r0:128]

I have attached a build.sh script that builds the two versions (OK and KO) of
the output programs. These programs must be run on an Android ARMv7 target; any
such device will do. This probably happens with Linux builds of GCC as well.

I traced the register flow to derive formal expressions for what ends up in
the return value (more precisely, just before the final vsub/vsub/vadd
sequence). The result is in the attached bug_gcc.txt file, which should be read
with hard tabs and a tab width of about 30 for the formatting to line up.
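
This kind of tracing can also be done mechanically. Below is a minimal sketch,
not the method used to produce bug_gcc.txt: each D register is modelled as
eight byte lanes holding symbolic labels, and vzip.8 and vswp are applied to
the labels instead of to real data. The names DReg, make_input, vzip8 and vswp
are hypothetical helpers introduced for illustration. Recall that q8 is
d16:d17, q9 is d18:d19 and q10 is d20:d21.

```cpp
#include <array>
#include <string>
#include <utility>

// One 64-bit D register as 8 byte lanes, each carrying a symbolic label.
using DReg = std::array<std::string, 8>;

// Fill a register with labels "name[0]" .. "name[7]" to mark its initial lanes.
inline DReg make_input(const std::string& name) {
    DReg r;
    for (int i = 0; i < 8; ++i) {
        r[i] = name + "[" + std::to_string(i) + "]";
    }
    return r;
}

// vzip.8 dA, dB: interleave the lanes of both registers. dA receives the
// low half of the interleaved sequence and dB the high half.
inline void vzip8(DReg& a, DReg& b) {
    std::array<std::string, 16> z;
    for (int i = 0; i < 8; ++i) {
        z[2 * i] = a[i];
        z[2 * i + 1] = b[i];
    }
    for (int i = 0; i < 8; ++i) {
        a[i] = z[i];
        b[i] = z[8 + i];
    }
}

// vswp dA, dB: exchange the contents of the two registers.
inline void vswp(DReg& a, DReg& b) { std::swap(a, b); }
```

Replaying both listings with such a model, with vmov as a plain assignment,
and comparing the labels that reach the vmovl.s8 inputs shows lane by lane
where the scheduled code diverges.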

I don't know whether this is related to bug 54300. (By the way, that bug is
still "unconfirmed" although I confirmed that it occurs even with
-fno-strict-aliasing; do I need to provide more information on that one?)

