* [Bug target/53362] gcc 4.7 generates invalid code with -O3 and -mtune=bdver2
2012-05-15 15:38 [Bug c/53362] New: gcc 4.7 generates invalid code with -O3 and -mtune=bdver2 valerio at aimale dot com
@ 2012-05-15 17:49 ` pinskia at gcc dot gnu.org
2012-05-15 19:02 ` valerio at aimale dot com
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-05-15 17:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53362
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2012-05-15
Component|c |target
Ever Confirmed|0 |1
Severity|major |normal
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-05-15 17:43:29 UTC ---
Can you attach a testcase that can compile and run?
* [Bug target/53362] gcc 4.7 generates invalid code with -O3 and -mtune=bdver2
From: valerio at aimale dot com @ 2012-05-15 19:02 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53362
--- Comment #2 from Valerio Aimale <valerio at aimale dot com> 2012-05-15 18:07:01 UTC ---
Andrew,
thank you for your email. I'll extract some code from the R code base
and generate a test case.
Valerio
* [Bug target/53362] gcc 4.7 generates invalid code with -O3 and -mtune=bdver2
From: valerio at aimale dot com @ 2012-05-15 22:15 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53362
--- Comment #3 from Valerio Aimale <valerio at aimale dot com> 2012-05-15 22:13:47 UTC ---
First of all, I made a mistake. The FX-8150 (which is family 15h) requires
-march=bdver1, not bdver2. The SIGSEGV, however, happens even with bdver1.
To reproduce, compile R with:
CC=gcc-4.7 \
CXX=g++-4.7 \
OBJC=gcc-4.7 \
FC=gfortran-4.7 \
F77=gfortran-4.7 \
CFLAGS="-g -O3 -march=bdver1" \
CXXFLAGS="-g -O3 -march=bdver1" \
OBJCFLAGS="-g -O3 -march=bdver1" \
FCFLAGS="-g -O3 -march=bdver1" \
FFLAGS="-g -O3 -march=bdver1" \
./configure \
--enable-R-shlib \
--enable-threads=posix \
--with-readline \
--with-system-pcre \
--prefix=/usr/local/pkg/R-2.15.0-k15 \
--with-x \
--with-system-zlib \
--with-cairo \
--with-jpeglib \
--with-blas \
--with-lapack \
--with-tcltk \
--with-libpng
Second, the SIGSEGV actually happens inside eval.c, in bcEval(). Here's a more
detailed description:
R has a "just in time" compiler that compiles R code to bytecode for a virtual
machine (à la Java). The SIGSEGV, which occurs when optimizing with
-O3 -march=bdver1, happens in the JIT interpreter.
The JIT essentially has a switch { case OPERAND 1: ; case OPERAND 2: ... } with a
program counter called pc.
This snippet
---
BEGIN_MACHINE {
OP(BCMISMATCH, 0): error(_("byte code version mismatch"));
OP(RETURN, 0): value = GETSTACK(-1); goto done;
OP(GOTO, 1):
{
int label = GETOP();
BC_CHECK_SIGINT();
pc = codebase + label;
NEXT();
}
....
---
which, when preprocessed, translates to:
------------------
(__extension__ ({goto *(*pc++).v;}));
init: { loop: switch(which++) {
case BCMISMATCH_OP:
  opinfo[BCMISMATCH_OP].addr = (__extension__ &&op_BCMISMATCH);
  opinfo[BCMISMATCH_OP].argc = (0);
  goto loop;
op_BCMISMATCH:
  Rf_error(dcgettext (((void *)0), "byte code version mismatch", __LC_MESSAGES));
case RETURN_OP:
  opinfo[RETURN_OP].addr = (__extension__ &&op_RETURN);
  opinfo[RETURN_OP].argc = (0);
  goto loop;
op_RETURN:
  value = (*(R_BCNodeStackTop + (-1)));
  goto done;
case GOTO_OP:
  opinfo[GOTO_OP].addr = (__extension__ &&op_GOTO);
  opinfo[GOTO_OP].argc = (1);
  goto loop;
op_GOTO:
  {
    int label = (*pc++).i;
    do { if (++evalcount > 1000) { R_CheckUserInterrupt(); evalcount = 0; } } while (0);
    pc = codebase + label;
    (__extension__ ({goto *(*pc++).v;}));
  }
case BRIFNOT_OP:
  opinfo[BRIFNOT_OP].addr = (__extension__ &&op_BRIFNOT);
  opinfo[BRIFNOT_OP].argc = (2);
  goto loop;
op_BRIFNOT:
  {
    int callidx = (*pc++).i;
    int label = (*pc++).i;
-----------------
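For readers unfamiliar with the pattern, the dispatch scheme in the preprocessed source can be sketched as a small standalone program. This is only an illustration, not R's code and not a reproducer for the crash: the opcode set, the names run_threaded, addr and argc, and the fixed buffer sizes are all made up here. It relies on the GNU C "labels as values" extension (&&label and goto *expr), the same extension the preprocessed source uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; not R's real instruction set. */
enum { OP_PUSH, OP_ADD, OP_GOTO, OP_RETURN, OP_COUNT };

/* One bytecode cell: a label address or an integer operand, mirroring
   the .v / .i members visible in the preprocessed source. */
typedef union { void *v; int i; } cell;

#define CODE_MAX  64
#define STACK_MAX 16

/* Translate prog (opcodes and operands as ints) into threaded code,
   then execute it. Returns the value on top of the stack at OP_RETURN.
   The program is assumed to be well formed and to end in OP_RETURN. */
int run_threaded(const int *prog, size_t n)
{
    /* GNU C extension: && takes the address of a label. */
    static void *const addr[OP_COUNT] = {
        [OP_PUSH] = &&op_PUSH, [OP_ADD]    = &&op_ADD,
        [OP_GOTO] = &&op_GOTO, [OP_RETURN] = &&op_RETURN
    };
    /* Number of operands per opcode, like opinfo[].argc. */
    static const int argc[OP_COUNT] = {
        [OP_PUSH] = 1, [OP_ADD] = 0, [OP_GOTO] = 1, [OP_RETURN] = 0
    };
    cell code[CODE_MAX];
    int stack[STACK_MAX];
    int sp = 0;

    assert(n > 0 && n <= CODE_MAX);

    /* Replace each opcode by the address of its handler; copy operands. */
    for (size_t k = 0; k < n; ) {
        int op = prog[k];
        assert(op >= 0 && op < OP_COUNT);
        code[k++].v = addr[op];
        for (int a = 0; a < argc[op]; a++) {
            assert(k < n);
            code[k].i = prog[k];
            k++;
        }
    }

    const cell *codebase = code;
    const cell *pc = code;

/* Fetch the next handler address and jump to it (an indirect jump). */
#define NEXT() goto *(*pc++).v

    NEXT();

op_PUSH:
    assert(sp < STACK_MAX);
    stack[sp++] = (*pc++).i;
    NEXT();
op_ADD:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
op_GOTO:
    {
        int label = (*pc++).i;
        assert(label >= 0 && (size_t)label < n);
        pc = codebase + label;
        NEXT();
    }
op_RETURN:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Each NEXT() is an indirect jump of the same kind as the jmpq *%rax discussed in this report. A program such as { OP_PUSH, 2, OP_GOTO, 6, OP_PUSH, 100, OP_PUSH, 40, OP_ADD, OP_RETURN } skips the middle push and adds 2 and 40.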
Now the line
goto *(*pc++).v;
when compiled with -O3 -march=bdver1, translates to:
0x00007ffff786bb4e <+366>: lea 0x38(%r15),%rbp
0x00007ffff786bb52 <+370>: data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
0x00007ffff786bb60 <+384>: jmpq *%rax
0x00007ffff786bb62 <+386>: nopw 0x0(%rax,%rax,1)
I believe that the goto becomes jmpq *%rax, with the nopw instructions before
and after it being just fillers for 64-bit alignment (I am not sure, though, as
I don't understand those nopw instructions).
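The padding reading can be checked with a little address arithmetic. The helper below is a hypothetical illustration (pad_to is not a GCC or binutils function); it computes how many filler bytes are needed to reach the next multiple of a power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Number of filler bytes needed to round addr up to a multiple of
   align. align must be a power of two. Hypothetical helper, for
   illustration only. */
uint64_t pad_to(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (addr & (align - 1))) & (align - 1);
}
```

With the addresses from the listing, pad_to(0x7ffff786bb52, 16) is 14, which is exactly the distance from the first nopw at ...bb52 to the jmpq at ...bb60. The jmpq address is in fact a multiple of 32 as well. That is consistent with the long nopw being alignment filler, although the arithmetic alone does not show why the compiler chose to align that instruction.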
When executing, the code had to run some bytecode; before executing the
instruction at 0x00007ffff786bb60, the saved return rip correctly contains
0x7ffff787ad4d:
(gdb) stepi
0x00007ffff786bb60 4033 BEGIN_MACHINE {
(gdb) info frame 0
Stack frame at 0x7ffffffeff20:
rip = 0x7ffff786bb60 in bcEval (eval.c:4033); saved rip 0x7ffff787ad4d
called by frame at 0x7fffffff0110
source language c.
Arglist at 0x7ffffffef978, args: body=body@entry=0x153ecb0,
rho=rho@entry=0x1540150, useCache=TRUE
Locals at 0x7ffffffef978, Previous frame's sp is 0x7ffffffeff20
Saved registers:
rbx at 0x7ffffffefee8, rbp at 0x7ffffffefef0, r12 at 0x7ffffffefef8, r13 at
0x7ffffffeff00, r14 at 0x7ffffffeff08, r15 at 0x7ffffffeff10, rip at
0x7ffffffeff18
(gdb) info program
Using the running image of child Thread 0x7ffff7fde780 (LWP 25913).
Program stopped at 0x7ffff786bb60.
Once I execute the instruction at 0x00007ffff786bb60:
(gdb) stepi
bcEval (useCache=FALSE, rho=0x0, body=0x0) at eval.c:4217
4217 OP(GETFUN, 1):
(gdb) info frame 0
Stack frame at 0x7ffffffefe90:
rip = 0x7ffff7890f97 in bcEval (eval.c:4217); saved rip 0x7ffffffeff30
called by frame at 0x7ffffffefe98
source language c.
Arglist at 0x7ffffffef978, args: useCache=FALSE, rho=0x0, body=0x0
Locals at 0x7ffffffef978, Previous frame's sp is 0x7ffffffefe90
Saved registers:
rbx at 0x7ffffffefe58, rbp at 0x7ffffffefe60, r12 at 0x7ffffffefe68, r13 at
0x7ffffffefe70, r14 at 0x7ffffffefe78, r15 at 0x7ffffffefe80, rip at
0x7ffffffefe88
The saved return rip is now 0x7ffffffeff30, which is not a code address (it
falls in the same range as the stack addresses shown above), and the SIGSEGV
occurs when the next retq is executed.
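The two gdb frame dumps can be compared numerically. The sketch below assumes the usual x86-64 convention that the return address is stored 8 bytes below the frame address gdb prints as "Stack frame at"; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the saved rip slot for a frame whose "Stack frame at"
   address (the CFA) is cfa. Assumes the x86-64 convention of the
   return address sitting 8 bytes below the CFA. */
uint64_t saved_rip_slot(uint64_t cfa)
{
    return cfa - 8;
}

/* Nonzero if addr lies in the half-open range [lo, hi). */
int in_range(uint64_t addr, uint64_t lo, uint64_t hi)
{
    return addr >= lo && addr < hi;
}
```

Applied to the dumps: saved_rip_slot(0x7ffffffeff20) is 0x7ffffffeff18 and saved_rip_slot(0x7ffffffefe90) is 0x7ffffffefe88, matching the "rip at" values gdb reports before and after the jump. The frame address therefore moved down by 0x90 bytes across a jump within a single invocation of bcEval, and the value read as the return address, 0x7ffffffeff30, lies between the original frame address and the caller's frame at 0x7fffffff0110. One possible reading is that the stack pointer (or the unwind information) at the jump target does not agree with the state at the jump site; this arithmetic cannot tell which.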
When, instead, I compile with "-O -march=bdver1",
that line, goto *(*pc++).v;, compiles to:
209d: 48 83 c3 38 add $0x38,%rbx
20a1: c7 44 24 50 00 00 00 movl $0x0,0x50(%rsp)
20a8: 00
20a9: ff e0 jmpq *%rax
20ab: 41 bd 00 00 00 00 mov $0x0,%r13d
20b1: 4c 8d 35 76 36 00 00 lea 0x3676(%rip),%r14 # 572e <bcEval+0x3861>
20b8: 89 d0 mov %edx,%eax
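Here jmpq *%rax has no alignment padding in front of it (the lone 00 byte at
20a8 is the last byte of the preceding movl's immediate, which objdump wrapped
onto its own line), and it executes correctly.
Without any optimization, i.e. with only -march=bdver1, it compiles to: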
9706: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 970d <bcEval+0x28c>
970d: 8b 04 02 mov (%rdx,%rax,1),%eax
9710: 48 63 d0 movslq %eax,%rdx
9713: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 971a <bcEval+0x299>
971a: 48 01 d0 add %rdx,%rax
971d: ff e0 jmpq *%rax
971f: 48 8d 05 13 00 00 00 lea 0x13(%rip),%rax # 9739 <bcEval+0x2b8>
9726: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 972d <bcEval+0x2ac>
972d: c7 05 00 00 00 00 00 movl $0x0,0x0(%rip) # 9737 <bcEval+0x2b6>
* [Bug target/53362] gcc 4.7 generates invalid code with -O3 and -mtune=bdver2
From: valerio at aimale dot com @ 2012-05-15 22:24 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53362
--- Comment #4 from Valerio Aimale <valerio at aimale dot com> 2012-05-15 22:15:19 UTC ---
Andrew,
I have been unable to come up with a test case, but I dug up more
information. R has a "just in time" compiler that compiles R code to
bytecode for a virtual machine (à la Java). The SIGSEGV, which occurs when
optimizing with -O3 -march=bdver1, happens in the JIT interpreter.
The assembler code I pointed to in the original bug report is not where
the SIGSEGV happens.
Here are the details; I had to do some major digging with gdb to find the
problem. The snippet, its preprocessed form, the disassembly at each
optimization level, and the gdb session are identical to those in comment #3.
Is this enough for you to work with?
Thanks,
Valerio
* [Bug target/53362] gcc 4.7 generates invalid code with -O3 and -mtune=bdver2
From: ubizjak at gmail dot com @ 2012-05-16 6:31 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53362
--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-16 06:13:53 UTC ---
(In reply to comment #4)
> Is this enough for you to work with?
No, please follow the instructions in [1]. Also, since this is a runtime
problem, we will need (preferably minimized) source that can be compiled into
an executable that fails.
[1] http://gcc.gnu.org/bugs/#report