* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
@ 2021-01-07 10:44 ` crazylht at gmail dot com
2021-01-07 12:01 ` ubizjak at gmail dot com
` (13 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-07 10:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Topi Miettinen from comment #0)
> GCC on x86_64 with `-mfentry` generates invalid code for `-mcmodel=large`.
> The call to `__fentry__` uses plain `call` instruction, but this can only
> address locations within 32 bit range while the target may be anywhere in
> the 64 bit range due to `-mcmodel=large`.
>
> The expected code would be something which loads a 64 bit value to a
> register and then uses register indirect call, so instead of
> call __fentry__
> there needs to be
> movabsq $__fentry__, %rax
> call *%rax
according to PSabi, %rax is not safe.
%rax
temporary register; with variable arguments No
passes information about the number of vector
registers used; 1 st return register
and by the time of output __fentry__ in gcc, register is already accocated, is
there any regs supposed to be safe in the entry of function? or we need to
spill reg to stack and load it back after call, it looks inefficient.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
2021-01-07 10:44 ` [Bug target/98482] " crazylht at gmail dot com
@ 2021-01-07 12:01 ` ubizjak at gmail dot com
2021-01-07 12:24 ` ubizjak at gmail dot com
` (12 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2021-01-07 12:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Hongtao.liu from comment #1)
> and by the time of output __fentry__ in gcc, register is already accocated,
> is there any regs supposed to be safe in the entry of function? or we need
> to spill reg to stack and load it back after call, it looks inefficient.
You can use any calee-saved register here.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
2021-01-07 10:44 ` [Bug target/98482] " crazylht at gmail dot com
2021-01-07 12:01 ` ubizjak at gmail dot com
@ 2021-01-07 12:24 ` ubizjak at gmail dot com
2021-01-07 13:30 ` toiwoton at gmail dot com
` (11 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2021-01-07 12:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Hongtao.liu from comment #1)
> > and by the time of output __fentry__ in gcc, register is already accocated,
> > is there any regs supposed to be safe in the entry of function? or we need
> > to spill reg to stack and load it back after call, it looks inefficient.
>
> You can use any calee-saved register here.
Eh, no - __fentry__ is called before pushes.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (2 preceding siblings ...)
2021-01-07 12:24 ` ubizjak at gmail dot com
@ 2021-01-07 13:30 ` toiwoton at gmail dot com
2021-01-07 13:37 ` ubizjak at gmail dot com
` (10 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: toiwoton at gmail dot com @ 2021-01-07 13:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #4 from Topi Miettinen <toiwoton at gmail dot com> ---
Sorry, I didn't check the ABI. It seems that %r11 and maybe %r10 should be
usable:
Figure 3.4: Register Usage
Register
Usage
Preserved across function calls
%r10
temporary register, used for passing a function’s static chain pointer
No
%r11
temporary register
No
Otherwise, I suppose any register could be used if it's saved:
pushq %reg
movabsq $__fentry__, %reg
call *%reg
popq %reg
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (3 preceding siblings ...)
2021-01-07 13:30 ` toiwoton at gmail dot com
@ 2021-01-07 13:37 ` ubizjak at gmail dot com
2021-01-07 23:50 ` hjl.tools at gmail dot com
` (9 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2021-01-07 13:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Topi Miettinen from comment #4)
> Sorry, I didn't check the ABI. It seems that %r11 and maybe %r10 should be
> usable:
%r11 is already used as PROFILE_COUNT_REGISTER for !NO_PROFILE_COUNTERS
targets.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (4 preceding siblings ...)
2021-01-07 13:37 ` ubizjak at gmail dot com
@ 2021-01-07 23:50 ` hjl.tools at gmail dot com
2021-01-08 2:08 ` crazylht at gmail dot com
` (8 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: hjl.tools at gmail dot com @ 2021-01-07 23:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Last reconfirmed|2021-01-04 00:00:00 |2021-01-07
Status|UNCONFIRMED |NEW
--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563033.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (5 preceding siblings ...)
2021-01-07 23:50 ` hjl.tools at gmail dot com
@ 2021-01-08 2:08 ` crazylht at gmail dot com
2021-01-08 8:58 ` toiwoton at gmail dot com
` (7 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-01-08 2:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #6)
> A patch is posted at
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563033.html
Yes, %r10 is pushed before __fentry__
cat test.c
void func(int (*param)(int));
void outer(int x)
{
int nested(int y)
{
// If x is not used somewhere in here,
// then the function will be "lifted" into
// a normal, non-nested function.
return x + y;
}
func(nested);
}
with -O2 -pg -mfentry got
nested.0:
.LFB1:
.cfi_startproc
pushq %r10
1: call __fentry__
popq %r10
movl (%r10), %eax
addl %edi, %eax
ret
.cfi_endproc
.LFE1:
.size nested.0, .-nested.0
.p2align 4
.globl outer
.type outer, @function
outer:
.LFB0:
.cfi_startproc
1: call __fentry__
subq $56, %rsp
.cfi_def_cfa_offset 64
leaq 64(%rsp), %rax
movq %rax, 32(%rsp)
movl $-17599, %eax
movl %edi, (%rsp)
movw %ax, 4(%rsp)
movl $-17847, %edx
movl $nested.0, %eax
leaq 4(%rsp), %rdi
movl %eax, 6(%rsp)
movw %dx, 10(%rsp)
movq %rsp, 12(%rsp)
movl $-1864106167, 20(%rsp)
call func
addq $56, %rsp
.cfi_def_cfa_offset 8
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (6 preceding siblings ...)
2021-01-08 2:08 ` crazylht at gmail dot com
@ 2021-01-08 8:58 ` toiwoton at gmail dot com
2021-01-08 9:29 ` ubizjak at gmail dot com
` (6 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: toiwoton at gmail dot com @ 2021-01-08 8:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #8 from Topi Miettinen <toiwoton at gmail dot com> ---
I'm unfortunately ignorant to GCC internals and usage of %r10, but otherwise
the patch looks good to me.
For -mcmodel=large -fPIC, the call sequence probably needs to be similar to how
other extern functions are called under those flags:
.L2:
movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %reg0
leaq .L2(%rip), %reg1
movabsq $__fentry__@PLTOFF, %reg2
addq %reg0, %reg1
addq %reg1, %reg2
call *%reg2
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (7 preceding siblings ...)
2021-01-08 8:58 ` toiwoton at gmail dot com
@ 2021-01-08 9:29 ` ubizjak at gmail dot com
2021-01-08 9:34 ` jakub at gcc dot gnu.org
` (5 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2021-01-08 9:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Topi Miettinen from comment #8)
> I'm unfortunately ignorant to GCC internals and usage of %r10, but otherwise
> the patch looks good to me.
>
> For -mcmodel=large -fPIC, the call sequence probably needs to be similar to
> how other extern functions are called under those flags:
>
> .L2:
> movabsq $_GLOBAL_OFFSET_TABLE_-.L2, %reg0
> leaq .L2(%rip), %reg1
> movabsq $__fentry__@PLTOFF, %reg2
> addq %reg0, %reg1
> addq %reg1, %reg2
> call *%reg2
We are only lucky to get one temporary register (%r10), so perhaps the above
could be implemented for NO_PROFILE_COUNTERS targets, where %r11 is also
available.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (8 preceding siblings ...)
2021-01-08 9:29 ` ubizjak at gmail dot com
@ 2021-01-08 9:34 ` jakub at gcc dot gnu.org
2021-01-08 12:54 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-01-08 9:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
If we are emitting for nested functions
pushq %r10
1: call __fentry__
popq %r10
(is it ok to misalign the stack for __fentry__? but then even plain call
__fentry__ actually misaligns it), then perhaps we can do similarly for the PIC
case. But I wonder how does __fentry__ then find the caller if it can't rely
on the return address being right above the return address to the function that
called __fentry__ (appart from unwind info of course, but we don't really emit
.cfi_* directives here either, do we?).
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (9 preceding siblings ...)
2021-01-08 9:34 ` jakub at gcc dot gnu.org
@ 2021-01-08 12:54 ` cvs-commit at gcc dot gnu.org
2021-01-08 14:48 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-08 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:1b885264a48dcd71b7aeb26c0abeb91246724897
commit r11-6548-g1b885264a48dcd71b7aeb26c0abeb91246724897
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Thu Jan 7 14:27:49 2021 -0800
x86-64: Use R10 for profiling large model
R10 is caller-saved. Although it can be used as a static chain register,
it is preserved when calling mcount for nested functions. Use R10 as a
scratch register to call mcount in large model.
gcc/
PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 to call
mcount in large model. Sorry for large model with PIC.
gcc/testsuite/
PR target/98482
* gcc.target/i386/pr98482-1.c: New test.
* gcc.target/i386/pr98482-1.c: Likewise.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (10 preceding siblings ...)
2021-01-08 12:54 ` cvs-commit at gcc dot gnu.org
@ 2021-01-08 14:48 ` cvs-commit at gcc dot gnu.org
2021-01-08 14:49 ` hjl.tools at gmail dot com
` (2 subsequent siblings)
14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-08 14:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:76be18f442948d1a4bc49a7d670b07097f9e5983
commit r11-6552-g76be18f442948d1a4bc49a7d670b07097f9e5983
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Jan 8 05:20:19 2021 -0800
x86-64: Use R10 and R11 for profiling large model with PIC
For NO_PROFILE_COUNTERS targets, R11 is a scratch register. We can use
R10 and R11 to call mcount in large model with PIC.
gcc/
PR target/98482
* config/i386/i386.c (x86_function_profiler): Use R10 and R11
to call mcount in large model with PIC for NO_PROFILE_COUNTERS
targets.
gcc/testsuite/
PR target/98482
* gcc.target/i386/pr98482-2.c: Updated.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (11 preceding siblings ...)
2021-01-08 14:48 ` cvs-commit at gcc dot gnu.org
@ 2021-01-08 14:49 ` hjl.tools at gmail dot com
2021-01-08 14:50 ` ubizjak at gmail dot com
2021-01-08 16:47 ` cvs-commit at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: hjl.tools at gmail dot com @ 2021-01-08 14:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #13 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 11 so far. Please open a new GCC bug for mcount stack
alignment.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (12 preceding siblings ...)
2021-01-08 14:49 ` hjl.tools at gmail dot com
@ 2021-01-08 14:50 ` ubizjak at gmail dot com
2021-01-08 16:47 ` cvs-commit at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: ubizjak at gmail dot com @ 2021-01-08 14:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #14 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #10)
> If we are emitting for nested functions
> pushq %r10
> 1: call __fentry__
> popq %r10
> (is it ok to misalign the stack for __fentry__? but then even plain call
> __fentry__ actually misaligns it), then perhaps we can do similarly for the
> PIC case. But I wonder how does __fentry__ then find the caller if it can't
> rely on the return address being right above the return address to the
> function that called __fentry__ (appart from unwind info of course, but we
> don't really emit .cfi_* directives here either, do we?).
Generic part of the compiler pushes static chain register for nested functions,
so there is little we can do in the target part. If there is a problem with
misaligned stack, then I think __mcount_internal will have to be realigned,
because calls to both, mcount and __fentry__ can be misaligned.
I don't know what to do with __fentry__ argument. Luckily, mcount finds its
argument via frame pointer, so it works there.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [Bug target/98482] -mfentry creates invalid call for -mcmodel=large
2020-12-30 19:13 [Bug c/98482] New: -mfentry creates invalid call for -mcmodel=large toiwoton at gmail dot com
` (13 preceding siblings ...)
2021-01-08 14:50 ` ubizjak at gmail dot com
@ 2021-01-08 16:47 ` cvs-commit at gcc dot gnu.org
14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-08 16:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98482
--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:745d04e796c1a7ebcea0185d0742d58b0c0030ab
commit r11-6557-g745d04e796c1a7ebcea0185d0742d58b0c0030ab
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri Jan 8 08:41:38 2021 -0800
x86-64: Require lp64 for PR target/98482 tests
Require lp64 for PR target/98482 tests since -mcmodel=large is isn't
supported for x32.
PR target/98482
* gcc.target/i386/pr98482-1.c: Require lp64.
* gcc.target/i386/pr98482-2.c: Likewise.
^ permalink raw reply [flat|nested] 16+ messages in thread