* [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
@ 2012-12-24 15:40 Uros Bizjak
2012-12-24 21:32 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2012-12-24 15:40 UTC (permalink / raw)
To: gcc-patches; +Cc: Richard Henderson, Jakub Jelinek, jh, H.J. Lu
[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]
Hello!
Currently, we use %rbx as REAL_PIC_OFFSET_TABLE_REGNUM on x86_64.
Since this register gets marked as fixed reg in
ix86_conditional_register_usage, we get into troubles with insns that
use %rbx (cmpxchg, cpuid). According to x86_64 psABI, we are free to
use any register, so attached patch changes %rbx with %r15 (also
following the example in the psABI). This patch has no implications on
small code model (that doesn't use REAL_PIC_OFFSET_TABLE_REGNUM
anyway), but on medium and large code model fixes usage of cpuid.h
(please see PR 55712 [1]) and avoids a pair of xchgs around cmpxchg or
cpuid instructions.
Probably, we can also enhance ix86_select_alt_pic_regnum for x86_64,
but this is 4.9 material.
2012-12-24 Uros Bizjak <ubizjak@gmail.com>
* config/i386/i386.md (R14_REG, R15_REG): New constants.
* config/i386/i386.h (REAL_PIC_OFFSET_TABLE_REGNUM): Use R15_REG
for 64bit targets.
[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55712
Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Uros.
[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 791 bytes --]
Index: i386.h
===================================================================
--- i386.h (revision 194703)
+++ i386.h (working copy)
@@ -1173,7 +1173,7 @@
the pic register when possible. The change is visible after the
prologue has been emitted. */
-#define REAL_PIC_OFFSET_TABLE_REGNUM BX_REG
+#define REAL_PIC_OFFSET_TABLE_REGNUM (TARGET_64BIT ? R15_REG : BX_REG)
#define PIC_OFFSET_TABLE_REGNUM \
((TARGET_64BIT && ix86_cmodel == CM_SMALL_PIC) \
Index: i386.md
===================================================================
--- i386.md (revision 194703)
+++ i386.md (working copy)
@@ -301,6 +301,8 @@
(R11_REG 40)
(R12_REG 41)
(R13_REG 42)
+ (R14_REG 43)
+ (R15_REG 44)
(XMM8_REG 45)
(XMM9_REG 46)
(XMM10_REG 47)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-24 15:40 [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64 Uros Bizjak
@ 2012-12-24 21:32 ` Andi Kleen
2012-12-24 22:26 ` Leif Ekblad
0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2012-12-24 21:32 UTC (permalink / raw)
To: Uros Bizjak; +Cc: gcc-patches, Richard Henderson, Jakub Jelinek, jh, H.J. Lu
Uros Bizjak <ubizjak@gmail.com> writes:
> Hello!
>
> Currently, we use %rbx as REAL_PIC_OFFSET_TABLE_REGNUM on x86_64.
> Since this register gets marked as fixed reg in
> ix86_conditional_register_usage, we get into troubles with insns that
> use %rbx (cmpxchg, cpuid). According to x86_64 psABI, we are free to
> use any register, so attached patch changes %rbx with %r15 (also
> following the example in the psABI). This patch has no implications on
> small code model (that doesn't use REAL_PIC_OFFSET_TABLE_REGNUM
> anyway), but on medium and large code model fixes usage of cpuid.h
> (please see PR 55712 [1]) and avoids a pair of xchgs around cmpxchg or
> cpuid instructions.
So everyone who worked around this and use r15 now broke.
It would be probably better to just teach the register allocator
to spill those registers as needed, to handle some asm() statement.
Not sure how hard that would be.
-Andi
--
ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-24 21:32 ` Andi Kleen
@ 2012-12-24 22:26 ` Leif Ekblad
2012-12-25 5:54 ` Mike Frysinger
0 siblings, 1 reply; 17+ messages in thread
From: Leif Ekblad @ 2012-12-24 22:26 UTC (permalink / raw)
To: Andi Kleen, Uros Bizjak
Cc: gcc-patches, Richard Henderson, Jakub Jelinek, jh, H.J. Lu
In the case of cpuid, the code is hardly performance sensitive, and probably
runs only at startup. An alternative solution for the broken code here is to
move the result from rbx to another register, and to save/restore rbx.
Currently, this is the only place in libgcc and newlib affected by this
problem.
Leif Ekblad
----- Original Message -----
From: "Andi Kleen" <andi@firstfloor.org>
To: "Uros Bizjak" <ubizjak@gmail.com>
Cc: <gcc-patches@gcc.gnu.org>; "Richard Henderson" <rth@redhat.com>; "Jakub
Jelinek" <jakub@redhat.com>; <jh@suse.cz>; "H.J. Lu" <hjl.tools@gmail.com>
Sent: Monday, December 24, 2012 10:32 PM
Subject: Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on
x86_64
> Uros Bizjak <ubizjak@gmail.com> writes:
>
>> Hello!
>>
>> Currently, we use %rbx as REAL_PIC_OFFSET_TABLE_REGNUM on x86_64.
>> Since this register gets marked as fixed reg in
>> ix86_conditional_register_usage, we get into troubles with insns that
>> use %rbx (cmpxchg, cpuid). According to x86_64 psABI, we are free to
>> use any register, so attached patch changes %rbx with %r15 (also
>> following the example in the psABI). This patch has no implications on
>> small code model (that doesn't use REAL_PIC_OFFSET_TABLE_REGNUM
>> anyway), but on medium and large code model fixes usage of cpuid.h
>> (please see PR 55712 [1]) and avoids a pair of xchgs around cmpxchg or
>> cpuid instructions.
>
> So everyone who worked around this and use r15 now broke.
>
> It would be probably better to just teach the register allocator
> to spill those registers as needed, to handle some asm() statement.
> Not sure how hard that would be.
>
> -Andi
>
> --
> ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-24 22:26 ` Leif Ekblad
@ 2012-12-25 5:54 ` Mike Frysinger
2012-12-25 19:12 ` Uros Bizjak
0 siblings, 1 reply; 17+ messages in thread
From: Mike Frysinger @ 2012-12-25 5:54 UTC (permalink / raw)
To: gcc-patches
Cc: Leif Ekblad, Andi Kleen, Uros Bizjak, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
[-- Attachment #1: Type: Text/Plain, Size: 873 bytes --]
On Monday 24 December 2012 17:26:47 Leif Ekblad wrote:
> In the case of cpuid, the code is hardly performance sensitive, and
> probably runs only at startup. An alternative solution for the broken code
> here is to move the result from rbx to another register, and to
> save/restore rbx. Currently, this is the only place in libgcc and newlib
> affected by this problem.
it's not a question of performance. i can't remember how many various
projects i've had to tweak the inline asm code to work with __PIC__ (either
because it's going into shared library code or it's being built as a PIE).
Andi's point is now we have to redo all of that work a 2nd time and handle two
different cases depending on gcc version ? it'd be a _lot_ better if gcc were
intelligent and end users didn't have to code crap like stuffing %ebx somewhere
temporarily.
-mike
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-25 5:54 ` Mike Frysinger
@ 2012-12-25 19:12 ` Uros Bizjak
2012-12-25 19:27 ` Mike Frysinger
0 siblings, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2012-12-25 19:12 UTC (permalink / raw)
To: Mike Frysinger
Cc: gcc-patches, Leif Ekblad, Andi Kleen, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
On Tue, Dec 25, 2012 at 6:54 AM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Monday 24 December 2012 17:26:47 Leif Ekblad wrote:
>> In the case of cpuid, the code is hardly performance sensitive, and
>> probably runs only at startup. An alternative solution for the broken code
>> here is to move the result from rbx to another register, and to
>> save/restore rbx. Currently, this is the only place in libgcc and newlib
>> affected by this problem.
>
> it's not a question of performance. i can't remember how many various
> projects i've had to tweak the inline asm code to work with __PIC__ (either
> because it's going into shared library code or it's being built as a PIE).
> Andi's point is now we have to redo all of that work a 2nd time and handle two
> different cases depending on gcc version ? it'd be a _lot_ better if gcc were
> intelligent and end users didn't have to code crap like stuffing %ebx somewhere
> temporarily.
Please note that we are not talking about 32bit code, where this would
make a difference, but for 64bit targets with -mcmodel=medium and
-mcmodel=large exclusively. The default x64_64 -mcmodel=small doesn't
use PIC register, other code models are rarely used, so I sincerely
doubt that any %rbx workarounds were needed in the past for x86_64.
Uros.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-25 19:12 ` Uros Bizjak
@ 2012-12-25 19:27 ` Mike Frysinger
2012-12-26 9:53 ` Uros Bizjak
0 siblings, 1 reply; 17+ messages in thread
From: Mike Frysinger @ 2012-12-25 19:27 UTC (permalink / raw)
To: Uros Bizjak
Cc: gcc-patches, Leif Ekblad, Andi Kleen, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
[-- Attachment #1: Type: Text/Plain, Size: 1628 bytes --]
On Tuesday 25 December 2012 14:12:09 Uros Bizjak wrote:
> On Tue, Dec 25, 2012 at 6:54 AM, Mike Frysinger <vapier@gentoo.org> wrote:
> > On Monday 24 December 2012 17:26:47 Leif Ekblad wrote:
> >> In the case of cpuid, the code is hardly performance sensitive, and
> >> probably runs only at startup. An alternative solution for the broken
> >> code here is to move the result from rbx to another register, and to
> >> save/restore rbx. Currently, this is the only place in libgcc and
> >> newlib affected by this problem.
> >
> > it's not a question of performance. i can't remember how many various
> > projects i've had to tweak the inline asm code to work with __PIC__
> > (either because it's going into shared library code or it's being built
> > as a PIE). Andi's point is now we have to redo all of that work a 2nd
> > time and handle two different cases depending on gcc version ? it'd be
> > a _lot_ better if gcc were intelligent and end users didn't have to code
> > crap like stuffing %ebx somewhere temporarily.
>
> Please note that we are not talking about 32bit code, where this would
> make a difference, but for 64bit targets with -mcmodel=medium and
> -mcmodel=large exclusively. The default x64_64 -mcmodel=small doesn't
> use PIC register, other code models are rarely used, so I sincerely
> doubt that any %rbx workarounds were needed in the past for x86_64.
i'm aware. the comment still applies. you're breaking asm code that used to
work because the gcc inline asm code isn't intelligent enough (currently) to
transparently handle the PIC register for the user.
-mike
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-25 19:27 ` Mike Frysinger
@ 2012-12-26 9:53 ` Uros Bizjak
2012-12-26 20:16 ` Andi Kleen
0 siblings, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2012-12-26 9:53 UTC (permalink / raw)
To: Mike Frysinger
Cc: gcc-patches, Leif Ekblad, Andi Kleen, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
On Tue, Dec 25, 2012 at 8:27 PM, Mike Frysinger <vapier@gentoo.org> wrote:
>> >> In the case of cpuid, the code is hardly performance sensitive, and
>> >> probably runs only at startup. An alternative solution for the broken
>> >> code here is to move the result from rbx to another register, and to
>> >> save/restore rbx. Currently, this is the only place in libgcc and
>> >> newlib affected by this problem.
>> >
>> > it's not a question of performance. i can't remember how many various
>> > projects i've had to tweak the inline asm code to work with __PIC__
>> > (either because it's going into shared library code or it's being built
>> > as a PIE). Andi's point is now we have to redo all of that work a 2nd
>> > time and handle two different cases depending on gcc version ? it'd be
>> > a _lot_ better if gcc were intelligent and end users didn't have to code
>> > crap like stuffing %ebx somewhere temporarily.
>>
>> Please note that we are not talking about 32bit code, where this would
>> make a difference, but for 64bit targets with -mcmodel=medium and
>> -mcmodel=large exclusively. The default x64_64 -mcmodel=small doesn't
>> use PIC register, other code models are rarely used, so I sincerely
>> doubt that any %rbx workarounds were needed in the past for x86_64.
>
> i'm aware. the comment still applies. you're breaking asm code that used to
> work because the gcc inline asm code isn't intelligent enough (currently) to
> transparently handle the PIC register for the user.
Can you please post a real-world example, where using %r15 would break
existing code?
Uros.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-26 9:53 ` Uros Bizjak
@ 2012-12-26 20:16 ` Andi Kleen
2012-12-27 8:08 ` Uros Bizjak
0 siblings, 1 reply; 17+ messages in thread
From: Andi Kleen @ 2012-12-26 20:16 UTC (permalink / raw)
To: Uros Bizjak
Cc: Mike Frysinger, gcc-patches, Leif Ekblad, Andi Kleen,
Richard Henderson, Jakub Jelinek, jh, H.J. Lu
> Can you please post a real-world example, where using %r15 would break
> existing code?
I used to run into problems like this when porting code to gcc from icc or VC.
A lot of hyper optimized inline assembler snippets wants to use all registers
and icc/VC support that. With gcc usually had to add some manual
push/pops. In older gcc versions usually more than one because reload
tended to error out otherwise.
So by try and error used the non fixed registers, but let the compiler know
about the others. This case would break.
In 64bit it was less a problem than on 32bit, but could still happen.
Admittedly medium is somewhat obscure and rarely used (and very slow),
but someone could be still using it.
-Andi
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-26 20:16 ` Andi Kleen
@ 2012-12-27 8:08 ` Uros Bizjak
2012-12-27 8:09 ` Uros Bizjak
2012-12-28 20:27 ` Richard Henderson
0 siblings, 2 replies; 17+ messages in thread
From: Uros Bizjak @ 2012-12-27 8:08 UTC (permalink / raw)
To: Andi Kleen
Cc: Mike Frysinger, gcc-patches, Leif Ekblad, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
On Wed, Dec 26, 2012 at 9:16 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> Can you please post a real-world example, where using %r15 would break
>> existing code?
>
> I used to run into problems like this when porting code to gcc from icc or VC.
> A lot of hyper optimized inline assembler snippets wants to use all registers
> and icc/VC support that. With gcc usually had to add some manual
> push/pops. In older gcc versions usually more than one because reload
> tended to error out otherwise.
>
> So by try and error used the non fixed registers, but let the compiler know
> about the others. This case would break.
>
> In 64bit it was less a problem than on 32bit, but could still happen.
>
> Admittedly medium is somewhat obscure and rarely used (and very slow),
> but someone could be still using it.
The alternative approach is changing cpuid definition in cpuid.h (as
in attached patch) to preserve %rbx, but we can't detect various code
model settings there. Since the change depends on the definition of
__PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
However, code that involves cpuid is rarely performance critical, so
perhaps we can live with this tradeoff.
IMO, this patch can be used on 4.7 branch, too.
Uros.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-27 8:08 ` Uros Bizjak
@ 2012-12-27 8:09 ` Uros Bizjak
2012-12-27 9:10 ` Florian Weimer
2012-12-28 20:27 ` Richard Henderson
1 sibling, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2012-12-27 8:09 UTC (permalink / raw)
To: Andi Kleen
Cc: Mike Frysinger, gcc-patches, Leif Ekblad, Richard Henderson,
Jakub Jelinek, jh, H.J. Lu
[-- Attachment #1: Type: text/plain, Size: 1377 bytes --]
On Thu, Dec 27, 2012 at 9:08 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Wed, Dec 26, 2012 at 9:16 PM, Andi Kleen <andi@firstfloor.org> wrote:
>>> Can you please post a real-world example, where using %r15 would break
>>> existing code?
>>
>> I used to run into problems like this when porting code to gcc from icc or VC.
>> A lot of hyper optimized inline assembler snippets wants to use all registers
>> and icc/VC support that. With gcc usually had to add some manual
>> push/pops. In older gcc versions usually more than one because reload
>> tended to error out otherwise.
>>
>> So by try and error used the non fixed registers, but let the compiler know
>> about the others. This case would break.
>>
>> In 64bit it was less a problem than on 32bit, but could still happen.
>>
>> Admittedly medium is somewhat obscure and rarely used (and very slow),
>> but someone could be still using it.
>
> The alternative approach is changing cpuid definition in cpuid.h (as
> in attached patch) to preserve %rbx, but we can't detect various code
> model settings there. Since the change depends on the definition of
> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
> However, code that involves cpuid is rarely performance critical, so
> perhaps we can live with this tradeoff.
>
> IMO, this patch can be used on 4.7 branch, too.
Now with the patch ...
Uros.
[-- Attachment #2: r.diff.txt --]
[-- Type: text/plain, Size: 1160 bytes --]
Index: i386/cpuid.h
===================================================================
--- i386/cpuid.h (revision 194723)
+++ i386/cpuid.h (working copy)
@@ -132,8 +132,9 @@
#define signature_VORTEX_ecx 0x436f5320
#define signature_VORTEX_edx 0x36387865
-#if defined(__i386__) && defined(__PIC__)
+#if defined(__PIC__)
/* %ebx may be the PIC register. */
+#if defined(__i386__)
#if __GNUC__ >= 3
#define __cpuid(level, a, b, c, d) \
__asm__ ("xchg{l}\t{%%}ebx, %1\n\t" \
@@ -165,6 +166,21 @@
: "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
: "0" (level), "2" (count))
#endif
+#elif defined(__x86_64__)
+#define __cpuid(level, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ : "0" (level))
+
+#define __cpuid_count(level, count, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ : "0" (level), "2" (count))
+#endif
#else
#define __cpuid(level, a, b, c, d) \
__asm__ ("cpuid\n\t" \
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-27 8:09 ` Uros Bizjak
@ 2012-12-27 9:10 ` Florian Weimer
2012-12-27 9:17 ` Uros Bizjak
0 siblings, 1 reply; 17+ messages in thread
From: Florian Weimer @ 2012-12-27 9:10 UTC (permalink / raw)
To: Uros Bizjak
Cc: Andi Kleen, Mike Frysinger, gcc-patches, Leif Ekblad,
Richard Henderson, Jakub Jelinek, jh, H.J. Lu
* Uros Bizjak:
> +#elif defined(__x86_64__)
> +#define __cpuid(level, a, b, c, d) \
> + __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
> + "cpuid\n\t" \
> + "xchg{q}\t{%%}rbx, %q1\n\t" \
> + : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
> + : "0" (level))
> +
> +#define __cpuid_count(level, count, a, b, c, d) \
> + __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
> + "cpuid\n\t" \
> + "xchg{q}\t{%%}rbx, %q1\n\t" \
> + : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
> + : "0" (level), "2" (count))
> +#endif
Shouldn't the constraint for b be "=&r"?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-27 9:10 ` Florian Weimer
@ 2012-12-27 9:17 ` Uros Bizjak
0 siblings, 0 replies; 17+ messages in thread
From: Uros Bizjak @ 2012-12-27 9:17 UTC (permalink / raw)
To: Florian Weimer
Cc: Andi Kleen, Mike Frysinger, gcc-patches, Leif Ekblad,
Richard Henderson, Jakub Jelinek, jh, H.J. Lu
On Thu, Dec 27, 2012 at 10:10 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Uros Bizjak:
>
>> +#elif defined(__x86_64__)
>> +#define __cpuid(level, a, b, c, d) \
>> + __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
>> + "cpuid\n\t" \
>> + "xchg{q}\t{%%}rbx, %q1\n\t" \
>> + : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
>> + : "0" (level))
>> +
>> +#define __cpuid_count(level, count, a, b, c, d) \
>> + __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
>> + "cpuid\n\t" \
>> + "xchg{q}\t{%%}rbx, %q1\n\t" \
>> + : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
>> + : "0" (level), "2" (count))
>> +#endif
>
> Shouldn't the constraint for b be "=&r"?
Technically yes, but all input operands are matched to outputs, so in
practice it doesn't really matter.
Uros.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-27 8:08 ` Uros Bizjak
2012-12-27 8:09 ` Uros Bizjak
@ 2012-12-28 20:27 ` Richard Henderson
2012-12-30 19:09 ` Uros Bizjak
1 sibling, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2012-12-28 20:27 UTC (permalink / raw)
To: Uros Bizjak
Cc: Andi Kleen, Mike Frysinger, gcc-patches, Leif Ekblad,
Jakub Jelinek, jh, H.J. Lu
On 12/27/2012 12:08 AM, Uros Bizjak wrote:
> The alternative approach is changing cpuid definition in cpuid.h (as
> in attached patch) to preserve %rbx, but we can't detect various code
> model settings there. Since the change depends on the definition of
> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
Certainly we can. We also control the preprocessor defines.
All that's needed is that we create one for the code model.
r~
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-28 20:27 ` Richard Henderson
@ 2012-12-30 19:09 ` Uros Bizjak
2013-01-04 21:38 ` Leif Ekblad
0 siblings, 1 reply; 17+ messages in thread
From: Uros Bizjak @ 2012-12-30 19:09 UTC (permalink / raw)
To: Richard Henderson
Cc: Andi Kleen, Mike Frysinger, gcc-patches, Leif Ekblad,
Jakub Jelinek, jh, H.J. Lu
[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]
On Fri, Dec 28, 2012 at 9:27 PM, Richard Henderson <rth@redhat.com> wrote:
> On 12/27/2012 12:08 AM, Uros Bizjak wrote:
>> The alternative approach is changing cpuid definition in cpuid.h (as
>> in attached patch) to preserve %rbx, but we can't detect various code
>> model settings there. Since the change depends on the definition of
>> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
>
> Certainly we can. We also control the preprocessor defines.
> All that's needed is that we create one for the code model.
Something like attached?
I have also included all suggestions (earlyclobber and operand prefix
on temporary register).
2012-12-30 Uros Bizjak <ubizjak@gmail.com>
PR target/55712
* config/i386/i386-c.c (ix86_target_macros_internal): Depending on
selected code model, define __code_mode_small__, __code_model_medium__,
__code_model_large__, __code_model_32__ or __code_model_kernel__.
* config/i386/cpuid.h (__cpuid, __cpuid_count) [__i386__]: Prefix
xchg temporary register with %k. Declare temporary register as
early clobbered.
[__x86_64__]: For medium and large code models, preserve %rbx register.
Tested on x86_64-pc-linux-gnu {,-m32}.
Uros.
[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 3419 bytes --]
Index: config/i386/cpuid.h
===================================================================
--- config/i386/cpuid.h (revision 194757)
+++ config/i386/cpuid.h (working copy)
@@ -136,35 +136,50 @@
/* %ebx may be the PIC register. */
#if __GNUC__ >= 3
#define __cpuid(level, a, b, c, d) \
- __asm__ ("xchg{l}\t{%%}ebx, %1\n\t" \
+ __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t" \
"cpuid\n\t" \
- "xchg{l}\t{%%}ebx, %1\n\t" \
- : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ "xchg{l}\t{%%}ebx, %k1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
: "0" (level))
#define __cpuid_count(level, count, a, b, c, d) \
- __asm__ ("xchg{l}\t{%%}ebx, %1\n\t" \
+ __asm__ ("xchg{l}\t{%%}ebx, %k1\n\t" \
"cpuid\n\t" \
- "xchg{l}\t{%%}ebx, %1\n\t" \
- : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ "xchg{l}\t{%%}ebx, %k1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
: "0" (level), "2" (count))
#else
/* Host GCCs older than 3.0 weren't supporting Intel asm syntax
nor alternatives in i386 code. */
#define __cpuid(level, a, b, c, d) \
- __asm__ ("xchgl\t%%ebx, %1\n\t" \
+ __asm__ ("xchgl\t%%ebx, %k1\n\t" \
"cpuid\n\t" \
- "xchgl\t%%ebx, %1\n\t" \
- : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ "xchgl\t%%ebx, %k1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
: "0" (level))
#define __cpuid_count(level, count, a, b, c, d) \
- __asm__ ("xchgl\t%%ebx, %1\n\t" \
+ __asm__ ("xchgl\t%%ebx, %k1\n\t" \
"cpuid\n\t" \
- "xchgl\t%%ebx, %1\n\t" \
- : "=a" (a), "=r" (b), "=c" (c), "=d" (d) \
+ "xchgl\t%%ebx, %k1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
: "0" (level), "2" (count))
#endif
+#elif defined(__x86_64__) && (defined(__code_model_medium__) || defined(__code_model_large__)) && defined(__PIC__)
+/* %ebx may be the PIC register. */
+#define __cpuid(level, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
+ : "0" (level))
+
+#define __cpuid_count(level, count, a, b, c, d) \
+ __asm__ ("xchg{q}\t{%%}rbx, %q1\n\t" \
+ "cpuid\n\t" \
+ "xchg{q}\t{%%}rbx, %q1\n\t" \
+ : "=a" (a), "=&r" (b), "=c" (c), "=d" (d) \
+ : "0" (level), "2" (count))
#else
#define __cpuid(level, a, b, c, d) \
__asm__ ("cpuid\n\t" \
Index: config/i386/i386-c.c
===================================================================
--- config/i386/i386-c.c (revision 194757)
+++ config/i386/i386-c.c (working copy)
@@ -243,6 +243,30 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_fla
break;
}
+ switch (ix86_cmodel)
+ {
+ case CM_SMALL:
+ case CM_SMALL_PIC:
+ def_or_undef (parse_in, "__code_model_small__");
+ break;
+ case CM_MEDIUM:
+ case CM_MEDIUM_PIC:
+ def_or_undef (parse_in, "__code_model_medium__");
+ break;
+ case CM_LARGE:
+ case CM_LARGE_PIC:
+ def_or_undef (parse_in, "__code_model_large__");
+ break;
+ case CM_32:
+ def_or_undef (parse_in, "__code_model_32__");
+ break;
+ case CM_KERNEL:
+ def_or_undef (parse_in, "__code_model_kernel__");
+ break;
+ default:
+ ;
+ }
+
if (isa_flag & OPTION_MASK_ISA_MMX)
def_or_undef (parse_in, "__MMX__");
if (isa_flag & OPTION_MASK_ISA_3DNOW)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2012-12-30 19:09 ` Uros Bizjak
@ 2013-01-04 21:38 ` Leif Ekblad
2013-01-04 21:42 ` Jakub Jelinek
0 siblings, 1 reply; 17+ messages in thread
From: Leif Ekblad @ 2013-01-04 21:38 UTC (permalink / raw)
To: Uros Bizjak, Richard Henderson
Cc: Andi Kleen, Mike Frysinger, gcc-patches, Jakub Jelinek, jh, H.J. Lu
I just tried the patch, but it seems to produce some problems for me. The
other patch which used a 64-bit specific register (r15) instead of rbx was
easier to adapt to. The problem for me is that syscalls might clobber
higher-half of all 32-bit registers, and that I cannot use the stack to
preserve rbx in the call because of the red-zone.
Problem code:
#define RdosUserGateEdiEcxPar0RetEbx(nr, rdi, rcx, size, res) do { \
register int _id asm("r14") = nr; \
register typeof(rdi) _rdi asm("rdi") = (rdi); \
register typeof(rcx) _rcx asm("r8") = (rcx); \
register int _size asm("r12") = (size); \
asm volatile ( \
"syscall\n\t" \
"jc 1f\n\t" \
"movzx %%bx,%%rax\n\t" \
"jmp 2f\n\t" \
"1: \n\t" \
"xorq %%rax,%%rax\n\t" \
"2: \n\t" \
: "=a" (res) : "r" (_id), "r" (_rdi), "r" (_rcx), "r" (_size) : "rbx",
"rdx", "rsi" \
); \
} while(0);
inline volatile int RdosCreateFile(const char *FileName, int Attrib)
{
int res;
int size = strlen(FileName) + 1;
RdosUserGateEdiEcxPar0RetEbx(usergate_create_file, FileName, Attrib,
size, res);
return res;
}
Error log:
$ rdos-gcc test.c -o test.exe
In file included from /usr/local/rdos/include/rdos.h:719:0,
from test.c:1:
/usr/local/rdos/include/rdosgcc.h: In function 'main':
/usr/local/rdos/include/rdosgcc.h:236:5: error: PIC register clobbered by
'rbx' in 'asm'
RdosUserGateEdiEcxPar0RetEbx(usergate_open_file, FileName, Access, size,
res);
^
I would prefer to change PIC register to r15 instead, or alternatively make
this an target-option.
Regards,
Leif Ekblad
----- Original Message -----
From: "Uros Bizjak" <ubizjak@gmail.com>
To: "Richard Henderson" <rth@redhat.com>
Cc: "Andi Kleen" <andi@firstfloor.org>; "Mike Frysinger"
<vapier@gentoo.org>; <gcc-patches@gcc.gnu.org>; "Leif Ekblad"
<leif@rdos.net>; "Jakub Jelinek" <jakub@redhat.com>; <jh@suse.cz>; "H.J. Lu"
<hjl.tools@gmail.com>
Sent: Sunday, December 30, 2012 8:08 PM
Subject: Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on
x86_64
> On Fri, Dec 28, 2012 at 9:27 PM, Richard Henderson <rth@redhat.com> wrote:
>> On 12/27/2012 12:08 AM, Uros Bizjak wrote:
>>> The alternative approach is changing cpuid definition in cpuid.h (as
>>> in attached patch) to preserve %rbx, but we can't detect various code
>>> model settings there. Since the change depends on the definition of
>>> __PIC__, we unnecessary preserve %rbx also for -mcmodel=small.
>>
>> Certainly we can. We also control the preprocessor defines.
>> All that's needed is that we create one for the code model.
>
> Something like attached?
>
> I have also included all suggestions (earlyclobber and operand prefix
> on temporary register).
>
> 2012-12-30 Uros Bizjak <ubizjak@gmail.com>
>
> PR target/55712
> * config/i386/i386-c.c (ix86_target_macros_internal): Depending on
> selected code model, define __code_mode_small__, __code_model_medium__,
> __code_model_large__, __code_model_32__ or __code_model_kernel__.
> * config/i386/cpuid.h (__cpuid, __cpuid_count) [__i386__]: Prefix
> xchg temporary register with %k. Declare temporary register as
> early clobbered.
> [__x86_64__]: For medium and large code models, preserve %rbx register.
>
> Tested on x86_64-pc-linux-gnu {,-m32}.
>
> Uros.
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2013-01-04 21:38 ` Leif Ekblad
@ 2013-01-04 21:42 ` Jakub Jelinek
2013-01-04 22:31 ` Leif Ekblad
0 siblings, 1 reply; 17+ messages in thread
From: Jakub Jelinek @ 2013-01-04 21:42 UTC (permalink / raw)
To: Leif Ekblad
Cc: Uros Bizjak, Richard Henderson, Andi Kleen, Mike Frysinger,
gcc-patches, jh, H.J. Lu
On Fri, Jan 04, 2013 at 10:39:05PM +0100, Leif Ekblad wrote:
> I just tried the patch, but it seems to produce some problems for
> me. The other patch which used a 64-bit specific register (r15)
> instead of rbx was easier to adapt to. The problem for me is that
> syscalls might clobber higher-half of all 32-bit registers, and that
> I cannot use the stack to preserve rbx in the call because of the
> red-zone.
Of course you can use the stack, just subtract the red zone size (plus
whatever you need) from %rsp first, then save to stack, do syscall,
then restore from stack and finally increment %rsp back.
Jakub
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64
2013-01-04 21:42 ` Jakub Jelinek
@ 2013-01-04 22:31 ` Leif Ekblad
0 siblings, 0 replies; 17+ messages in thread
From: Leif Ekblad @ 2013-01-04 22:31 UTC (permalink / raw)
To: Jakub Jelinek
Cc: Uros Bizjak, Richard Henderson, Andi Kleen, Mike Frysinger,
gcc-patches, jh, H.J. Lu
OK. I know I can do that, but I would need to do it for every syscall since
every syscall can potentially clobber rbx.
Also, it is very strange that it is only one of the inlines the compiler
complains about. I have another inline (which uses rbx as input), but that
doesn't generate any error:
#define RdosUserGateEbxEdiEcxParRetEax(nr, rbx, rdi, rcx, res) do { \
register int _id asm("r14") = nr; \
register typeof(rbx) _rbx asm("rbx") = (rbx); \
register typeof(rdi) _rdi asm("rdi") = (rdi); \
register typeof(rcx) _rcx asm("r8") = (rcx); \
register int _size asm("r12") = (rcx); \
asm volatile ( \
"syscall\n\t" \
"jnc 1f\n\t" \
"xorq %%rax,%%rax\n\t" \
"1: \n\t" \
: "=a" (res) : "r" (_id), "r" (_rbx), "r" (_rdi), "r" (_rcx), "r"
(_size) : "rdx", "rsi" \
); \
} while(0);
inline int RdosWriteFile(int Handle, void *Buf, int Size)
{
int res;
RdosUserGateEbxEdiEcxParRetEax(usergate_read_file, Handle, Buf, Size,
res);
return res;
}
Why is not this inline causing the problem? Might that be because it will
not use rbx, but instead another register? OTOH, the code seems to work and
rbx is not assigned to PIC at the point of the call, but to the defined
parameter.
Regards,
Leif Ekblad
----- Original Message -----
From: "Jakub Jelinek" <jakub@redhat.com>
To: "Leif Ekblad" <leif@rdos.net>
Cc: "Uros Bizjak" <ubizjak@gmail.com>; "Richard Henderson" <rth@redhat.com>;
"Andi Kleen" <andi@firstfloor.org>; "Mike Frysinger" <vapier@gentoo.org>;
<gcc-patches@gcc.gnu.org>; <jh@suse.cz>; "H.J. Lu" <hjl.tools@gmail.com>
Sent: Friday, January 04, 2013 10:42 PM
Subject: Re: [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on
x86_64
> On Fri, Jan 04, 2013 at 10:39:05PM +0100, Leif Ekblad wrote:
>> I just tried the patch, but it seems to produce some problems for
>> me. The other patch which used a 64-bit specific register (r15)
>> instead of rbx was easier to adapt to. The problem for me is that
>> syscalls might clobber higher-half of all 32-bit registers, and that
>> I cannot use the stack to preserve rbx in the call because of the
>> red-zone.
>
> Of course you can use the stack, just subtract the red zone size (plus
> whatever you need) from %rsp first, then save to stack, do syscall,
> then restore from stack and finally increment %rsp back.
>
> Jakub
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-01-04 22:31 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-24 15:40 [RFC PATCH, i386]: Use %r15 for REAL_PIC_OFFSET_TABLE_REGNUM on x86_64 Uros Bizjak
2012-12-24 21:32 ` Andi Kleen
2012-12-24 22:26 ` Leif Ekblad
2012-12-25 5:54 ` Mike Frysinger
2012-12-25 19:12 ` Uros Bizjak
2012-12-25 19:27 ` Mike Frysinger
2012-12-26 9:53 ` Uros Bizjak
2012-12-26 20:16 ` Andi Kleen
2012-12-27 8:08 ` Uros Bizjak
2012-12-27 8:09 ` Uros Bizjak
2012-12-27 9:10 ` Florian Weimer
2012-12-27 9:17 ` Uros Bizjak
2012-12-28 20:27 ` Richard Henderson
2012-12-30 19:09 ` Uros Bizjak
2013-01-04 21:38 ` Leif Ekblad
2013-01-04 21:42 ` Jakub Jelinek
2013-01-04 22:31 ` Leif Ekblad
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).