public inbox for gcc@gcc.gnu.org
* An unusual x86_64 code model
@ 2011-08-09 23:26 Jed Davis
  2011-08-09 23:58 ` Andrew Pinski
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Jed Davis @ 2011-08-09 23:26 UTC
  To: gcc

The vSphere Hypervisor (ESXi) kernel runs on x86_64 and loads all text
and data sections (for the kernel itself and for modules) within a 2GB
window that lives around virtual address 0x418000000000 (65.5 TiB). 
Thus, 32-bit absolute addresses won't work, but %rip-relative addressing
is fine.  Additionally, because this is a kernel, the usual issues of
"shared text" that discourage text relocations are inapplicable.
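
To make the addressing constraint concrete, here is a small C sketch (an illustration, not code from ESXi or GCC).  It checks whether a value fits a sign-extended 32-bit field and whether one address can be reached from another with a 32-bit %rip-relative displacement.  The helper names and the sample offsets are assumptions chosen for the example; only the base address 0x418000000000 comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Approximate base of the 2GB window quoted in the message. */
#define WINDOW_BASE UINT64_C(0x418000000000)

/* True if v can be encoded as a sign-extended 32-bit immediate
   or displacement. */
bool fits_simm32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* True if `target` is reachable with a 32-bit %rip-relative
   displacement from an instruction whose successor starts at
   `next_ip` (hypothetical helper, named for this sketch only). */
bool fits_rip_rel32(uint64_t next_ip, uint64_t target)
{
    return fits_simm32((int64_t)(target - next_ip));
}
```

With these helpers, fits_simm32((int64_t)WINDOW_BASE) is false, which is why absolute 32-bit addressing is ruled out, while fits_rip_rel32() holds for any two addresses inside the same 2GB window.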

What this means in terms of GCC is: the usual small code model won't
work, nor -mcmodel=kernel, because they assume signed 32-bit addresses.
The large code model probably will work, but that turns everything into
movabs and indirect calls, which is unnecessarily inefficient.  The
closest approximation is -fPIC or -fPIE, but that assumes we want to
implement the PLT/GOT machinery in our loader, which we don't; it imposes
overhead for no benefit.

The existing workaround, which predates my personal involvement, is to
use -fPIE together with a -include'd file that uses a #pragma to set the
default symbol visibility to hidden, which suppresses the PLTness.
That works on GCC 4.1, but with newer versions that no longer affects
implicitly declared functions (which turn up occasionally in third-party
drivers), or coverage instrumentation's calls to __gcov_init, or probably
other things that have not yet been discovered.  Also, it was never an
ideal solution, except in that it didn't require modifying the compiler
(at the time).

Thus, I'm trying to find the right solution.  My current attempt is to
add an -mno-plt flag in i386.opt, and add it to the list of reasons not
to print "@PLT" after symbol names.  This seems to work, although I've
only done minimal testing so far.

But is that the right way to do that, do people think?  Or should I
look into making this its own -mcmodel option?  (Which would raise the
question of what to call it -- medsmall? smallhigh? altkernel?)  Or is
there some other way that this ought to be done?

Thanks,
--Jed

* Re: An unusual x86_64 code model
  2011-08-09 23:26 An unusual x86_64 code model Jed Davis
@ 2011-08-09 23:58 ` Andrew Pinski
  2011-08-12 22:02   ` Jed Davis
  2011-08-11 18:30 ` Andi Kleen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Andrew Pinski @ 2011-08-09 23:58 UTC
  To: Jed Davis; +Cc: gcc

On Tue, Aug 9, 2011 at 4:26 PM, Jed Davis <jedidiah@vmware.com> wrote:
> The existing workaround, which predates my personal involvement, is to
> use -fPIE together with a -include'd file that uses a #pragma to set the
> default symbol visibility to hidden, which suppresses the PLTness.
> That works on GCC 4.1, but with newer versions that no longer affects
> implicitly declared functions (which turn up occasionally in third-party
> drivers), or coverage instrumentation's calls to __gcov_init, or probably
> other things that have not yet been discovered.  Also, it was never an
> ideal solution, except in that it didn't require modifying the compiler
> (at the time).

Have you tried the -fvisibility=hidden option?

Thanks,
Andrew Pinski

* Re: An unusual x86_64 code model
  2011-08-09 23:26 An unusual x86_64 code model Jed Davis
  2011-08-09 23:58 ` Andrew Pinski
@ 2011-08-11 18:30 ` Andi Kleen
  2011-08-12 23:53 ` Jed Davis
  2011-08-17 23:20 ` Jed Davis
  3 siblings, 0 replies; 8+ messages in thread
From: Andi Kleen @ 2011-08-11 18:30 UTC
  To: Jed Davis; +Cc: gcc

Jed Davis <jedidiah@vmware.com> writes:
>
> But is that the right way to do that, do people think?  Or should I
> look into making this its own -mcmodel option?  (Which would raise the

I would make it a new -mcmodel=... option.

> question of what to call it -- medsmall? smallhigh? altkernel?)  Or is

smallhigh sounds reasonable to me.

-Andi


-- 
ak@linux.intel.com -- Speaking for myself only

* Re: An unusual x86_64 code model
  2011-08-09 23:58 ` Andrew Pinski
@ 2011-08-12 22:02   ` Jed Davis
  0 siblings, 0 replies; 8+ messages in thread
From: Jed Davis @ 2011-08-12 22:02 UTC
  To: Andrew Pinski; +Cc: gcc

On Tue, Aug 09, 2011 at 04:58:01PM -0700, Andrew Pinski wrote:
> On Tue, Aug 9, 2011 at 4:26 PM, Jed Davis <jedidiah@vmware.com> wrote:
> > The existing workaround, which predates my personal involvement, is to
> > use -fPIE together with a -include'd file that uses a #pragma to set the
> > default symbol visibility to hidden, which suppresses the PLTness.
> > That works on GCC 4.1, but with newer versions that no longer affects
> > implicitly declared functions (which turn up occasionally in third-party
> > drivers), or coverage instrumentation's calls to __gcov_init, or probably
> > other things that have not yet been discovered.  Also, it was never an
> > ideal solution, except in that it didn't require modifying the compiler
> > (at the time).
> 
> Have you tried the -fvisibility=hidden option?

Sadly, that doesn't work:

$ cat test.c
int baz();
int quux() { return baz(); }
int foo() { return bar() + quux(); }
$ gcc -fprofile-arcs -fvisibility=hidden -fPIE -S test.c
$ grep call test.s
        call    baz@PLT
        call    bar@PLT
        call    quux
        call    __gcov_init@PLT

The fine manual states that "extern declarations are not affected by
-fvisibility", so that result is expected.

Adding "#pragma GCC visibility push(hidden)" also takes care of baz
(declared and extern), but not bar (implicit) or __gcov_init (very
implicit).
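
For reference, a sketch of the pragma applied to a cut-down version of test.c.  This is an illustration rather than the exact file used above: the prototypes are written out in full, bar is omitted because recent compilers reject implicit declarations, and baz is given a stand-in definition at the bottom only so that the sketch links on its own (in the scenario above it would live in another translation unit).

```c
/* Declarations between push and pop get hidden visibility.  Unlike
   -fvisibility=hidden, this also covers extern declarations. */
#pragma GCC visibility push(hidden)
int baz(void);
int quux(void);
#pragma GCC visibility pop

/* quux keeps the hidden visibility of its earlier declaration. */
int quux(void) { return baz(); }

/* Stand-in definition so that the sketch is self-contained. */
int baz(void) { return 7; }
```

The pragma only reaches declarations that appear in the source, which is consistent with calls the compiler generates itself (such as __gcov_init) being unaffected.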

--Jed

* Re: An unusual x86_64 code model
  2011-08-09 23:26 An unusual x86_64 code model Jed Davis
  2011-08-09 23:58 ` Andrew Pinski
  2011-08-11 18:30 ` Andi Kleen
@ 2011-08-12 23:53 ` Jed Davis
  2011-08-17 23:20 ` Jed Davis
  3 siblings, 0 replies; 8+ messages in thread
From: Jed Davis @ 2011-08-12 23:53 UTC
  To: gcc

On Tue, Aug 09, 2011 at 04:26:06PM -0700, Jed Davis wrote:
> Thus, I'm trying to find the right solution.  My current attempt is to
> add an -mno-plt flag in i386.opt, and add it to the list of reasons not
> to print "@PLT" after symbol names.  This seems to work, although I've
> only done minimal testing so far.

Emphasis on "minimal"; a reference to the address of an extern variable
yields a @GOTPCREL.  So that wasn't going to work in any case.  The more
of i386.c I read, the more I realize that the resemblance to CM_SMALL_PIC
was mostly coincidental.

--Jed

* Re: An unusual x86_64 code model
  2011-08-09 23:26 An unusual x86_64 code model Jed Davis
                   ` (2 preceding siblings ...)
  2011-08-12 23:53 ` Jed Davis
@ 2011-08-17 23:20 ` Jed Davis
  2011-08-18 13:37   ` Michael Matz
  3 siblings, 1 reply; 8+ messages in thread
From: Jed Davis @ 2011-08-17 23:20 UTC
  To: gcc

Second attempt: I now have a modified GCC 4.4.3 which recognizes
-mcmodel=smallhigh; in CM_SMALLHIGH, pic_32bit_operand acts as it does
for PIC (to get lea instead of movabs), and legitimate_address_p accepts
SYMBOLIC_CONSTs with no indexing (for anything with a memory constraint).
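
As a rough illustration of why lea is preferable to movabs here, the following C sketch (not part of the patch) encodes both ways of loading a symbol address into %rax.  The function names are hypothetical; the opcode bytes are the standard x86-64 encodings of the two instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Store the low `nbytes` bytes of v in little-endian order. */
static void put_le(uint8_t *out, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* lea disp32(%rip), %rax
   REX.W, opcode 8D, ModRM 0x05 (mod=00, reg=rax, rm=101). */
size_t encode_lea_rip_rax(uint8_t *out, int32_t disp)
{
    out[0] = 0x48; out[1] = 0x8D; out[2] = 0x05;
    put_le(out + 3, (uint32_t)disp, 4);
    return 7;
}

/* movabs $imm64, %rax
   REX.W, opcode B8 (register rax), then a 64-bit immediate. */
size_t encode_movabs_rax(uint8_t *out, uint64_t imm)
{
    out[0] = 0x48; out[1] = 0xB8;
    put_le(out + 2, imm, 8);
    return 10;
}
```

The lea form is 7 bytes and needs only a 32-bit PC-relative field, whereas movabs is 10 bytes and carries the full 64-bit absolute address.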

Beyond that, the defaults (i.e., what happens if the code model isn't
any of the previously defined ones) happen to be more or less what I
want.  In particular, the operand printer chooses %rip-relative mode
over plain disp32 for any symbolic displacement in all the smallish
modes; this is commented as if it were just a space optimization
(avoiding the SIB byte), but in fact it's necessary for -fPIC to work,
so I think I can depend on it.
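
The SIB-byte remark can be checked by hand.  In 64-bit mode, the ModRM form that meant "disp32 only" in 32-bit mode is interpreted as %rip-relative, so a plain absolute disp32 needs an extra SIB byte.  The C sketch below (an illustration with hypothetical function names, not code from i386.c) encodes a 32-bit load into %eax both ways.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian order. */
static void put_le32(uint8_t *out, uint32_t v)
{
    for (size_t i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* movl disp32(%rip), %eax
   opcode 8B, ModRM 0x05 (mod=00, rm=101: %rip-relative). */
size_t encode_load_rip_rel(uint8_t *out, int32_t disp)
{
    out[0] = 0x8B; out[1] = 0x05;
    put_le32(out + 2, (uint32_t)disp);
    return 6;
}

/* movl abs32, %eax
   opcode 8B, ModRM 0x04 (rm=100: SIB follows),
   SIB 0x25 (no index, no base, disp32 only). */
size_t encode_load_abs32(uint8_t *out, int32_t addr)
{
    out[0] = 0x8B; out[1] = 0x04; out[2] = 0x25;
    put_le32(out + 3, (uint32_t)addr);
    return 7;
}
```

The %rip-relative form is one byte shorter, which matches the "space optimization" comment, and it is also the only one of the two that can reach a symbol outside the sign-extended 32-bit range.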

One thing I'm not so sure about is accepting any SYMBOLIC_CONST as a
legitimate address.  That allows, for example, a symbol address cast
to uintptr_t and added to (6ULL << 32), which will never fit.  On the
other hand, -fPIC allows offsets of up to +/- 16MiB for some unexplained
reason, meaning that I can break it by pushing the code+data size almost
to 2GiB with a large .bss and evaluating (uintptr_t)&_end+0xffffff.

I think I could try to fix that by interrogating the SYMBOL_REF_DECL for
the object's size, but given that -fPIC doesn't go that far, it's not
clear that I need to.

Thoughts?

Also, it may actually work now.  I've successfully bootstrapped with
BOOT_CFLAGS='-mcmodel=smallhigh -O2 -g', after which the only _32 or
_32S relocations in the gcc/ subdirectory of the objdir were either
.debug references that I assume are safe, or in crtstuff that's part of
the libgcc build and not affected by BOOT_CFLAGS.  (It also successfully
builds ESXi with kernel coverage enabled, but that's less informative
for people on this list.)

Once our lawyers approve, I can also send the actual diff; by my count,
it makes nontrivial changes to a whole seven lines of code.

--Jed

* Re: An unusual x86_64 code model
  2011-08-17 23:20 ` Jed Davis
@ 2011-08-18 13:37   ` Michael Matz
  2011-08-23 18:56     ` Jed Davis
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Matz @ 2011-08-18 13:37 UTC
  To: Jed Davis; +Cc: gcc

Hi,

On Wed, 17 Aug 2011, Jed Davis wrote:

> One thing I'm not so sure about is accepting any SYMBOLIC_CONST as a
> legitimate address.  That allows, for example, a symbol address cast
> to uintptr_t and added to (6ULL << 32), which will never fit.  On the
> other hand, -fPIC allows offsets of up to +/- 16MiB for some unexplained
> reason,

The x86-64 ABI specifies this.  All symbols have to be located between 0x0 
and 2^31-2^24-1, and that is so that everything in memory objects of 
length less than 2^24 can be addressed directly.  Otherwise only the base 
address of symbols would be addressable directly and any offset variant 
would have to be calculated explicitly.  If it weren't for this 
provision, given this code:

char arr[4096];   /* global array */
char f (void) { return arr[2]; }

the load couldn't use arr+2 directly as that possibly might not fit into 
32 bit anymore.  Similar things are true for the small PIC models 
including your new one.  That is, as long as symbols are always at most 
2^31-2^24-1 away from all ends of referring instructions you can happily 
accept offsets between +-2^24.
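
The arithmetic behind that bound can be sketched in C.  This is an illustration of the reasoning above, not code from GCC or the ABI document; the macro and function names are made up for the example, and the last case uses a hypothetical symbol 0x7fffef00 bytes away, standing in for an _end near the top of an almost-2GiB image.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest allowed distance between a reference and a symbol. */
#define MAX_SYMBOL_DISTANCE ((INT64_C(1) << 31) - (INT64_C(1) << 24) - 1)
/* Largest offset that may be folded into the displacement. */
#define MAX_OFFSET (INT64_C(1) << 24)

/* True if distance + offset is encodable as a signed 32-bit
   displacement. */
bool disp32_ok(int64_t distance, int64_t offset)
{
    int64_t disp = distance + offset;
    return disp >= INT32_MIN && disp <= INT32_MAX;
}

/* True if (distance, offset) meets the precondition described above. */
bool within_abi_limits(int64_t distance, int64_t offset)
{
    return distance >= -MAX_SYMBOL_DISTANCE
        && distance <= MAX_SYMBOL_DISTANCE
        && offset >= -MAX_OFFSET
        && offset <= MAX_OFFSET;
}
```

MAX_SYMBOL_DISTANCE + MAX_OFFSET is exactly 2^31-1, so any pair accepted by within_abi_limits() also passes disp32_ok().  The pair (0x7fffef00, 0xffffff) fails both checks: the symbol is already beyond the distance limit, so the guarantee does not cover it.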


Ciao,
Michael.

* Re: An unusual x86_64 code model
  2011-08-18 13:37   ` Michael Matz
@ 2011-08-23 18:56     ` Jed Davis
  0 siblings, 0 replies; 8+ messages in thread
From: Jed Davis @ 2011-08-23 18:56 UTC
  To: Michael Matz; +Cc: gcc

On Thu, Aug 18, 2011 at 03:37:15PM +0200, Michael Matz wrote:
> On Wed, 17 Aug 2011, Jed Davis wrote:
> 
> > One thing I'm not so sure about is accepting any SYMBOLIC_CONST as a
> > legitimate address.  That allows, for example, a symbol address cast
> > to uintptr_t and added to (6ULL << 32), which will never fit.  On the
> > other hand, -fPIC allows offsets of up to +/- 16MiB for some unexplained
> > reason,
> 
> The x86-64 ABI specifies this.  All symbols have to be located between 0x0 
> and 2^31-2^24-1, and that is so that everything in memory objects of 
> length less than 2^24 can be addressed directly.

Oh, of course.  For some reason I went through the ELF spec, but
didn't think to see what the x86_64 ABI had to say about code models.
Everything makes much more sense now.  Thanks for the pointer.

> Otherwise only the base address of symbols would be addressable
> directly and any offset variant would have to be calculated
> explicitly.

Right; that's what I was trying to avoid doing.

It looks like I can reuse legitimate_pic_address_disp_p for this; this
is not quite PIC, but the same set of non-immediate displacement-only
addresses is usable in general operands.

Except that then pic_32bit_operand does the wrong thing, because actual
PIC has hooks in the MI recog.c that affect the constraints (I think?),
and I don't.  But... what "pic_32bit_operand" actually means is "can I
use LEA to obtain this value?", and anything that's a legitimate address
in strict RTL can be LEA'ed.  So that takes care of that.  Back to
testing, I guess.

--Jed
