public inbox for gnu-gabi@sourceware.org
* RFC: Update x86 psABIs to support IBT
@ 2017-01-01  0:00 H.J. Lu
  2019-01-01  0:00 ` H.J. Lu
  0 siblings, 1 reply; 5+ messages in thread
From: H.J. Lu @ 2017-01-01  0:00 UTC (permalink / raw)
  To: IA32 System V Application Binary Interface, x86-64-abi, gnu-gabi

On Tue, Jun 13, 2017 at 12:11 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> To support ENDBR in Intel Control-flow Enforcement Technology (CET)
> instructions:
>
> https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
>
> following changes to i386 psABI are required.

Here is the updated extension for both i386 and x86-64 psABI to
support IBT.  I will post a binutils patch later.

Any comments?

-- 
H.J.
---
To support indirect branch tracking (IBT) in Intel Control-flow Enforcement
Technology (CET) instructions:

https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

following changes to x86 psABI are required.

To program properties, add

#define GNU_PROPERTY_X86_FEATURE_1_AND 0xc0000002

#define GNU_PROPERTY_X86_FEATURE_1_IBT (1U << 0)

to indicate that all executable sections are compatible with IBT when
an ENDBR instruction is inserted at:

   a. All function entries whose addresses may be taken.
   b. All branch targets whose addresses have been taken.

GNU_PROPERTY_X86_FEATURE_1_IBT is set on output only if it is set on
all relocatable inputs, which means that the C library must be compiled
with an IBT-enabled compiler.
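
The "set on output only if it is set on all relocatable inputs" rule amounts to a bitwise AND over the inputs. A minimal sketch, assuming a hypothetical helper merge_feature_1_and() that receives each input's feature word (0 for an input with no such property); this is an illustration only, not the binutils implementation:

```c
#include <stddef.h>
#include <stdint.h>

#define GNU_PROPERTY_X86_FEATURE_1_AND 0xc0000002
#define GNU_PROPERTY_X86_FEATURE_1_IBT (1U << 0)

/* Hypothetical helper: features[i] holds the GNU_PROPERTY_X86_FEATURE_1_AND
   bits of relocatable input i, or 0 if that input has no such property. */
static uint32_t merge_feature_1_and(const uint32_t *features, size_t n)
{
    if (n == 0)
        return 0;                /* no inputs: claim nothing */

    uint32_t merged = ~0U;
    for (size_t i = 0; i < n; i++)
        merged &= features[i];   /* a bit survives only if every input sets it */
    return merged;
}
```

A single input without the IBT bit, for example an object from a C library built without IBT support, clears the bit in the output.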

The following changes are made to the Procedure Linkage Table (PLT) to
enable IBT:

1. For 64-bit x86-64,  PLT is changed to:

PLT0:  push       GOT[1]
       bnd jmp    *GOT[2]
       nop
...
PLTn:  endbr64
       push       namen_reloc_index
       bnd jmp    PLT0

together with the second PLT section:

PLTn:  endbr64
       bnd jmp   *GOT[namen_index]
       nop

BND prefix is also added so that IBT-enabled PLT is compatible with MPX.

2. For 32-bit x86-64 (x32) and i386,  PLT is changed to:

PLT0:  push       GOT[1]
       jmp        *GOT[2]
       nop
...
PLTn:  endbr64                                 # endbr32 for i386.
       push       namen_reloc_index
       jmp        PLT0

together with the second PLT section:

PLTn:  endbr64                                 # endbr32 for i386.
       jmp       *GOT[namen_index]
       nop

BND prefix isn't used since MPX isn't supported on x32 and BND registers
aren't used in parameter passing on i386.

GOT is an array of addresses.  Initially, GOT[namen_index] is filled
with the address of the ENDBR instruction of the corresponding entry
in the first PLT section.  The function, namen, is called via the
ENDBR instruction in the second PLT entry.  GOT[namen_index] is updated
to the actual address of the function, namen, at run-time.
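
The lazy-binding flow described above can be modeled in plain C with a function pointer standing in for GOT[namen_index]. This is only a sketch of the control flow, with hypothetical names (got_slot, lazy_stub, call_namen, real_namen); the real PLT is assembly and the real resolver is the dynamic linker:

```c
typedef int (*fn_t)(int);

/* Stands in for the actual function, namen. */
static int real_namen(int x) { return x + 1; }

static int lazy_stub(int x);

/* Stands in for GOT[namen_index]: initially it points at the entry in the
   first PLT section, which triggers symbol resolution. */
static fn_t got_slot = lazy_stub;

/* Counts how often the slow path runs (illustration only). */
static int resolver_calls;

/* Stands in for the first PLT entry plus PLT0 and the dynamic linker. */
static int lazy_stub(int x)
{
    resolver_calls++;
    got_slot = real_namen;   /* GOT is updated to the actual address */
    return real_namen(x);
}

/* Stands in for the second PLT entry: jmp *GOT[namen_index]. */
static int call_namen(int x)
{
    return got_slot(x);
}
```

Calling call_namen() twice takes the slow path only once; the second call goes straight through the updated slot.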

Load-time processing

On an IBT-capable processor, the following steps should be taken:

1. When loading an executable, if GNU_PROPERTY_X86_FEATURE_1_IBT is
set on the executable, enable IBT.
2. If IBT is enabled, when loading a shared object without
GNU_PROPERTY_X86_FEATURE_1_IBT:
  a. If legacy interwork is allowed, then mark all pages in executable
     PT_LOAD segments in the legacy code page bitmap.  Failure of legacy
     code page bitmap allocation causes an error.
  b. If legacy interwork isn't allowed, it causes an error.
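
The load-time steps can be read as a small decision function. The sketch below assumes hypothetical names (enum ibt_action, enable_ibt_for_executable(), check_shared_object()) and takes the policy and allocation results as plain booleans; it is not how any particular dynamic linker implements this:

```c
#include <stdbool.h>
#include <stdint.h>

#define GNU_PROPERTY_X86_FEATURE_1_IBT (1U << 0)

/* Hypothetical outcome codes, not part of the proposal. */
enum ibt_action {
    IBT_NO_ACTION,          /* nothing special to do for this object */
    IBT_MARK_LEGACY_PAGES,  /* mark executable PT_LOAD pages in the bitmap */
    IBT_LOAD_ERROR          /* refuse to load the shared object */
};

/* Step 1: enable IBT only on an IBT-capable processor and only if the
   executable carries GNU_PROPERTY_X86_FEATURE_1_IBT. */
static bool enable_ibt_for_executable(bool cpu_has_ibt, uint32_t exe_feature_1)
{
    return cpu_has_ibt
           && (exe_feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0;
}

/* Step 2: decide what to do with a shared object once IBT is enabled. */
static enum ibt_action
check_shared_object(bool ibt_enabled, uint32_t so_feature_1,
                    bool legacy_interwork_allowed, bool bitmap_allocated)
{
    if (!ibt_enabled
        || (so_feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0)
        return IBT_NO_ACTION;       /* IBT off, or object is IBT-compatible */
    if (!legacy_interwork_allowed)
        return IBT_LOAD_ERROR;      /* step 2b */
    if (!bitmap_allocated)
        return IBT_LOAD_ERROR;      /* step 2a: bitmap allocation failed */
    return IBT_MARK_LEGACY_PAGES;   /* step 2a */
}
```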


* Re: RFC: Update x86 psABIs to support IBT
  2017-01-01  0:00 RFC: Update x86 psABIs to support IBT H.J. Lu
@ 2019-01-01  0:00 ` H.J. Lu
       [not found]   ` <CAJENXgtX3Foh2gAHt4yxOg92rSSnf9-WjBNn3fn3M3tgbDKpEw@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: H.J. Lu @ 2019-01-01  0:00 UTC (permalink / raw)
  To: IA32 System V Application Binary Interface, x86-64-abi, gnu-gabi

On Tue, Jun 20, 2017 at 9:38 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Jun 13, 2017 at 12:11 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > To support ENDBR in Intel Control-flow Enforcement Technology (CET)
> > instructions:
> >
> > https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
> >
> > following changes to i386 psABI are required.
>
> Here is the updated extension for both i386 and x86-64 psABI to
> support IBT.  I will post a binutils patch later.
>
> Any comments?
>
> --
> H.J.
> ---
> To support indirect branch tracking (IBT) in Intel Control-flow Enforcement
> Technology (CET) instructions:
>
> https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf
>
> following changes to x86 psABI are required.
>
> To program properties, add
>
> #define GNU_PROPERTY_X86_FEATURE_1_AND 0xc0000002
>
> #define GNU_PROPERTY_X86_FEATURE_1_IBT (1U << 0)
>
> to indicate that all executable sections are compatible with IBT when
> an ENDBR instruction is inserted at:
>
>    a. All function entries whose addresses may be taken.
>    b. All branch targets whose addresses have been taken.
>
> GNU_PROPERTY_X86_FEATURE_1_IBT is set on output only if it is set on
> all relocatable inputs, which means that the C library must be compiled
> with an IBT-enabled compiler.
>
> The following changes are made to the Procedure Linkage Table (PLT) to
> enable IBT:
>
> 1. For 64-bit x86-64,  PLT is changed to:
>
> PLT0:  push       GOT[1]
>        bnd jmp    *GOT[2]
>        nop
> ...
> PLTn:  endbr64
>        push       namen_reloc_index
>        bnd jmp    PLT0
>
> together with the second PLT section:
>
> PLTn:  endbr64
>        bnd jmp   *GOT[namen_index]
>        nop
>
> BND prefix is also added so that IBT-enabled PLT is compatible with MPX.
>
> 2. For 32-bit x86-64 (x32) and i386,  PLT is changed to
>
> PLT0:  push       GOT[1]
>        jmp        *GOT[2]
>        nop
> ...
> PLTn:  endbr64                                 # endbr32 for i386.
>        push       namen_reloc_index
>        jmp        PLT0
>
> together with the second PLT section:
>
> PLTn:  endbr64                                 # endbr32 for i386.
>        jmp       *GOT[namen_index]
>        nop
>
> BND prefix isn't used since MPX isn't supported on x32 and BND registers
> aren't used in parameter passing on i386.
>

There are 2 reasons for this 2-PLT scheme:

1.  Provide compatibility with other tools that have a hardcoded limit of 16
bytes for an x86 PLT entry.
2.  Improve code cache locality: most of the instructions in .plt are
executed only the first time a symbol is resolved, so they would waste space
in the cache.  With a separate .plt.sec, only the instructions that are
executed often are cached.
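
To make the 16-byte limit concrete, the entry sizes can be added up from the instruction lengths. The lengths below are the standard x86-64 encodings as I understand them (endbr64 = f3 0f 1e fa, push imm32 = 68 id, bnd jmp rel32 = f2 e9 cd, bnd jmp *disp32(%rip) = f2 ff 25 id); the "combined" layout is a hypothetical single-PLT entry used only for comparison, not something from the proposal:

```c
enum {
    ENDBR64_LEN       = 4,   /* f3 0f 1e fa */
    PUSH_IMM32_LEN    = 5,   /* 68 imm32 */
    BND_JMP_REL32_LEN = 6,   /* f2 e9 rel32 */
    BND_JMP_GOT_LEN   = 7,   /* f2 ff 25 disp32 (RIP-relative) */
    PLT_ENTRY_SIZE    = 16   /* size many tools assume for an x86 PLT entry */
};

/* Entry in the first PLT section: endbr64; push index; bnd jmp PLT0. */
static int first_plt_entry_len(void)
{
    return ENDBR64_LEN + PUSH_IMM32_LEN + BND_JMP_REL32_LEN;
}

/* Entry in the second PLT section: endbr64; bnd jmp *GOT[namen_index]. */
static int second_plt_entry_len(void)
{
    return ENDBR64_LEN + BND_JMP_GOT_LEN;
}

/* Hypothetical single-PLT entry holding both halves.  The lazy half needs
   its own endbr64 because it is reached through an indirect jump. */
static int combined_plt_entry_len(void)
{
    return second_plt_entry_len() + first_plt_entry_len();
}
```

Under these assumptions the two split entries need 15 and 11 bytes, so each fits in 16 bytes with nop padding, while the combined entry needs 26 bytes and does not.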

-- 
H.J.


* Re: RFC: Update x86 psABIs to support IBT
       [not found]   ` <CAJENXgtX3Foh2gAHt4yxOg92rSSnf9-WjBNn3fn3M3tgbDKpEw@mail.gmail.com>
@ 2019-01-01  0:00     ` H.J. Lu
       [not found]       ` <CAJENXgvnZZHxb=2x4xwJPf8mtcxkaY+0TLLaawRoEA-gP=Topw@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: H.J. Lu @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Rui Ueyama
  Cc: IA32 System V Application Binary Interface, x86-64-abi, gnu-gabi

On Wed, Feb 20, 2019 at 4:30 PM Rui Ueyama <ruiu@google.com> wrote:
>
> Hi H.J.Lu,
>
> I'm replying because I was wondering why the 2-PLT scheme was chosen to support Intel CET.
>
> On Tue, Feb 19, 2019 at 8:36 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>
>> There are 2 reasons for this 2-PLT scheme:
>>
>> 1.  Provide compatibility with other tools that have a hardcoded limit of 16
>> bytes for an x86 PLT entry.
>
>
> I don't think that the 2-PLT scheme actually provides compatibility with existing tools. The new PLT uses different code instructions, and the usage of the .plt section has changed as well. IIUC, foo@PLT is now resolved to its entry in the second PLT instead of the first regular PLT.
>
> I know that some existing tools even crash if we change the PLT entry size, so keeping the PLT entry size would at least keep them from crashing. But I'd think compatibility means more than that.
>

We are doing the best we can.

>> 2.  Improve code cache locality: since most of the instructions in .plt would be
>> executed only the first time a symbol is resolved they would waste space in
>> the cache and, by having a .plt.sec, only instructions that are often executed
>> would be cached.
>
>
> To me, this is a much more convincing answer than keeping compatibility. The PLT section could be hot, and separating hot code from relatively cold code could have a performance impact. But do you know how large the impact is? I wonder if there's a measurable difference if you simply extend the PLT entry size to 32 bytes.
>

We don't have such data.

FWIW, we introduced the 2-PLT scheme for MPX.  This isn't a new thing in
the x86-64 psABI.


-- 
H.J.


* Re: RFC: Update x86 psABIs to support IBT
       [not found]       ` <CAJENXgvnZZHxb=2x4xwJPf8mtcxkaY+0TLLaawRoEA-gP=Topw@mail.gmail.com>
@ 2019-01-01  0:00         ` H.J. Lu
       [not found]           ` <CAJENXgtB0eT=tqY7pKyaJDihcE2c9t+CW5s0_sU2hFNC6CN8Xw@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: H.J. Lu @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Rui Ueyama
  Cc: IA32 System V Application Binary Interface, x86-64-abi, gnu-gabi

On Thu, Feb 21, 2019 at 11:18 AM Rui Ueyama <ruiu@google.com> wrote:
>
> On Wed, Feb 20, 2019 at 7:01 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Wed, Feb 20, 2019 at 4:30 PM Rui Ueyama <ruiu@google.com> wrote:
>> >
>> > Hi H.J.Lu,
>> >
>> > I'm replying because I was wondering why the 2-PLT scheme was chosen to support Intel CET.
>> >
>> > On Tue, Feb 19, 2019 at 8:36 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> >>
>> >>
>> >> There are 2 reasons for this 2-PLT scheme:
>> >>
>> >> 1.  Provide compatibility with other tools that have a hardcoded limit of 16
>> >> bytes for an x86 PLT entry.
>> >
>> >
>> > I don't think that the 2-PLT scheme actually provides compatibility with existing tools. The new PLT uses different code instructions, and the usage of the .plt section has changed as well. IIUC, foo@PLT is now resolved to its entry in the second PLT instead of the first regular PLT.
>> >
>> > I know that some existing tools even crash if we change the PLT entry size, so keeping the PLT entry size would at least keep them from crashing. But I'd think compatibility means more than that.
>> >
>>
>> We are doing the best we can.
>>
>> >> 2.  Improve code cache locality: since most of the instructions in .plt would be
>> >> executed only the first time a symbol is resolved they would waste space in
>> >> the cache and, by having a .plt.sec, only instructions that are often executed
>> >> would be cached.
>> >
>> >
>> > To me, this is a much more convincing answer than keeping compatibility. The PLT section could be hot, and separating hot code from relatively cold code could have a performance impact. But do you know how large the impact is? I wonder if there's a measurable difference if you simply extend the PLT entry size to 32 bytes.
>> >
>>
>> We don't have such data.
>
>
> Then it could be a premature optimization. The single PLT scheme would be undeniably much simpler, so unless it is shown to not work, we probably shouldn't have split a PLT into two, no?
>

Simpler to implement, yes.   We designed it with performance in mind.
We implemented it many years ago, starting with MPX.  It shouldn't
be changed just because it is "hard" to implement.

-- 
H.J.


* Re: RFC: Update x86 psABIs to support IBT
       [not found]           ` <CAJENXgtB0eT=tqY7pKyaJDihcE2c9t+CW5s0_sU2hFNC6CN8Xw@mail.gmail.com>
@ 2019-01-01  0:00             ` H.J. Lu
  0 siblings, 0 replies; 5+ messages in thread
From: H.J. Lu @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Rui Ueyama
  Cc: IA32 System V Application Binary Interface, x86-64-abi, gnu-gabi

On Thu, Feb 21, 2019 at 2:30 PM Rui Ueyama <ruiu@google.com> wrote:
>
> On Thu, Feb 21, 2019 at 11:22 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>> On Thu, Feb 21, 2019 at 11:18 AM Rui Ueyama <ruiu@google.com> wrote:
>> >
>> > On Wed, Feb 20, 2019 at 7:01 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> >>
>> >> On Wed, Feb 20, 2019 at 4:30 PM Rui Ueyama <ruiu@google.com> wrote:
>> >> >
>> >> > Hi H.J.Lu,
>> >> >
>> >> > I'm replying because I was wondering why the 2-PLT scheme was chosen to support Intel CET.
>> >> >
>> >> > On Tue, Feb 19, 2019 at 8:36 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> >>
>> >> >>
>> >> >> There are 2 reasons for this 2-PLT scheme:
>> >> >>
>> >> >> 1.  Provide compatibility with other tools that have a hardcoded limit of 16
>> >> >> bytes for an x86 PLT entry.
>> >> >
>> >> >
>> >> > I don't think that the 2-PLT scheme actually provides compatibility with existing tools. The new PLT uses different code instructions, and the usage of the .plt section has changed as well. IIUC, foo@PLT is now resolved to its entry in the second PLT instead of the first regular PLT.
>> >> >
>> >> > I know that some existing tools even crash if we change the PLT entry size, so keeping the PLT entry size would at least keep them from crashing. But I'd think compatibility means more than that.
>> >> >
>> >>
>> >> We are doing the best we can.
>> >>
>> >> >> 2.  Improve code cache locality: since most of the instructions in .plt would be
>> >> >> executed only the first time a symbol is resolved they would waste space in
>> >> >> the cache and, by having a .plt.sec, only instructions that are often executed
>> >> >> would be cached.
>> >> >
>> >> >
>> >> > To me, this is a much more convincing answer than keeping compatibility. The PLT section could be hot, and separating hot code from relatively cold code could have a performance impact. But do you know how large the impact is? I wonder if there's a measurable difference if you simply extend the PLT entry size to 32 bytes.
>> >> >
>> >>
>> >> We don't have such data.
>> >
>> >
>> > Then it could be a premature optimization. The single PLT scheme would be undeniably much simpler, so unless it is shown to not work, we probably shouldn't have split a PLT into two, no?
>> >
>>
>> Simpler to implement, yes.   We designed it with performance in mind.
>> We implemented it many years ago, starting with MPX.  It shouldn't
>> be changed just because it is "hard" to implement.
>
>
> I can see that the 2-PLT scheme performs better in theory. That being said, I don't think I'm convinced that the design is better in practice if the expected advantage was not measured.

We went with better in theory in our design.  We may not see performance
differences in practice in most cases.  In some cases, the PLT sections can
be quite large:

libLLVM-7.0.1.so:
  [11] .plt              PROGBITS        0000000000658020 658020 043bd0 10  AX  0   0 16
  [12] .plt.sec          PROGBITS        000000000069bbf0 69bbf0 043bc0 10  AX  0   0 16
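
For a sense of scale, the number of entries follows from dividing each section size by the entry size (the hex "10" column in the listing above).  A small sketch, with plt_entry_count() as a hypothetical helper name:

```c
#include <stdint.h>

/* Number of fixed-size entries in a section; both values are taken from
   the section header listing (size and entry size, in bytes). */
static uint64_t plt_entry_count(uint64_t section_size, uint64_t entry_size)
{
    return section_size / entry_size;
}
```

With the sizes above, plt_entry_count(0x43bd0, 0x10) gives 17341 for .plt (PLT0 plus 17340 lazy entries) and plt_entry_count(0x43bc0, 0x10) gives 17340 for .plt.sec, so roughly 271 KiB of rarely executed code is kept apart from the entries that run on every call.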

> I don't think I'm requesting a change to the spec, at least at the moment. What I'm trying to do is to understand the rationale behind a choice in the spec before implementing it in our linker, lld. Even if there's no evidence that the 2-PLT scheme performs better than the 1-PLT scheme, we might still want to implement it as the spec says, considering the cost of breaking ABI compatibility. But if we take that route, we'd like to document that fact as-is.

Sure.  We'd like to get as much feedback and input as we can when we propose
ABI changes.   We encourage you to participate in future discussions.

-- 
H.J.


end of thread, other threads:[~2019-02-21 23:09 UTC | newest]
