Re: [discuss] [x86-64 psABI] RFC: Extend x86-64 psABI to support x32


From: "H.J. Lu" <hjl.tools@gmail.com>
To: Michael Matz <matz@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	discuss@x86-64.org, 	GNU C Library <libc-alpha@sourceware.org>,
	GCC Development <gcc@gcc.gnu.org>, GDB <gdb@sourceware.org>,
		x32-abi@googlegroups.com, Binutils <binutils@sourceware.org>
Subject: Re: [discuss] [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
Date: Thu, 17 May 2012 19:50:00 -0000
Message-ID: <CAMe9rOptzCATWKgnz=cKGsTnyeFWSGdeLJDHFNk5qtBL9tx29A@mail.gmail.com>
In-Reply-To: <Pine.LNX.4.64.1205151806180.25409@wotan.suse.de>

[-- Attachment #1: Type: text/plain, Size: 599 bytes --]

On Tue, May 15, 2012 at 9:07 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 14 May 2012, H.J. Lu wrote:
>
>> > As a minor nitpick, I have always used x32 with a lower case x.  The
>> > capital X32 looks odd to me.
>> >
>>
>> I used X32 together with LP64.  I can use ILP32 instead of X32 when LP64
>> is mentioned at the same time.
>
> I'd prefer that.  x32 is a nice short-hand name for the whole thing, but
> not descriptive, unlike LP64.  So, yes, IMO it should be ILP32 in the ABI
> document.
>

Here is the updated change.  Any comments?

Thanks.



-- 
H.J.

[-- Attachment #2: psabi-x32-2.patch --]
[-- Type: application/octet-stream, Size: 27350 bytes --]

2012-05-17  H.J. Lu  <hongjiu.lu@intel.com>

	* abi.tex (title): Mention LP64/ILP32. 
	(author): Add H.J. Lu and Milind Girkar.
	Include x32.tex.

	* development.tex: Add _ILP32 and __ILP32__ for ILP32.  Also
	document _LP64 and __LP64__.

	* dl.tex: List ILP32 program interpreter.

	* introduction.tex (Introduction): Add a label.
	Describe ILP32 and LP64.

	* low-level-sys-info.tex (Scalar Types table): Add ILP32/LP64 to
	long and long long.  Modify long and pointer types for ILP32 and
	LP64.  Use \myfontsize instead of \small.
	(Architectural Constraints): Add a label.  Mention small model
	for ILP32.

	* macros.tex (myfontsize): New.

	* object-files.tex (Programming Model): New subsubsection.
	(File Class): Likewise.
	(Data Encoding): Likewise.
	(Processor identification): Likewise.
	(Relocation Types): Add wordclass.  Allow Elf32_Rel relocations
	within ILP32 executable files or shared objects.
	(Relocation Types): Use small font.  Mark R_X86_64_GLOB_DAT,
	R_X86_64_JUMP_SLOT, R_X86_64_RELATIVE and R_X86_64_IRELATIVE
	with wordclass.  Mark R_X86_64_PC64, R_X86_64_GOTOFF64 and
	R_X86_64_SIZE64 used only for LP64.  Add R_X86_64_RELATIVE64 for
	ILP32.

	* x32.tex: New file.

diff --git a/abi.tex b/abi.tex
index 2b56d94..a301b5d 100644
--- a/abi.tex
+++ b/abi.tex
@@ -5,13 +5,18 @@
 \begin{document}
 
 \author{Edited by\\
+  Milind Girkar\thanks{milind.girkar@intel.com},
+  Jan Hubi\v{c}ka\thanks{jh@suse.cz},\\
+  Andreas Jaeger\thanks{aj@suse.de},
+  H.J. Lu\thanks{hongjiu.lu@intel.com},
   Michael Matz\thanks{matz@suse.de},
-  Jan Hubi\v{c}ka\thanks{jh@suse.cz}, Andreas Jaeger\thanks{aj@suse.de},
-  Mark Mitchell\thanks{mark@codesourcery.com}}
+  Mark Mitchell\thanks{mark@codesourcery.com}
+  }
 
 \title{System V Application Binary Interface\\
-{\Large AMD64 Architecture Processor Supplement\\
-Draft Version \version}}
+{\Large AMD64 Architecture Processor Supplement}\\
+{\large (With LP64 and ILP32 Programming Models)}\\
+{\Large Draft Version \version}}
 \maketitle
 \tableofcontents
 \listoftables
@@ -99,6 +104,7 @@ Draft Version \version}}
   place or removed completely.}
 \include{conventions}
 \include{fortran}
+\include{x32}
 
 \appendix
 \include{kernel}
diff --git a/development.tex b/development.tex
index d1388b5..e9a2e47 100644
--- a/development.tex
+++ b/development.tex
@@ -2,18 +2,24 @@
 \chapter{Development Environment}
 
 During compilation of C or C++ code at least the symbols in
-table \ref{prepro_defines} are defined by the pre-processor.
+table \ref{prepro_defines} are defined by the pre-processor
+\footnote{\code{_LP64} and \code{__LP64__} were added to GCC 3.3 in
+March, 2003.}.
 
 \begin{table}[H]
 \Hrule
 \caption{Predefined Pre-Processor Symbols}
 \label{prepro_defines}
-  \begin{center}\code{
-    \begin{tabular}[t]{l}
-      __amd64\\
-      __amd64__\\
-      __x86_64\\
-      __x86_64__\\
+  \begin{center}\small\code{
+    \begin{tabular}[t]{ll}
+      __amd64      & Defined for both LP64 and ILP32 programming models.\\
+      __amd64__    & Defined for both LP64 and ILP32 programming models.\\
+      __x86_64     & Defined for both LP64 and ILP32 programming models.\\
+      __x86_64__   & Defined for both LP64 and ILP32 programming models.\\
+      _LP64        & Defined for LP64 programming model.\\
+      __LP64__     & Defined for LP64 programming model.\\
+      _ILP32       & Defined for ILP32 programming model.\\
+      __ILP32__    & Defined for ILP32 programming model.\\
     \end{tabular}
   }\end{center}
 \Hrule
diff --git a/dl.tex b/dl.tex
index a67f4f8..68c955f 100644
--- a/dl.tex
+++ b/dl.tex
@@ -355,17 +355,24 @@ use.
 
 \subsection{Program Interpreter}
 
-There is one valid \textindex{program interpreter} for
-programs conforming to the \xARCH ABI:
-
-\bigskip
-\path{/lib/ld64.so.1}
-
-However, Linux puts this in
-
-\bigskip
-\path{/lib64/ld-linux-x86-64.so.2}
+The valid \textindex{program interpreter} for programs conforming to the
+\xARCH ABI is listed in Figure \ref{interp}, which also contains the
+\textindex{program interpreter} used by Linux.
 
+\begin{figure}
+  \caption{\xARCH Program Interpreter}
+  \label{interp}
+  \begin{center}
+    \begin{tabular}[t]{l|l|l}
+      \multicolumn{1}{c}{Data Model} & \multicolumn{1}{c}{Path} &
+      \multicolumn{1}{c}{Linux Path} \\
+      \hline
+      LP64 & \path{/lib/ld64.so.1} & \path{/lib64/ld-linux-x86-64.so.2} \\
+      \hline
+      ILP32 & \path{/lib/ldx32.so.1} & \path{/libx32/ld-linux-x32.so.2} \\
+    \end{tabular}
+  \end{center}
+\end{figure}
 
 \subsection{Initialization and Termination Functions}
 
diff --git a/introduction.tex b/introduction.tex
index 2148ab9..8a547da 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -1,4 +1,4 @@
-\chapter{Introduction}
+\chapter{Introduction\label{intro}}
 
 The AMD64\footnote{AMD64 has been previously called x86-64.  The
   latter name is used in a number of places out of historical reasons
@@ -15,6 +15,13 @@ compatibility modes.  The \xARCH ABI does not apply to such programs;
 this document applies only programs running in the ``long'' mode
 provided by the \xARCH architecture.
 
+Binaries using the \xARCH instruction set may program to either a 32-bit
+model, in which the C data types \code{int}, \code{long} and all
+pointer types are 32-bit objects (ILP32); or to a 64-bit model,
+in which the C \code{int} type is 32 bits but the C \code{long} type
+and all pointer types are 64-bit objects (LP64). This specification
+covers both LP64 and ILP32 programming models.
+
 Except where otherwise noted, the \xARCH architecture ABI follows the
 conventions described in the \intelabi.  Rather than replicate the
 entire contents of the \intelabi, the \xARCH ABI indicates only those
diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index b030e42..c125a5f 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -32,7 +32,7 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
   \caption{Scalar Types}\label{basic-types}
 { % Use small here - the table is still too large
   % Has anybody an idea how to shrink the table so that it fits the page?
-  \small
+  \myfontsize
   \begin{tabular}{l|l|c|c|l}
     \hline\noalign{\smallskip}
      & &  & \multicolumn{1}{c|}{Alignment} & \multicolumn{1}{c|}{\xARCH} \\
@@ -58,12 +58,19 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned int} & 4 & 4 & unsigned \fourbyte \\
     \cline{2-5}
-    & \texttt{long} & 8 & 8 & signed \eightbyte \\
-    & \texttt{signed long} & & \\
-    & \texttt{long long} & & \\
+    & \texttt{long (LP64)} & 8 & 8 & signed \eightbyte \\
+    & \texttt{signed long (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    \cline{2-5}
+    & \texttt{long (ILP32)} & 4 & 4 & signed \fourbyte \\
+    & \texttt{signed long (ILP32)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (ILP32)} & 4 & 4 & unsigned \fourbyte \\
+    \cline{2-5}
+    & \texttt{long long} & 8 & 8 & signed \eightbyte \\
     & \texttt{signed long long} & & \\
     \cline{2-5}
-    & \texttt{unsigned long} & 8 & 8 & unsigned \eightbyte \\
     & \texttt{unsigned long long} & 8 & 8 & unsigned \eightbyte \\
     \cline{2-5}
     & \texttt{__int128}$^{\dagger\dagger}$ & 16 & 16 & signed \sixteenbyte \\
@@ -71,8 +78,12 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned __int128}$^{\dagger\dagger}$ & 16 & 16 & unsigned \sixteenbyte \\
     \hline
-    Pointer & \texttt{\textit{any-type} *} & 8 & 8 & unsigned \eightbyte \\
-    & \texttt{\textit{any-type} (*)()} & & \\
+    Pointer
+    & \texttt{\textit{any-type} * (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    & \texttt{\textit{any-type} (*)() (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{\textit{any-type} * (ILP32)} & 4 & 4 & unsigned \fourbyte \\
+    & \texttt{\textit{any-type} (*)() (ILP32)} & & \\
     \hline
     Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\
     \cline{2-5}
@@ -188,9 +199,16 @@ integral values of a specified size.
       \texttt{int} & 1 to 32 & 0 to $2^{w}-1$ \\
       \texttt{unsigned int} & & 0 to $2^{w}-1$ \\
       \hline
-      \texttt{signed long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
-      \texttt{long} & 1 to 64 & 0 to $2^{w}-1$ \\
-      \texttt{unsigned long} & & 0 to $2^{w}-1$ \\
+      \texttt{signed long (LP64)} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long (LP64)} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (LP64)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{long (ILP32)} & 1 to 32 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (ILP32)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{signed long long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long long} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long long} & & 0 to $2^{w}-1$ \\
     \end{tabular}
   \end{center}
 \Hrule
@@ -1102,7 +1120,7 @@ operations such as calling functions, accessing static objects, and
 transferring control from one part of a program to another.  Unlike
 previous material, this material is not normative.
 
-\subsection{Architectural Constraints}
+\subsection{Architectural Constraints\label{models}}
 
 The \xARCH architecture usually does not allow an instruction to encode
 arbitrary
@@ -1233,6 +1251,9 @@ that are of general interest:
 
 \end{description}
 
+Only the small code model and the small position independent code model
+(\textindex{PIC}) are used in ILP32 binaries.
+
 \subsection{Conventions}
 
 In this document some special assembler symbols are used in the coding
diff --git a/macros.tex b/macros.tex
index 0d20eac..9c4f915 100644
--- a/macros.tex
+++ b/macros.tex
@@ -107,6 +107,8 @@
 
 \newcommand*{\cbnew}{\marginpar{\textsf{New}}}
 
+\newcommand{\myfontsize}{\fontsize{9}{11}\selectfont}
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "abi"
diff --git a/object-files.tex b/object-files.tex
index 4705e96..eb1d544 100644
--- a/object-files.tex
+++ b/object-files.tex
@@ -5,22 +5,28 @@
 
 \subsection{Machine Information}
 
-For file identification in \texttt{e_ident}, the \xARCH architecture
-requires the following values.
+\subsubsection{Programming Model}
 
-\begin{table}[H]
-\Hrule
-  \caption{\xARCH Identification}
-  \begin{center}
-    \begin{tabular}[t]{l|l}
-      \multicolumn{1}{c}{Position} & \multicolumn{1}{c}{Value} \\
-      \hline
-      \texttt{e_ident[EI_CLASS]} & \texttt{ELFCLASS64} \\
-      \texttt{e_ident[EI_DATA]} & \texttt{ELFDATA2LSB}
-    \end{tabular}
-  \end{center}
-\Hrule
-\end{table}
+As described in Section \ref{intro}, binaries using the \xARCH instruction
+set may program to either a 32-bit model, in which the C data
+types \code{int}, \code{long} and all pointer types are 32-bit objects
+(ILP32); or to a 64-bit model, in which the C \code{int} type is 32 bits
+but the C \code{long} type and all pointer types are 64-bit objects (LP64).
+This specification describes binaries that use either the ILP32 or the
+LP64 model.
+
+\subsubsection{File Class}
+
+For \xARCH ILP32 objects, the file class value in
+\texttt{e_ident[EI_CLASS]} must be \texttt{ELFCLASS32}.  For \xARCH LP64
+objects, the file class value must be \texttt{ELFCLASS64}.
+
+\subsubsection{Data Encoding}
+
+For the data encoding in \texttt{e_ident[EI_DATA]}, \xARCH objects use
+\texttt{ELFDATA2LSB}.
+
+\subsubsection{Processor identification}
 
 Processor identification resides in the ELF headers
 \texttt{e_machine} member and must have the value
@@ -397,6 +403,8 @@ Figure \ref{reloc_fields} shows the allowed relocatable fields.
                   with arbitrary byte alignment.  These values use
                   the same byte order as other word values in the
                   \xARCH architecture. \\
+\textit{wordclass} & This specifies \textit{word64} for LP64 and
+		     specifies \textit{word32} for ILP32. \\ 
 \end{tabular*}
 
 The following notations are used for specifying relocations in table
@@ -421,13 +429,19 @@ The following notations are used for specifying relocations in table
   relocation entry.
 \end{description}
 
-The \xARCH ABI architectures uses only \texttt{Elf64_Rela} relocation
+The \xARCH LP64 ABI architecture uses only \texttt{Elf64_Rela} relocation
 entries with explicit addends.  The \code{r_addend} member serves as
 the relocation addend.
 
+The \xARCH ILP32 ABI architecture uses only \texttt{Elf32_Rela} relocation
+entries in relocatable files.  Relocations contained within executable
+files or shared objects may use either \texttt{Elf32_Rela} or
+\texttt{Elf32_Rel} relocation entries.
+
 \begin{table}[H]
 \Hrule
   \caption{Relocation Types}
+  \small
   \label{tab-relocations}
   \begin{center}
     \begin{tabular}[t]{l|r|l|l}
@@ -442,9 +456,9 @@ the relocation addend.
       \texttt{R_X86_64_GOT32} & 3 & \textit{word32} & \texttt{G + A} \\
       \texttt{R_X86_64_PLT32} & 4 & \textit{word32} & \texttt{L + A - P} \\
       \texttt{R_X86_64_COPY}  & 5 & none            & none \\
-      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_RELATIVE} & 8 & \textit{word64} & \texttt{B + A} \\
+      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_RELATIVE} & 8 & \textit{wordclass} & \texttt{B + A} \\
       \texttt{R_X86_64_GOTPCREL} & 9 & \textit{word32} & \texttt{G + GOT + A - P} \\
       \texttt{R_X86_64_32}    & 10 & \textit{word32} & \texttt{S + A} \\
       \texttt{R_X86_64_32S}   & 11 & \textit{word32} & \texttt{S + A} \\
@@ -460,17 +474,22 @@ the relocation addend.
       \texttt{R_X86_64_DTPOFF32}   & 21 & \textit{word32} &  \\
       \texttt{R_X86_64_GOTTPOFF}   & 22 & \textit{word32} &  \\
       \texttt{R_X86_64_TPOFF32}   & 23 & \textit{word32} &  \\
-      \texttt{R_X86_64_PC64}  & 24 & \textit{word64} & \texttt{S + A - P} \\
-      \texttt{R_X86_64_GOTOFF64} & 25 & \textit{word64} & \texttt{S + A - GOT} \\
+      \texttt{R_X86_64_PC64} $^\dagger$ & 24 & \textit{word64} & \texttt{S + A - P} \\
+      \texttt{R_X86_64_GOTOFF64} $^\dagger$ & 25 & \textit{word64} & \texttt{S + A - GOT} \\
       \texttt{R_X86_64_GOTPC32} & 26 & \textit{word32} & \texttt{GOT + A - P} \\
       \texttt{R_X86_64_SIZE32} & 32 & \textit{word32} & \texttt{Z + A} \\
-      \texttt{R_X86_64_SIZE64} & 33 & \textit{word64} & \texttt{Z + A} \\
+      \texttt{R_X86_64_SIZE64} $^\dagger$ & 33 & \textit{word64} & \texttt{Z + A} \\
       \texttt{R_X86_64_GOTPC32_TLSDESC} & 34 & \textit{word32} &  \\
       \texttt{R_X86_64_TLSDESC_CALL} & 35 & none &  \\
       \texttt{R_X86_64_TLSDESC} & 36 & \textit{word64}$\times 2$ & \\
-      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{word64} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{wordclass} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_RELATIVE64} $^{\dagger\dagger}$ & 38 & \textit{word64} & \texttt{B + A} \\
 %      \texttt{R_X86_64_GOT64} & 16 & \textit{word64} & \texttt{G + A} \\
 %      \texttt{R_X86_64_PLT64} & 17 & \textit{word64} & \texttt{L + A - P} \\
+     \cline{1-4}
+    \multicolumn{3}{l}{\small $^\dagger$ This relocation is used only for LP64.}\\
+    \multicolumn{3}{l}{\small $^{\dagger\dagger}$ This relocation only
+    appears in ILP32 executable files or shared objects.}\\
     \end{tabular}
   \end{center}
 \Hrule
diff --git a/x32.tex b/x32.tex
new file mode 100644
index 0000000..b7a4055
--- /dev/null
+++ b/x32.tex
@@ -0,0 +1,444 @@
+\chapter{ILP32 Programming Model\label{x32}}
+
+The name ``x32'' is commonly used to refer to the \xARCH ILP32 programming model.
+
+\section{Parameter Passing}
+When a value of pointer type is returned or passed in a register, bits 32
+to 63 shall be zero.
+
+\section{Address Space}
+
+ILP32 binaries reside in the lower 32 bits of the 64-bit virtual
+address space and all addresses are 32 bits in size.  They should conform
+to \textindex{small code model} or
+\textindex{small position independent code model} (\textindex{PIC})
+described in Section \ref{models}.
+
+\section{Thread-Local Storage Support}
+
+ILP32 Thread-Local Storage (TLS) support is based on the LP64 TLS
+implementation with some modifications.
+
+\subsection{Global Thread-Local Variable}
+
+For a global thread-local variable x:
+
+\begin{verbatim}
+extern __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{General Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{General Dynamic Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & .byte & 0x66			& 0x00 & leaq  & x@tlsgd(\%rip),\%rdi \\
+0x01 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x07 & .word & 0x6666 \\
+0x08 & .word & 0x6666			& 0x09 & rex64 & \\
+0x0a & rex64 &				& 0x0a & call  & \_\_tls\_get\_addr@plt \\
+0x0b & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & addq & x@gottpoff(\%rip),\%rax	& 0x08 & addl & x@gottpoff(\%rip),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model, II}]
+  Load value of \code{x} into \reg{edi}.  The \code{\%fs:(\%eax)} memory
+  operand cannot be used for ILP32, since its effective address is the base
+  address of \code{\%fs} plus the value of \reg{eax} zero-extended to 64
+  bits, which is incorrect when \reg{eax} holds a negative value.
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & x@gottpoff(\%rip),\%rax	& 0x01 & movq & x@gottpoff(\%rip),\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{Static Thread-Local Variable}
+
+For a static thread-local variable x:
+
+\begin{verbatim}
+static __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{Local Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & leaq  & x@dtpoff(\%rax),\%rax	& 0x0c & leal & x@dtpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & addq  & \$x@dtpoff,\%rax		& 0x0c & addl & \$x@dtpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@dtpoff(\%rax),\%edi	& 0x08 & movl & x@dtpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & leaq & x@tpoff(\%rax),\%rax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & addq & \$x@tpoff,\%rax		& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@tpoff(\%rax),\%edi	& 0x08 & movl & x@tpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, III}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, III}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movl & \%fs:x@tpoff,\%edi	& 0x00 & movl & \%fs:x@tpoff,\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{TLS Linker Optimization}
+
+\begin{description}
+\item[\textindex{General Dynamic To Initial Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> IE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{IE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & addq  & x@gottpoff(\%rip),\%rax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> LE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & leal  & x@tpoff(\%rax),\%eax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movl & \%fs:0,\%eax		& 0x01 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movl & \%fs:0,\%eax		& 0x01 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movq & x@gottpoff(\%rip),\%rax	& 0x01 & movq & \$x@tpoff,\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic To Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & leal  & x@dtpoff(\%rax),\%eax	& 0x0c & leal & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & addq  & \$x@dtpoff,\%rax		& 0x0c & addl & \$x@tpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & movl  & x@dtpoff(\%rax),\%eax	& 0x0c & movl & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\section{Kernel Support}
+The kernel should limit the stack and the addresses returned from system
+calls to the range from $0x00000000$ to $0xffffffff$.
+
+\section{Coding Examples}
+
+Although ILP32 binaries run in the 64-bit mode, not all 64-bit instructions
+are supported. This section discusses example code sequences for
+fundamental operations which are different from the 64-bit mode.
+
+\subsection{Indirect Branch}
+
+Since an indirect branch via memory loads a 64-bit address from the memory
+location, it is not supported in ILP32.  An indirect branch via register
+should be used instead.  The 32-bit address is loaded from memory into
+the lower 32 bits of a register, which automatically zeroes the upper
+32 bits of the register.  The indirect branch can then be performed
+via the 64-bit register.
+
+\begin{table}[H]
+\Hrule
+\caption{Indirect Branch}
+\begin{center}
+\code{
+\begin{tabular}{ll|ll}
+\multicolumn{2}{c}{LP64} & \multicolumn{2}{c}{ILP32} \\
+\hline
+call & *\%rax          & call & *\%rax \\
+\hline
+call & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & call & *\%rax \\
+\hline
+call & *func\_p        & movl & func\_p, \%eax \\
+     &                 & call & *\%rax \\
+\hline
+jmp  & *\%rax          & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p        & movl & func\_p, \%eax \\
+     &                 & jmp  & *\%rax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
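
Illustration only, not part of the patch: the File Class, Data Encoding
and Processor identification rules in the object-files.tex hunk can be
sketched as a small Python check that tells an ILP32 object from an LP64
one.  The constants are the standard ELF values (EI_CLASS = 4, EI_DATA = 5,
ELFCLASS32 = 1, ELFCLASS64 = 2, ELFDATA2LSB = 1, EM_X86_64 = 62).  The
function names x86_64_data_model and fake_header are made up for this
example, and fake_header builds a synthetic header rather than reading a
real file.

```python
import struct

# Standard ELF constants.
EI_CLASS = 4
EI_DATA = 5
ELFCLASS32 = 1
ELFCLASS64 = 2
ELFDATA2LSB = 1
EM_X86_64 = 62


def x86_64_data_model(header: bytes) -> str:
    """Return "ILP32" or "LP64" for an x86-64 ELF header, else raise ValueError."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    if header[EI_DATA] != ELFDATA2LSB:
        raise ValueError("x86-64 objects use ELFDATA2LSB")
    # e_type and e_machine are 16-bit little-endian fields at offsets 16 and
    # 18 in both the 32-bit and the 64-bit ELF header layouts.
    _e_type, e_machine = struct.unpack_from("<HH", header, 16)
    if e_machine != EM_X86_64:
        raise ValueError("not an x86-64 object")
    if header[EI_CLASS] == ELFCLASS32:
        return "ILP32"
    if header[EI_CLASS] == ELFCLASS64:
        return "LP64"
    raise ValueError("invalid file class")


def fake_header(elf_class: int, machine: int = EM_X86_64) -> bytes:
    """Build a synthetic header (hypothetical helper): e_ident, e_type, e_machine."""
    e_ident = b"\x7fELF" + bytes([elf_class, ELFDATA2LSB, 1]) + bytes(9)
    return e_ident + struct.pack("<HH", 2, machine)


if __name__ == "__main__":
    print(x86_64_data_model(fake_header(ELFCLASS32)))
    print(x86_64_data_model(fake_header(ELFCLASS64)))
```

To inspect a real file, the first 20 bytes read in binary mode can be
passed to x86_64_data_model in place of the synthetic header.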

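Illustration only, not part of the patch: the remark under "Initial Exec
Model, II" in x32.tex, that %fs:(%eax) gives a wrong address when %eax
holds a negative offset, can be checked with integer arithmetic.  The
sketch below assumes a hypothetical %fs base of 0xF7000000 and a
hypothetical thread-pointer offset of -8; the function names are made up
for the example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def address_via_eax(fs_base: int, tp_offset: int) -> int:
    """Address formed by %fs:(%eax): the 32-bit offset is zero-extended."""
    return (fs_base + (tp_offset & MASK32)) & MASK64


def address_via_rax(fs_base: int, tp_offset: int) -> int:
    """Address formed by %fs:(%rax): the offset is a full 64-bit value."""
    return (fs_base + (tp_offset & MASK64)) & MASK64


if __name__ == "__main__":
    fs_base = 0xF7000000   # hypothetical thread pointer below 4 GiB
    tp_offset = -8         # hypothetical negative offset of x
    print(hex(address_via_rax(fs_base, tp_offset)))  # intended address
    print(hex(address_via_eax(fs_base, tp_offset)))  # off by 2**32
```

With these values the 64-bit form yields fs_base - 8, while the 32-bit
form lands 2**32 bytes higher, outside the 32-bit address space.  This is
consistent with the ILP32 sequence keeping the 64-bit movq from the GOT
slot and addressing through %rax.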