public inbox for gdb@sourceware.org
* [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
@ 2012-05-14 17:31 H.J. Lu
  2012-05-14 17:34 ` H. Peter Anvin
       [not found] ` <ccd4a6ab-f279-477f-b48b-94b8f4afd37d@googlegroups.com>
  0 siblings, 2 replies; 13+ messages in thread
From: H.J. Lu @ 2012-05-14 17:31 UTC (permalink / raw)
  To: discuss; +Cc: GCC Development, Binutils, GNU C Library, GDB, x32-abi

[-- Attachment #1: Type: text/plain, Size: 414 bytes --]

Hi,

Support for the x32 psABI:

http://sites.google.com/site/x32abi/

is added in Linux kernel 3.4-rc1.  X32 uses the ILP32 model for x86-64
instruction set with size of long and pointers == 4 bytes.  X32 is
already supported in GCC 4.7.0 and binutils 2.22.  I am now working
to integrate x32 support into GLIBC 2.16 and GDB 7.5.  Here is a
patch to extend x86-64 psABI for x32.  Any comments?
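
As a rough illustration of the data model described above, here is a
minimal C sketch that classifies a build as ILP32 or LP64 from the sizes
of long and pointers, and cross-checks the result against the predefined
macros the patch documents.  The enum and function names are made up for
this example and are not part of the psABI or of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Data models named in the patch; MODEL_OTHER covers everything else. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the sizes of long and of a data pointer. */
enum data_model classify_data_model(size_t long_size, size_t pointer_size)
{
    if (long_size == 4 && pointer_size == 4)
        return MODEL_ILP32;
    if (long_size == 8 && pointer_size == 8)
        return MODEL_LP64;
    return MODEL_OTHER;
}

/* Data model of the current translation unit, derived from sizeof. */
enum data_model compiled_data_model(void)
{
    return classify_data_model(sizeof(long), sizeof(void *));
}

/* Cross-check the predefined macros against sizeof.  Does nothing on
   compilers that define neither __ILP32__ nor __LP64__. */
void check_predefined_macros(void)
{
#if defined(__ILP32__)
    assert(compiled_data_model() == MODEL_ILP32);
#elif defined(__LP64__)
    assert(compiled_data_model() == MODEL_LP64);
#endif
}

/* Nonzero when compiled for x32: x86-64 instruction set with ILP32. */
int compiled_for_x32(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
    return 1;
#else
    return 0;
#endif
}
```

Testing __x86_64__ alone does not distinguish the two models, since the
patch defines it for both; the sketch therefore combines it with
__ILP32__.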

Thanks.

-- 
H.J.

[-- Attachment #2: psabi-x32.patch --]
[-- Type: application/octet-stream, Size: 27058 bytes --]

2012-05-14  H.J. Lu  <hongjiu.lu@intel.com>

	* abi.tex (title): Mention LP64/X32. 
	(author): Add H.J. Lu and Milind Girkar.
	Include x32.tex.

	* development.tex: Add _ILP32 and __ILP32__ for x32.  Also
	document _LP64 and __LP64__.

	* dl.tex: List X32 program interpreter.

	* introduction.tex (Introduction): Add a label.
	Describe X32 and LP64.

	* low-level-sys-info.tex (Scalar Types table): Add X32/LP64 to
	long and long long.  Modify long and pointer types for X32 and
	LP64.  Use \myfontsize instead of \small.
	(Architectural Constraints): Add a label.  Mention small model
	for X32.

	* macros.tex (myfontsize): New.

	* object-files.tex (Programming Model): New subsubsection
	(File Class): Likewise.
	(Data Encoding): Likewise.
	(Processor identification): Likewise.
	(Relocation Types): Add wordclass.  Allow Elf32_Rel relocations
	within x32 executable files or shared objects.
	(Relocation Types): Use small font.  Mark R_X86_64_GLOB_DAT,
	R_X86_64_JUMP_SLOT, R_X86_64_RELATIVE and R_X86_64_IRELATIVE
	with wordclass.  Mark R_X86_64_PC64, R_X86_64_GOTOFF64 and
	R_X86_64_SIZE64 used only for LP64.  Add R_X86_64_RELATIVE64 for
	x32.

	* x32.tex: New file.

diff --git a/abi.tex b/abi.tex
index 2b56d94..4de644e 100644
--- a/abi.tex
+++ b/abi.tex
@@ -5,13 +5,16 @@
 \begin{document}
 
 \author{Edited by\\
+  H.J. Lu\thanks{hongjiu.lu@intel.com},
+  Milind Girkar\thanks{milind.girkar@intel.com},\\
   Michael Matz\thanks{matz@suse.de},
   Jan Hubi\v{c}ka\thanks{jh@suse.cz}, Andreas Jaeger\thanks{aj@suse.de},
   Mark Mitchell\thanks{mark@codesourcery.com}}
 
 \title{System V Application Binary Interface\\
-{\Large AMD64 Architecture Processor Supplement\\
-Draft Version \version}}
+{\Large AMD64 Architecture Processor Supplement}\\
+{\large (With LP64 and X32 Programming Models)}\\
+{\Large Draft Version \version}}
 \maketitle
 \tableofcontents
 \listoftables
@@ -99,6 +102,7 @@ Draft Version \version}}
   place or removed completely.}
 \include{conventions}
 \include{fortran}
+\include{x32}
 
 \appendix
 \include{kernel}
diff --git a/development.tex b/development.tex
index d1388b5..10669e1 100644
--- a/development.tex
+++ b/development.tex
@@ -2,18 +2,24 @@
 \chapter{Development Environment}
 
 During compilation of C or C++ code at least the symbols in
-table \ref{prepro_defines} are defined by the pre-processor.
+table \ref{prepro_defines} are defined by the pre-processor
+\footnote{\code{_LP64} and \code{__LP64__} were added to GCC 3.3 in
+March, 2003.}.
 
 \begin{table}[H]
 \Hrule
 \caption{Predefined Pre-Processor Symbols}
 \label{prepro_defines}
-  \begin{center}\code{
-    \begin{tabular}[t]{l}
-      __amd64\\
-      __amd64__\\
-      __x86_64\\
-      __x86_64__\\
+  \begin{center}\small\code{
+    \begin{tabular}[t]{ll}
+      __amd64      & Defined for both LP64 and X32 programming models.\\
+      __amd64__    & Defined for both LP64 and X32 programming models.\\
+      __x86_64     & Defined for both LP64 and X32 programming models.\\
+      __x86_64__   & Defined for both LP64 and X32 programming models.\\
+      _LP64        & Defined for LP64 programming model.\\
+      __LP64__     & Defined for LP64 programming model.\\
+      _ILP32       & Defined for X32 programming model.\\
+      __ILP32__    & Defined for X32 programming model.\\
     \end{tabular}
   }\end{center}
 \Hrule
diff --git a/dl.tex b/dl.tex
index a67f4f8..8fecec7 100644
--- a/dl.tex
+++ b/dl.tex
@@ -355,17 +355,24 @@ use.
 
 \subsection{Program Interpreter}
 
-There is one valid \textindex{program interpreter} for
-programs conforming to the \xARCH ABI:
-
-\bigskip
-\path{/lib/ld64.so.1}
-
-However, Linux puts this in
-
-\bigskip
-\path{/lib64/ld-linux-x86-64.so.2}
+The valid \textindex{program interpreter} for programs conforming to the
+\xARCH ABI is listed in Figure \ref{interp}, which also contains the
+\textindex{program interpreter} used by Linux.
 
+\begin{figure}
+  \caption{\xARCH Program Interpreter}
+  \label{interp}
+  \begin{center}
+    \begin{tabular}[t]{l|l|l}
+      \multicolumn{1}{c}{Data Model} & \multicolumn{1}{c}{Path} &
+      \multicolumn{1}{c}{Linux Path} \\
+      \hline
+      LP64 & \path{/lib/ld64.so.1} & \path{/lib64/ld-linux-x86-64.so.2} \\
+      \hline
+      X32 & \path{/lib/ldx32.so.1} & \path{/libx32/ld-linux-x32.so.2} \\
+    \end{tabular}
+  \end{center}
+\end{figure}
 
 \subsection{Initialization and Termination Functions}
 
diff --git a/introduction.tex b/introduction.tex
index 2148ab9..aa89e87 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -1,4 +1,4 @@
-\chapter{Introduction}
+\chapter{Introduction\label{intro}}
 
 The AMD64\footnote{AMD64 has been previously called x86-64.  The
   latter name is used in a number of places out of historical reasons
@@ -15,6 +15,13 @@ compatibility modes.  The \xARCH ABI does not apply to such programs;
 this document applies only programs running in the ``long'' mode
 provided by the \xARCH architecture.
 
+Binaries using the \xARCH instruction set may program to either a 32-bit
+model, in which the C data types \code{int}, \code{long} and all
+pointer types are 32-bit objects (X32); or to a 64-bit model,
+in which the C \code{int} type is 32-bits but the C \code{long} type
+and all pointer types are 64-bit objects (LP64). This specification
+covers both LP64 and X32 programming models.
+
 Except where otherwise noted, the \xARCH architecture ABI follows the
 conventions described in the \intelabi.  Rather than replicate the
 entire contents of the \intelabi, the \xARCH ABI indicates only those
diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index b030e42..15b5a5d 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -32,7 +32,7 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
   \caption{Scalar Types}\label{basic-types}
 { % Use small here - the table is still too large
   % Has anybody an idea how to shrink the table so that it fits the page?
-  \small
+  \myfontsize
   \begin{tabular}{l|l|c|c|l}
     \hline\noalign{\smallskip}
      & &  & \multicolumn{1}{c|}{Alignment} & \multicolumn{1}{c|}{\xARCH} \\
@@ -58,12 +58,19 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned int} & 4 & 4 & unsigned \fourbyte \\
     \cline{2-5}
-    & \texttt{long} & 8 & 8 & signed \eightbyte \\
-    & \texttt{signed long} & & \\
-    & \texttt{long long} & & \\
+    & \texttt{long (LP64)} & 8 & 8 & signed \eightbyte \\
+    & \texttt{signed long (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    \cline{2-5}
+    & \texttt{long (X32)} & 4 & 4 & signed \fourbyte \\
+    & \texttt{signed long (X32)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (X32)} & 4 & 4 & unsigned \fourbyte \\
+    \cline{2-5}
+    & \texttt{long long} & 8 & 8 & signed \eightbyte \\
     & \texttt{signed long long} & & \\
     \cline{2-5}
-    & \texttt{unsigned long} & 8 & 8 & unsigned \eightbyte \\
     & \texttt{unsigned long long} & 8 & 8 & unsigned \eightbyte \\
     \cline{2-5}
     & \texttt{__int128}$^{\dagger\dagger}$ & 16 & 16 & signed \sixteenbyte \\
@@ -71,8 +78,12 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned __int128}$^{\dagger\dagger}$ & 16 & 16 & unsigned \sixteenbyte \\
     \hline
-    Pointer & \texttt{\textit{any-type} *} & 8 & 8 & unsigned \eightbyte \\
-    & \texttt{\textit{any-type} (*)()} & & \\
+    Pointer
+    & \texttt{\textit{any-type} * (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    & \texttt{\textit{any-type} (*)() (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{\textit{any-type} * (X32)} & 4 & 4 & unsigned \fourbyte \\
+    & \texttt{\textit{any-type} (*)() (X32)} & & \\
     \hline
     Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\
     \cline{2-5}
@@ -188,9 +199,16 @@ integral values of a specified size.
       \texttt{int} & 1 to 32 & 0 to $2^{w}-1$ \\
       \texttt{unsigned int} & & 0 to $2^{w}-1$ \\
       \hline
-      \texttt{signed long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
-      \texttt{long} & 1 to 64 & 0 to $2^{w}-1$ \\
-      \texttt{unsigned long} & & 0 to $2^{w}-1$ \\
+      \texttt{signed long (LP64)} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long (LP64)} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (LP64)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{long (X32)} & 1 to 32 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (X32)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{signed long long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long long} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long long} & & 0 to $2^{w}-1$ \\
     \end{tabular}
   \end{center}
 \Hrule
@@ -1102,7 +1120,7 @@ operations such as calling functions, accessing static objects, and
 transferring control from one part of a program to another.  Unlike
 previous material, this material is not normative.
 
-\subsection{Architectural Constraints}
+\subsection{Architectural Constraints\label{models}}
 
 The \xARCH architecture usually does not allow an instruction to encode
 arbitrary
@@ -1233,6 +1251,9 @@ that are of general interest:
 
 \end{description}
 
+Only the small code model and the small position independent code model
+(\textindex{PIC}) are used in X32 binaries.
+
 \subsection{Conventions}
 
 In this document some special assembler symbols are used in the coding
diff --git a/macros.tex b/macros.tex
index 0d20eac..9c4f915 100644
--- a/macros.tex
+++ b/macros.tex
@@ -107,6 +107,8 @@
 
 \newcommand*{\cbnew}{\marginpar{\textsf{New}}}
 
+\newcommand{\myfontsize}{\fontsize{9}{11}\selectfont}
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "abi"
diff --git a/object-files.tex b/object-files.tex
index 4705e96..d36f4c2 100644
--- a/object-files.tex
+++ b/object-files.tex
@@ -5,22 +5,28 @@
 
 \subsection{Machine Information}
 
-For file identification in \texttt{e_ident}, the \xARCH architecture
-requires the following values.
+\subsubsection{Programming Model}
 
-\begin{table}[H]
-\Hrule
-  \caption{\xARCH Identification}
-  \begin{center}
-    \begin{tabular}[t]{l|l}
-      \multicolumn{1}{c}{Position} & \multicolumn{1}{c}{Value} \\
-      \hline
-      \texttt{e_ident[EI_CLASS]} & \texttt{ELFCLASS64} \\
-      \texttt{e_ident[EI_DATA]} & \texttt{ELFDATA2LSB}
-    \end{tabular}
-  \end{center}
-\Hrule
-\end{table}
+As described in Section \ref{intro}, binaries using the \xARCH instruction
+set may program to either a 32-bit model, in which the C data
+types \code{int}, \code{long} and all pointer types are 32-bit objects
+(X32); or to a 64-bit model, in which the C \code{int} type is 32-bits
+but the C \code{long} type and all pointer types are 64-bit objects (LP64).
+This specification describes both binaries that use the X32 and the LP64
+model.
+
+\subsubsection{File Class}
+
+For \xARCH X32 objects, the file class value in \texttt{e_ident[EI_CLASS]}
+must be \texttt{ELFCLASS32}. For \xARCH LP64 objects, the file class value
+must be \texttt{ELFCLASS64}.
+
+\subsubsection{Data Encoding}
+
+For the data encoding in \texttt{e_ident[EI_DATA]}, \xARCH objects use
+\texttt{ELFDATA2LSB}.
+
+\subsubsection{Processor identification}
 
 Processor identification resides in the ELF headers
 \texttt{e_machine} member and must have the value
@@ -397,6 +403,8 @@ Figure \ref{reloc_fields} shows the allowed relocatable fields.
                   with arbitrary byte alignment.  These values use
                   the same byte order as other word values in the
                   \xARCH architecture. \\
+\textit{wordclass} & This specifies \textit{word64} for LP64 and
+		     specifies \textit{word32} for X32. \\ 
 \end{tabular*}
 
 The following notations are used for specifying relocations in table
@@ -421,13 +429,19 @@ The following notations are used for specifying relocations in table
   relocation entry.
 \end{description}
 
-The \xARCH ABI architectures uses only \texttt{Elf64_Rela} relocation
+The \xARCH LP64 ABI architecture uses only \texttt{Elf64_Rela} relocation
 entries with explicit addends.  The \code{r_addend} member serves as
 the relocation addend.
 
+The \xARCH X32 ABI architecture uses only \texttt{Elf32_Rela} relocation
+entries in relocatable files.  Relocations contained within executable
+files or shared objects may use either \texttt{Elf32_Rela} relocation
+or \texttt{Elf32_Rel} relocation.
+
 \begin{table}[H]
 \Hrule
   \caption{Relocation Types}
+  \small
   \label{tab-relocations}
   \begin{center}
     \begin{tabular}[t]{l|r|l|l}
@@ -442,9 +456,9 @@ the relocation addend.
       \texttt{R_X86_64_GOT32} & 3 & \textit{word32} & \texttt{G + A} \\
       \texttt{R_X86_64_PLT32} & 4 & \textit{word32} & \texttt{L + A - P} \\
       \texttt{R_X86_64_COPY}  & 5 & none            & none \\
-      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_RELATIVE} & 8 & \textit{word64} & \texttt{B + A} \\
+      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_RELATIVE} & 8 & \textit{wordclass} & \texttt{B + A} \\
       \texttt{R_X86_64_GOTPCREL} & 9 & \textit{word32} & \texttt{G + GOT + A - P} \\
       \texttt{R_X86_64_32}    & 10 & \textit{word32} & \texttt{S + A} \\
       \texttt{R_X86_64_32S}   & 11 & \textit{word32} & \texttt{S + A} \\
@@ -460,17 +474,22 @@ the relocation addend.
       \texttt{R_X86_64_DTPOFF32}   & 21 & \textit{word32} &  \\
       \texttt{R_X86_64_GOTTPOFF}   & 22 & \textit{word32} &  \\
       \texttt{R_X86_64_TPOFF32}   & 23 & \textit{word32} &  \\
-      \texttt{R_X86_64_PC64}  & 24 & \textit{word64} & \texttt{S + A - P} \\
-      \texttt{R_X86_64_GOTOFF64} & 25 & \textit{word64} & \texttt{S + A - GOT} \\
+      \texttt{R_X86_64_PC64} $^\dagger$ & 24 & \textit{word64} & \texttt{S + A - P} \\
+      \texttt{R_X86_64_GOTOFF64} $^\dagger$ & 25 & \textit{word64} & \texttt{S + A - GOT} \\
       \texttt{R_X86_64_GOTPC32} & 26 & \textit{word32} & \texttt{GOT + A - P} \\
       \texttt{R_X86_64_SIZE32} & 32 & \textit{word32} & \texttt{Z + A} \\
-      \texttt{R_X86_64_SIZE64} & 33 & \textit{word64} & \texttt{Z + A} \\
+      \texttt{R_X86_64_SIZE64} $^\dagger$ & 33 & \textit{word64} & \texttt{Z + A} \\
       \texttt{R_X86_64_GOTPC32_TLSDESC} & 34 & \textit{word32} &  \\
       \texttt{R_X86_64_TLSDESC_CALL} & 35 & none &  \\
       \texttt{R_X86_64_TLSDESC} & 36 & \textit{word64}$\times 2$ & \\
-      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{word64} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{wordclass} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_RELATIVE64} $^{\dagger\dagger}$ & 38 & \textit{word64} & \texttt{B + A} \\
 %      \texttt{R_X86_64_GOT64} & 16 & \textit{word64} & \texttt{G + A} \\
 %      \texttt{R_X86_64_PLT64} & 17 & \textit{word64} & \texttt{L + A - P} \\
+     \cline{1-4}
+    \multicolumn{3}{l}{\small $^\dagger$ This relocation is used only for LP64.}\\
+    \multicolumn{3}{l}{\small $^{\dagger\dagger}$ This relocation only
+    appears in X32 executable files or shared objects.}\\
     \end{tabular}
   \end{center}
 \Hrule
diff --git a/x32.tex b/x32.tex
new file mode 100644
index 0000000..d419e52
--- /dev/null
+++ b/x32.tex
@@ -0,0 +1,442 @@
+\chapter{X32 Programming Model\label{x32}}
+
+\section{Parameter Passing}
+When a value of pointer type is returned or passed in a register, bits 32
+to 63 shall be zero.
+
+\section{Address Space}
+
+\xARCH X32 binaries reside in the lower 32 bits of the 64-bit virtual
+address space and all addresses are 32 bits in size.  They should conform
+to the \textindex{small code model} or the
+\textindex{small position independent code model} (\textindex{PIC})
+described in Section \ref{models}.
+
+\section{Thread-Local Storage Support}
+
+X32 Thread-Local Storage (TLS) support is based on LP64 TLS
+implementation with some modifications.
+
+\subsection{Global Thread-Local Variable}
+
+For a global thread-local variable x:
+
+\begin{verbatim}
+extern __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{General Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{General Dynamic Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x00 & .byte & 0x66			& 0x00 & leaq  & x@tlsgd(\%rip),\%rdi \\
+0x01 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x07 & .word & 0x6666 \\
+0x08 & .word & 0x6666			& 0x09 & rex64 & \\
+0x0a & rex64 &				& 0x0a & call  & \_\_tls\_get\_addr@plt \\
+0x0b & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & addq & x@gottpoff(\%rip),\%rax	& 0x08 & addl & x@gottpoff(\%rip),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model, II}]
+  Load value of \code{x} into \reg{edi}.  The \code{\%fs:(\%eax)} memory
+  operand can't be used for X32 since its effective address is the base
+  address of \code{\%fs} plus the value of \reg{eax} zero-extended to 64
+  bits, which is incorrect when \reg{eax} holds a negative value.
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & x@gottpoff(\%rip),\%rax	& 0x01 & movq & x@gottpoff(\%rip),\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{Static Thread-Local Variable}
+
+For a static thread-local variable x:
+
+\begin{verbatim}
+static __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{Local Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & leaq  & x@dtpoff(\%rax),\%rax	& 0x0c & leal & x@dtpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & addq  & \$x@dtpoff,\%rax		& 0x0c & addl & \$x@dtpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@dtpoff(\%rax),\%edi	& 0x08 & movl & x@dtpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & leaq & x@tpoff(\%rax),\%rax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & addq & \$x@tpoff,\%rax		& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x01 & movq & \%fs:0,\%rax		& 0x01 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@tpoff(\%rax),\%edi	& 0x08 & movl & x@tpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, III}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, III}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{X32} \\
+\hline
+0x00 & movl & \%fs:x@tpoff,\%edi	& 0x00 & movl & \%fs:x@tpoff,\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{TLS Linker Optimization}
+
+\begin{description}
+\item[\textindex{General Dynamic To Initial Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> IE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{IE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & addq  & x@gottpoff(\%rip),\%rax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> LE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & leal  & x@tpoff(\%rax),\%eax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movl & \%fs:0,\%eax		& 0x01 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movl & \%fs:0,\%eax		& 0x01 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x01 & movq & x@gottpoff(\%rip),\%rax	& 0x01 & movq & \$x@tpoff,\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic to Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & leal  & x@dtpoff(\%rax),\%eax	& 0x0c & leal & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & addl  & \$x@dtpoff,\%eax		& 0x0c & addl & \$x@tpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & movl  & x@dtpoff(\%rax),\%eax	& 0x0c & movl & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\section{Kernel Support}
+The kernel should limit the stack and addresses returned from system
+calls to the range $0x00000000$ to $0xffffffff$.
+
+\section{Coding Examples}
+
+Although X32 binaries run in the 64-bit mode, not all 64-bit instructions
+are supported. This section discusses example code sequences for
+fundamental operations which are different from the 64-bit mode.
+
+\subsection{Indirect Branch}
+
+Since an indirect branch via memory loads a 64-bit address from the
+memory location, it is not supported in X32.  An indirect branch via
+register should be used instead.  The 32-bit address is loaded from memory
+into the lower 32 bits of a register, which automatically zeroes the
+upper 32 bits of the register.  Then the indirect branch can be
+performed via the 64-bit register.
+
+\begin{table}[H]
+\Hrule
+\caption{Indirect Branch}
+\begin{center}
+\code{
+\begin{tabular}{ll|ll}
+\multicolumn{2}{c}{LP64} & \multicolumn{2}{c}{X32} \\
+\hline
+call & *\%rax          & call & *\%rax \\
+\hline
+call & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & call & *\%rax \\
+\hline
+call & *func\_p        & movl & func\_p, \%eax \\
+     &                 & call & *\%rax \\
+\hline
+jmp  & *\%rax          & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p        & movl & func\_p, \%eax \\
+     &                 & jmp  & *\%rax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
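
The patch notes that a %fs:(%eax) operand is unsuitable for X32 because
the 32-bit register value is zero-extended before it is added to the %fs
base, which goes wrong for a negative offset.  The arithmetic can be
sketched in C as below.  This only models the address computation as the
patch describes it; the function names, the base address and the offset
are hypothetical values chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address when the 32-bit register value is zero-extended
   to 64 bits before being added to the segment base. */
uint64_t ea_zero_extended(uint64_t fs_base, int32_t offset)
{
    return fs_base + (uint64_t)(uint32_t)offset;
}

/* Effective address when the offset is sign-extended to 64 bits, as
   happens when a full 64-bit register holds the offset. */
uint64_t ea_sign_extended(uint64_t fs_base, int32_t offset)
{
    return fs_base + (uint64_t)(int64_t)offset;
}

/* Worked example with a made-up base and a negative offset. */
void check_negative_offset_example(void)
{
    const uint64_t fs_base = 0xf7000000u;  /* hypothetical %fs base */
    const int32_t offset = -8;             /* hypothetical offset of x */

    /* Sign extension lands 8 bytes below the base, as intended. */
    assert(ea_sign_extended(fs_base, offset) == 0xf6fffff8u);
    /* Zero extension overshoots past the 4 GiB boundary. */
    assert(ea_zero_extended(fs_base, offset) == 0x1f6fffff8ull);
}
```

For non-negative offsets the two computations agree, which is why the
problem only shows up with a negative value in the register.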


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-05-14 17:31 [x86-64 psABI] RFC: Extend x86-64 psABI to support x32 H.J. Lu
@ 2012-05-14 17:34 ` H. Peter Anvin
  2012-05-14 17:44   ` H.J. Lu
       [not found] ` <ccd4a6ab-f279-477f-b48b-94b8f4afd37d@googlegroups.com>
  1 sibling, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2012-05-14 17:34 UTC (permalink / raw)
  To: x32-abi; +Cc: H.J. Lu, discuss, GCC Development, Binutils, GNU C Library, GDB

On 05/14/2012 10:31 AM, H.J. Lu wrote:
> Hi,
> 
> Support for the x32 psABI:
> 
> http://sites.google.com/site/x32abi/
> 
> is added in Linux kernel 3.4-rc1.  X32 uses the ILP32 model for x86-64
> instruction set with size of long and pointers == 4 bytes.  X32 is
> already supported in GCC 4.7.0 and binutils 2.22.  I am now working
> to integrate x32 support into GLIBC 2.16 and GDB 7.5   Here is a
> patch to extend x86-64 psABI for x32.  Any comments?
> 

As a minor nitpick, I have always used x32 with a lower case x.  The
capital X32 looks odd to me.

	-hpa


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-05-14 17:34 ` H. Peter Anvin
@ 2012-05-14 17:44   ` H.J. Lu
  2012-05-15 16:08     ` [discuss] " Michael Matz
  0 siblings, 1 reply; 13+ messages in thread
From: H.J. Lu @ 2012-05-14 17:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x32-abi, discuss, GCC Development, Binutils, GNU C Library, GDB

On Mon, May 14, 2012 at 10:34 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/14/2012 10:31 AM, H.J. Lu wrote:
>> Hi,
>>
>> Support for the x32 psABI:
>>
>> http://sites.google.com/site/x32abi/
>>
>> is added in Linux kernel 3.4-rc1.  X32 uses the ILP32 model for x86-64
>> instruction set with size of long and pointers == 4 bytes.  X32 is
>> already supported in GCC 4.7.0 and binutils 2.22.  I am now working
>> to integrate x32 support into GLIBC 2.16 and GDB 7.5   Here is a
>> patch to extend x86-64 psABI for x32.  Any comments?
>>
>
> As a minor nitpick, I have always used x32 with a lower case x.  The
> capital X32 looks odd to me.
>

I used X32 together with LP64.  I can use ILP32 instead of X32 when
LP64 is mentioned at the same time.

-- 
H.J.


* Re: [discuss] [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-05-14 17:44   ` H.J. Lu
@ 2012-05-15 16:08     ` Michael Matz
  2012-05-15 16:18       ` H.J. Lu
  2012-05-17 19:50       ` H.J. Lu
  0 siblings, 2 replies; 13+ messages in thread
From: Michael Matz @ 2012-05-15 16:08 UTC (permalink / raw)
  To: H.J. Lu
  Cc: H. Peter Anvin, discuss, GNU C Library, GCC Development, GDB,
	x32-abi, Binutils

[-- Attachment #1: Type: TEXT/PLAIN, Size: 444 bytes --]

Hi,

On Mon, 14 May 2012, H.J. Lu wrote:

> > As a minor nitpick, I have always used x32 with a lower case x.  The 
> > capital X32 looks odd to me.
> >
> 
> I used X32 together with LP64.  I can use ILP32 instead of X32 when LP64 
> is mentioned at the same time.

I'd prefer that.  x32 is a nice short-hand name for the whole thing, but 
not descriptive, unlike LP64.  So, yes, IMO it should be ILP32 in the ABI 
document.


Ciao,
Michael.


* Re: [discuss] [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-05-15 16:08     ` [discuss] " Michael Matz
@ 2012-05-15 16:18       ` H.J. Lu
  2012-05-17 19:50       ` H.J. Lu
  1 sibling, 0 replies; 13+ messages in thread
From: H.J. Lu @ 2012-05-15 16:18 UTC (permalink / raw)
  To: Michael Matz
  Cc: H. Peter Anvin, discuss, GNU C Library, GCC Development, GDB,
	x32-abi, Binutils

On Tue, May 15, 2012 at 9:07 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 14 May 2012, H.J. Lu wrote:
>
>> > As a minor nitpick, I have always used x32 with a lower case x.  The
>> > capital X32 looks odd to me.
>> >
>>
>> I used X32 together with LP64.  I can use ILP32 instead of X32 when LP64
>> is mentioned at the same time.
>
> I'd prefer that.  x32 is a nice short-hand name for the whole thing, but
> not descriptive, unlike LP64.  So, yes, IMO it should be ILP32 in the ABI
> document.
>

I will make the change and post a new patch.

Thanks.

-- 
H.J.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [discuss] [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-05-15 16:08     ` [discuss] " Michael Matz
  2012-05-15 16:18       ` H.J. Lu
@ 2012-05-17 19:50       ` H.J. Lu
  1 sibling, 0 replies; 13+ messages in thread
From: H.J. Lu @ 2012-05-17 19:50 UTC (permalink / raw)
  To: Michael Matz
  Cc: H. Peter Anvin, discuss, GNU C Library, GCC Development, GDB,
	x32-abi, Binutils

[-- Attachment #1: Type: text/plain, Size: 599 bytes --]

On Tue, May 15, 2012 at 9:07 AM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 14 May 2012, H.J. Lu wrote:
>
>> > As a minor nitpick, I have always used x32 with a lower case x.  The
>> > capital X32 looks odd to me.
>> >
>>
>> I used X32 together with LP64.  I can use ILP32 instead of X32 when LP64
>> is mentioned at the same time.
>
> I'd prefer that.  x32 is a nice short-hand name for the whole thing, but
> not descriptive, unlike LP64.  So, yes, IMO it should be ILP32 in the ABI
> document.
>

Here is the updated change.  Any comments?

Thanks.



-- 
H.J.

[-- Attachment #2: psabi-x32-2.patch --]
[-- Type: application/octet-stream, Size: 27350 bytes --]

2012-05-17  H.J. Lu  <hongjiu.lu@intel.com>

	* abi.tex (title): Mention LP64/ILP32. 
	(author): Add H.J. Lu and Milind Girkar.
	Include x32.tex.

	* development.tex: Add _ILP32 and __ILP32__ for ILP32.  Also
	document _LP64 and __LP64__.

	* dl.tex: List ILP32 program interpreter.

	* introduction.tex (Introduction): Add a label.
	Describe ILP32 and LP64.

	* low-level-sys-info.tex (Scalar Types table): Add ILP32/LP64 to
	long and long long.  Modify long and pointer types for ILP32 and
	LP64.  Use \myfontsize instead of \small.
	(Architectural Constraints): Add a label.  Mention small model
	for ILP32.

	* macros.tex (myfontsize): New.

	* object-files.tex (Programming Model): New subsubsection.
	(File Class): Likewise.
	(Data Encoding): Likewise.
	(Processor identification): Likewise.
	(Relocation Types): Add wordclass.  Allow Elf32_Rel relocations
	within ILP32 executable files or shared objects.
	(Relocation Types): Use small font.  Mark R_X86_64_GLOB_DAT,
	R_X86_64_JUMP_SLOT, R_X86_64_RELATIVE and R_X86_64_IRELATIVE
	with wordclass.  Mark R_X86_64_PC64, R_X86_64_GOTOFF64 and
	R_X86_64_SIZE64 used only for LP64.  Add R_X86_64_RELATIVE64 for
	ILP32.

	* x32.tex: New file.

diff --git a/abi.tex b/abi.tex
index 2b56d94..a301b5d 100644
--- a/abi.tex
+++ b/abi.tex
@@ -5,13 +5,18 @@
 \begin{document}
 
 \author{Edited by\\
+  Milind Girkar\thanks{milind.girkar@intel.com},
+  Jan Hubi\v{c}ka\thanks{jh@suse.cz},\\
+  Andreas Jaeger\thanks{aj@suse.de},
+  H.J. Lu\thanks{hongjiu.lu@intel.com},
   Michael Matz\thanks{matz@suse.de},
-  Jan Hubi\v{c}ka\thanks{jh@suse.cz}, Andreas Jaeger\thanks{aj@suse.de},
-  Mark Mitchell\thanks{mark@codesourcery.com}}
+  Mark Mitchell\thanks{mark@codesourcery.com}
+  }
 
 \title{System V Application Binary Interface\\
-{\Large AMD64 Architecture Processor Supplement\\
-Draft Version \version}}
+{\Large AMD64 Architecture Processor Supplement}\\
+{\large (With LP64 and ILP32 Programming Models)}\\
+{\Large Draft Version \version}}
 \maketitle
 \tableofcontents
 \listoftables
@@ -99,6 +104,7 @@ Draft Version \version}}
   place or removed completely.}
 \include{conventions}
 \include{fortran}
+\include{x32}
 
 \appendix
 \include{kernel}
diff --git a/development.tex b/development.tex
index d1388b5..e9a2e47 100644
--- a/development.tex
+++ b/development.tex
@@ -2,18 +2,24 @@
 \chapter{Development Environment}
 
 During compilation of C or C++ code at least the symbols in
-table \ref{prepro_defines} are defined by the pre-processor.
+table \ref{prepro_defines} are defined by the pre-processor
+\footnote{\code{_LP64} and \code{__LP64__} were added to GCC 3.3 in
+March, 2003.}.
 
 \begin{table}[H]
 \Hrule
 \caption{Predefined Pre-Processor Symbols}
 \label{prepro_defines}
-  \begin{center}\code{
-    \begin{tabular}[t]{l}
-      __amd64\\
-      __amd64__\\
-      __x86_64\\
-      __x86_64__\\
+  \begin{center}\small\code{
+    \begin{tabular}[t]{ll}
+      __amd64      & Defined for both LP64 and ILP32 programming models.\\
+      __amd64__    & Defined for both LP64 and ILP32 programming models.\\
+      __x86_64     & Defined for both LP64 and ILP32 programming models.\\
+      __x86_64__   & Defined for both LP64 and ILP32 programming models.\\
+      _LP64        & Defined for LP64 programming model.\\
+      __LP64__     & Defined for LP64 programming model.\\
+      _ILP32       & Defined for ILP32 programming model.\\
+      __ILP32__    & Defined for ILP32 programming model.\\
     \end{tabular}
   }\end{center}
 \Hrule
diff --git a/dl.tex b/dl.tex
index a67f4f8..68c955f 100644
--- a/dl.tex
+++ b/dl.tex
@@ -355,17 +355,24 @@ use.
 
 \subsection{Program Interpreter}
 
-There is one valid \textindex{program interpreter} for
-programs conforming to the \xARCH ABI:
-
-\bigskip
-\path{/lib/ld64.so.1}
-
-However, Linux puts this in
-
-\bigskip
-\path{/lib64/ld-linux-x86-64.so.2}
+The valid \textindex{program interpreter} for programs conforming to the
+\xARCH ABI is listed in Figure \ref{interp}, which also contains the
+\textindex{program interpreter} used by Linux.
 
+\begin{figure}
+  \caption{\xARCH Program Interpreter}
+  \label{interp}
+  \begin{center}
+    \begin{tabular}[t]{l|l|l}
+      \multicolumn{1}{c}{Data Model} & \multicolumn{1}{c}{Path} &
+      \multicolumn{1}{c}{Linux Path} \\
+      \hline
+      LP64 & \path{/lib/ld64.so.1} & \path{/lib64/ld-linux-x86-64.so.2} \\
+      \hline
+      ILP32 & \path{/lib/ldx32.so.1} & \path{/libx32/ld-linux-x32.so.2} \\
+    \end{tabular}
+  \end{center}
+\end{figure}
 
 \subsection{Initialization and Termination Functions}
 
diff --git a/introduction.tex b/introduction.tex
index 2148ab9..8a547da 100644
--- a/introduction.tex
+++ b/introduction.tex
@@ -1,4 +1,4 @@
-\chapter{Introduction}
+\chapter{Introduction\label{intro}}
 
 The AMD64\footnote{AMD64 has been previously called x86-64.  The
   latter name is used in a number of places out of historical reasons
@@ -15,6 +15,13 @@ compatibility modes.  The \xARCH ABI does not apply to such programs;
 this document applies only programs running in the ``long'' mode
 provided by the \xARCH architecture.
 
+Binaries using the \xARCH instruction set may program to either a 32-bit
+model, in which the C data types \code{int}, \code{long} and all
+pointer types are 32-bit objects (ILP32); or to a 64-bit model,
+in which the C \code{int} type is 32-bits but the C \code{long} type
+and all pointer types are 64-bit objects (LP64). This specification
+covers both LP64 and ILP32 programming models.
+
 Except where otherwise noted, the \xARCH architecture ABI follows the
 conventions described in the \intelabi.  Rather than replicate the
 entire contents of the \intelabi, the \xARCH ABI indicates only those
diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index b030e42..c125a5f 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -32,7 +32,7 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
   \caption{Scalar Types}\label{basic-types}
 { % Use small here - the table is still too large
   % Has anybody an idea how to shrink the table so that it fits the page?
-  \small
+  \myfontsize
   \begin{tabular}{l|l|c|c|l}
     \hline\noalign{\smallskip}
      & &  & \multicolumn{1}{c|}{Alignment} & \multicolumn{1}{c|}{\xARCH} \\
@@ -58,12 +58,19 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned int} & 4 & 4 & unsigned \fourbyte \\
     \cline{2-5}
-    & \texttt{long} & 8 & 8 & signed \eightbyte \\
-    & \texttt{signed long} & & \\
-    & \texttt{long long} & & \\
+    & \texttt{long (LP64)} & 8 & 8 & signed \eightbyte \\
+    & \texttt{signed long (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    \cline{2-5}
+    & \texttt{long (ILP32)} & 4 & 4 & signed \fourbyte \\
+    & \texttt{signed long (ILP32)} & & \\
+    \cline{2-5}
+    & \texttt{unsigned long (ILP32)} & 4 & 4 & unsigned \fourbyte \\
+    \cline{2-5}
+    & \texttt{long long} & 8 & 8 & signed \eightbyte \\
     & \texttt{signed long long} & & \\
     \cline{2-5}
-    & \texttt{unsigned long} & 8 & 8 & unsigned \eightbyte \\
     & \texttt{unsigned long long} & 8 & 8 & unsigned \eightbyte \\
     \cline{2-5}
     & \texttt{__int128}$^{\dagger\dagger}$ & 16 & 16 & signed \sixteenbyte \\
@@ -71,8 +78,12 @@ scalar types and the processor's.  \code{__int128}, \code{__float128},
     \cline{2-5}
     & \texttt{unsigned __int128}$^{\dagger\dagger}$ & 16 & 16 & unsigned \sixteenbyte \\
     \hline
-    Pointer & \texttt{\textit{any-type} *} & 8 & 8 & unsigned \eightbyte \\
-    & \texttt{\textit{any-type} (*)()} & & \\
+    Pointer
+    & \texttt{\textit{any-type} * (LP64)} & 8 & 8 & unsigned \eightbyte \\
+    & \texttt{\textit{any-type} (*)() (LP64)} & & \\
+    \cline{2-5}
+    & \texttt{\textit{any-type} * (ILP32)} & 4 & 4 & unsigned \fourbyte \\
+    & \texttt{\textit{any-type} (*)() (ILP32)} & & \\
     \hline
     Floating-& \texttt{float} & 4 & 4 & single (IEEE-754) \\
     \cline{2-5}
@@ -188,9 +199,16 @@ integral values of a specified size.
       \texttt{int} & 1 to 32 & 0 to $2^{w}-1$ \\
       \texttt{unsigned int} & & 0 to $2^{w}-1$ \\
       \hline
-      \texttt{signed long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
-      \texttt{long} & 1 to 64 & 0 to $2^{w}-1$ \\
-      \texttt{unsigned long} & & 0 to $2^{w}-1$ \\
+      \texttt{signed long (LP64)} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long (LP64)} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (LP64)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{long (ILP32)} & 1 to 32 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long (ILP32)} & & 0 to $2^{w}-1$ \\
+      \hline
+      \texttt{signed long long} & & $-2^{w - 1}$ to $2^{w-1}-1$ \\
+      \texttt{long long} & 1 to 64 & 0 to $2^{w}-1$ \\
+      \texttt{unsigned long long} & & 0 to $2^{w}-1$ \\
     \end{tabular}
   \end{center}
 \Hrule
@@ -1102,7 +1120,7 @@ operations such as calling functions, accessing static objects, and
 transferring control from one part of a program to another.  Unlike
 previous material, this material is not normative.
 
-\subsection{Architectural Constraints}
+\subsection{Architectural Constraints\label{models}}
 
 The \xARCH architecture usually does not allow an instruction to encode
 arbitrary
@@ -1233,6 +1251,9 @@ that are of general interest:
 
 \end{description}
 
+Only the small code model and the small position independent code model
+(\textindex{PIC}) are used in ILP32 binaries.
+
 \subsection{Conventions}
 
 In this document some special assembler symbols are used in the coding
diff --git a/macros.tex b/macros.tex
index 0d20eac..9c4f915 100644
--- a/macros.tex
+++ b/macros.tex
@@ -107,6 +107,8 @@
 
 \newcommand*{\cbnew}{\marginpar{\textsf{New}}}
 
+\newcommand{\myfontsize}{\fontsize{9}{11}\selectfont}
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "abi"
diff --git a/object-files.tex b/object-files.tex
index 4705e96..eb1d544 100644
--- a/object-files.tex
+++ b/object-files.tex
@@ -5,22 +5,28 @@
 
 \subsection{Machine Information}
 
-For file identification in \texttt{e_ident}, the \xARCH architecture
-requires the following values.
+\subsubsection{Programming Model}
 
-\begin{table}[H]
-\Hrule
-  \caption{\xARCH Identification}
-  \begin{center}
-    \begin{tabular}[t]{l|l}
-      \multicolumn{1}{c}{Position} & \multicolumn{1}{c}{Value} \\
-      \hline
-      \texttt{e_ident[EI_CLASS]} & \texttt{ELFCLASS64} \\
-      \texttt{e_ident[EI_DATA]} & \texttt{ELFDATA2LSB}
-    \end{tabular}
-  \end{center}
-\Hrule
-\end{table}
+As described in Section \ref{intro}, binaries using the \xARCH instruction
+set may program to either a 32-bit model, in which the C data
+types \code{int}, \code{long} and all pointer types are 32-bit objects
+(ILP32); or to a 64-bit model, in which the C \code{int} type is 32 bits
+but the C \code{long} type and all pointer types are 64-bit objects (LP64).
+This specification describes binaries that use either the ILP32 or the LP64
+model.
+
+\subsubsection{File Class}
+
+For \xARCH ILP32 objects, the file class value in \texttt{e_ident[EI_CLASS]}
+must be \texttt{ELFCLASS32}. For \xARCH LP64 objects, the file class value
+must be \texttt{ELFCLASS64}.
+
+\subsubsection{Data Encoding}
+
+For the data encoding in \texttt{e_ident[EI_DATA]}, \xARCH objects use
+\texttt{ELFDATA2LSB}.
+
+\subsubsection{Processor identification}
 
 Processor identification resides in the ELF headers
 \texttt{e_machine} member and must have the value
@@ -397,6 +403,8 @@ Figure \ref{reloc_fields} shows the allowed relocatable fields.
                   with arbitrary byte alignment.  These values use
                   the same byte order as other word values in the
                   \xARCH architecture. \\
+\textit{wordclass} & This specifies \textit{word64} for LP64 and
+		     specifies \textit{word32} for ILP32. \\ 
 \end{tabular*}
 
 The following notations are used for specifying relocations in table
@@ -421,13 +429,19 @@ The following notations are used for specifying relocations in table
   relocation entry.
 \end{description}
 
-The \xARCH ABI architectures uses only \texttt{Elf64_Rela} relocation
+The \xARCH LP64 ABI architecture uses only \texttt{Elf64_Rela} relocation
 entries with explicit addends.  The \code{r_addend} member serves as
 the relocation addend.
 
+The \xARCH ILP32 ABI architecture uses only \texttt{Elf32_Rela} relocation
+entries in relocatable files.  Relocations contained within executable
+files or shared objects may use either \texttt{Elf32_Rela} or
+\texttt{Elf32_Rel} relocation entries.
+
 \begin{table}[H]
 \Hrule
   \caption{Relocation Types}
+  \small
   \label{tab-relocations}
   \begin{center}
     \begin{tabular}[t]{l|r|l|l}
@@ -442,9 +456,9 @@ the relocation addend.
       \texttt{R_X86_64_GOT32} & 3 & \textit{word32} & \texttt{G + A} \\
       \texttt{R_X86_64_PLT32} & 4 & \textit{word32} & \texttt{L + A - P} \\
       \texttt{R_X86_64_COPY}  & 5 & none            & none \\
-      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{word64} & \texttt{S} \\
-      \texttt{R_X86_64_RELATIVE} & 8 & \textit{word64} & \texttt{B + A} \\
+      \texttt{R_X86_64_GLOB_DAT} & 6 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_JUMP_SLOT} & 7 & \textit{wordclass} & \texttt{S} \\
+      \texttt{R_X86_64_RELATIVE} & 8 & \textit{wordclass} & \texttt{B + A} \\
       \texttt{R_X86_64_GOTPCREL} & 9 & \textit{word32} & \texttt{G + GOT + A - P} \\
       \texttt{R_X86_64_32}    & 10 & \textit{word32} & \texttt{S + A} \\
       \texttt{R_X86_64_32S}   & 11 & \textit{word32} & \texttt{S + A} \\
@@ -460,17 +474,22 @@ the relocation addend.
       \texttt{R_X86_64_DTPOFF32}   & 21 & \textit{word32} &  \\
       \texttt{R_X86_64_GOTTPOFF}   & 22 & \textit{word32} &  \\
       \texttt{R_X86_64_TPOFF32}   & 23 & \textit{word32} &  \\
-      \texttt{R_X86_64_PC64}  & 24 & \textit{word64} & \texttt{S + A - P} \\
-      \texttt{R_X86_64_GOTOFF64} & 25 & \textit{word64} & \texttt{S + A - GOT} \\
+      \texttt{R_X86_64_PC64} $^\dagger$ & 24 & \textit{word64} & \texttt{S + A - P} \\
+      \texttt{R_X86_64_GOTOFF64} $^\dagger$ & 25 & \textit{word64} & \texttt{S + A - GOT} \\
       \texttt{R_X86_64_GOTPC32} & 26 & \textit{word32} & \texttt{GOT + A - P} \\
       \texttt{R_X86_64_SIZE32} & 32 & \textit{word32} & \texttt{Z + A} \\
-      \texttt{R_X86_64_SIZE64} & 33 & \textit{word64} & \texttt{Z + A} \\
+      \texttt{R_X86_64_SIZE64} $^\dagger$ & 33 & \textit{word64} & \texttt{Z + A} \\
       \texttt{R_X86_64_GOTPC32_TLSDESC} & 34 & \textit{word32} &  \\
       \texttt{R_X86_64_TLSDESC_CALL} & 35 & none &  \\
       \texttt{R_X86_64_TLSDESC} & 36 & \textit{word64}$\times 2$ & \\
-      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{word64} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_IRELATIVE} & 37 & \textit{wordclass} & \texttt{indirect (B + A)}\\
+      \texttt{R_X86_64_RELATIVE64} $^{\dagger\dagger}$ & 38 & \textit{word64} & \texttt{B + A} \\
 %      \texttt{R_X86_64_GOT64} & 16 & \textit{word64} & \texttt{G + A} \\
 %      \texttt{R_X86_64_PLT64} & 17 & \textit{word64} & \texttt{L + A - P} \\
+     \cline{1-4}
+    \multicolumn{3}{l}{\small $^\dagger$ This relocation is used only for LP64.}\\
+    \multicolumn{3}{l}{\small $^{\dagger\dagger}$ This relocation only
+    appears in ILP32 executable files or shared objects.}\\
     \end{tabular}
   \end{center}
 \Hrule
diff --git a/x32.tex b/x32.tex
new file mode 100644
index 0000000..b7a4055
--- /dev/null
+++ b/x32.tex
@@ -0,0 +1,444 @@
+\chapter{ILP32 Programming Model\label{x32}}
+
+``x32'' is commonly used to refer to the \xARCH ILP32 programming model.
+
+\section{Parameter Passing}
+When a value of pointer type is returned or passed in a register, bits 32
+to 63 shall be zero.
+
+\section{Address Space}
+
+ILP32 binaries reside in the lower 32 bits of the 64-bit virtual
+address space and all addresses are 32 bits in size.  They should conform
+to the \textindex{small code model} or the
+\textindex{small position independent code model} (\textindex{PIC})
+described in Section \ref{models}.
+
+\section{Thread-Local Storage Support}
+
+ILP32 Thread-Local Storage (TLS) support is based on the LP64 TLS
+implementation with some modifications.
+
+\subsection{Global Thread-Local Variable}
+
+For a global thread-local variable x:
+
+\begin{verbatim}
+extern __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{General Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{General Dynamic Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & .byte & 0x66			& 0x00 & leaq  & x@tlsgd(\%rip),\%rdi \\
+0x01 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x07 & .word & 0x6666 \\
+0x08 & .word & 0x6666			& 0x09 & rex64 & \\
+0x0a & rex64 &				& 0x0a & call  & \_\_tls\_get\_addr@plt \\
+0x0b & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & \%fs:0,\%rax		& 0x00 & movl & \%fs:0,\%eax \\
+0x09 & addq & x@gottpoff(\%rip),\%rax	& 0x08 & addl & x@gottpoff(\%rip),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec Model, II}]
+  Load value of \code{x} into \reg{edi}.  The \code{\%fs:(\%eax)} memory
+  operand cannot be used for ILP32, since its effective address is the base
+  address of \code{\%fs} plus the value of \reg{eax} zero-extended to 64
+  bits, which is incorrect when \reg{eax} holds a negative value.
+
+\begin{table}[H]
+\Hrule
+\caption{Initial Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & x@gottpoff(\%rip),\%rax	& 0x00 & movq & x@gottpoff(\%rip),\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{Static Thread-Local Variable}
+
+For a static thread-local variable x:
+
+\begin{verbatim}
+static __thread int x;
+\end{verbatim}
+
+\begin{description}
+\item[\textindex{Local Dynamic Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & leaq  & x@dtpoff(\%rax),\%rax	& 0x0c & leal & x@dtpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & leaq & x@tlsld(\%rip),\%rdi\\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x07 & call & \_\_tls\_get\_addr@plt\\
+0x0c & addq  & \$x@dtpoff,\%rax		& 0x0c & addl & \$x@dtpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Dynamic Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & \%fs:0,\%rax		& 0x00 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@dtpoff(\%rax),\%edi	& 0x08 & movl & x@dtpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & \%fs:0,\%rax		& 0x00 & movl & \%fs:0,\%eax \\
+0x09 & leaq & x@tpoff(\%rax),\%rax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & \%fs:0,\%rax		& 0x00 & movl & \%fs:0,\%eax \\
+0x09 & addq & \$x@tpoff,\%rax		& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, II}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movq & \%fs:0,\%rax		& 0x00 & movl & \%fs:0,\%eax \\
+0x09 & movl & x@tpoff(\%rax),\%edi	& 0x08 & movl & x@tpoff(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Exec Model, III}]
+  Load value of \code{x} into \reg{edi}
+
+\begin{table}[H]
+\Hrule
+\caption{Local Exec Model Code Sequence, III}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LP64} & \multicolumn{3}{c}{ILP32} \\
+\hline
+0x00 & movl & \%fs:x@tpoff,\%edi	& 0x00 & movl & \%fs:x@tpoff,\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\subsection{TLS Linker Optimization}
+
+\begin{description}
+\item[\textindex{General Dynamic To Initial Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> IE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{IE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & addq  & x@gottpoff(\%rip),\%rax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\begin{table}[H]
+\Hrule
+\caption{GD -> LE Code Transition}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{GD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsgd(\%rip),\%rdi	& 0x00 & movl  & \%fs:0, \%eax \\
+0x07 & .word & 0x6666			& 0x08 & leal  & x@tpoff(\%rax),\%eax\\
+0x09 & rex64 &				&      &       & \\
+0x0a & call  & \_\_tls\_get\_addr@plt	&      &       & \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & movl & \%fs:0,\%eax		& 0x00 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & leal & x@tpoff(\%rax),\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & movl & \%fs:0,\%eax		& 0x00 & movl & \%fs:0,\%eax \\
+0x08 & addl & x@gottpoff(\%rip),\%eax	& 0x08 & addl & \$x@tpoff,\%eax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Initial Exec To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{IE -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{IE} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & movq & x@gottpoff(\%rip),\%rax	& 0x00 & movq & \$x@tpoff,\%rax \\
+0x07 & movl & \%fs:(\%rax),\%edi	& 0x07 & movl & \%fs:(\%rax),\%edi \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic to Local Exec}]
+  Load address of \code{x} into \reg{rax}
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Lea}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & leal  & x@dtpoff(\%rax),\%eax	& 0x0c & leal & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+or
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition With Add}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & addl  & \$x@dtpoff,\%eax		& 0x0c & addl & \$x@tpoff,\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\item[\textindex{Local Dynamic To Local Exec, II}]
+  Load value of \code{x} into \reg{edi}.
+
+\begin{table}[H]
+\Hrule
+\caption{LD -> LE Code Transition, II}
+\begin{center}
+\small\code{
+\begin{tabular}{lll|lll}
+\multicolumn{3}{c}{LD} & \multicolumn{3}{c}{LE} \\
+\hline
+0x00 & leaq  & x@tlsld(\%rip),\%rdi	& 0x00 & nopl & 0x0(\%rax) \\
+0x07 & call  & \_\_tls\_get\_addr@plt	& 0x04 & movl & \%fs:0,\%eax\\
+0x0c & movl  & x@dtpoff(\%rax),\%eax	& 0x0c & movl & x@tpoff(\%rax),\%eax\\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
+
+\end{description}
+
+\section{Kernel Support}
+The kernel should limit the stack and addresses returned from system calls
+to between $0x00000000$ and $0xffffffff$.
+
+\section{Coding Examples}
+
+Although ILP32 binaries run in the 64-bit mode, not all 64-bit instructions
+are supported. This section discusses example code sequences for
+fundamental operations which are different from the 64-bit mode.
+
+\subsection{Indirect Branch}
+
+Since an indirect branch via memory loads a 64-bit address from the memory
+location, it is not supported in ILP32.  An indirect branch via register
+should be used instead.  The 32-bit address is loaded from memory into
+the lower 32 bits of a register, which automatically zeroes
+the upper 32 bits of the register.  The indirect branch can then be
+performed via the 64-bit register.
+
+\begin{table}[H]
+\Hrule
+\caption{Indirect Branch}
+\begin{center}
+\code{
+\begin{tabular}{ll|ll}
+\multicolumn{2}{c}{LP64} & \multicolumn{2}{c}{ILP32} \\
+\hline
+call & *\%rax          & call & *\%rax \\
+\hline
+call & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & call & *\%rax \\
+\hline
+call & *func\_p        & movl & func\_p, \%eax \\
+     &                 & call & *\%rax \\
+\hline
+jmp  & *\%rax          & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p(\%rip) & movl & func\_p(\%rip), \%eax \\
+     &                 & jmp  & *\%rax \\
+\hline
+jmp  & *func\_p        & movl & func\_p, \%eax \\
+     &                 & jmp  & *\%rax \\
+\end{tabular}
+}
+\end{center}
+\Hrule
+\end{table}
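
The note under "Initial Exec Model, II" in x32.tex is easier to follow with the arithmetic written out.  Below is a minimal C sketch, not part of the patch, that models the two ways of forming the address.  The function names and the sample values are hypothetical, and the model assumes, as the text states, that a 32-bit index register is zero-extended before it is added to the %fs base.

```c
#include <stdint.h>

/* Models %fs:(%eax): the 32-bit register value is zero-extended to
   64 bits before it is added to the segment base. */
static uint64_t ea_with_32bit_index(uint64_t fs_base, int32_t tp_offset)
{
    return fs_base + (uint64_t)(uint32_t)tp_offset;
}

/* Models %fs:(%rax) after a 64-bit load of the offset: the sign of the
   offset is preserved, so a negative offset moves below the base. */
static uint64_t ea_with_64bit_index(uint64_t fs_base, int32_t tp_offset)
{
    return fs_base + (uint64_t)(int64_t)tp_offset;
}
```

With a base of 0x10000 and an offset of -8, the 64-bit form yields 0xfff8, while the 32-bit form yields 0x10000fff8, which is 4 GiB too high.  That is why the ILP32 column keeps movq and %rax for this sequence.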

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
       [not found] ` <ccd4a6ab-f279-477f-b48b-94b8f4afd37d@googlegroups.com>
@ 2012-06-26 19:48   ` H.J. Lu
  2012-06-26 19:53     ` H. Peter Anvin
       [not found]     ` <69b1606d-6150-46eb-a426-93bfad19e7a2@googlegroups.com>
  0 siblings, 2 replies; 13+ messages in thread
From: H.J. Lu @ 2012-06-26 19:48 UTC (permalink / raw)
  To: x32-abi; +Cc: discuss, GCC Development, Binutils, GNU C Library, GDB

On Tue, Jun 26, 2012 at 12:36 PM, Mark Butler <butlerm@middle.net> wrote:
> On Monday, May 14, 2012 11:31:11 AM UTC-6, H.J. wrote:
>>
>> Support for the x32 psABI:
>>
>> http://sites.google.com/site/x32abi/
>>
>> is added in Linux kernel 3.4-rc1.  X32 uses the ILP32 model for x86-64
>> instruction set with size of long and pointers == 4 bytes.  X32 is
>> already supported in GCC 4.7.0 and binutils 2.22...Here is a
>> patch to extend x86-64 psABI for x32.  Any comments?
>>
>
> May I ask why the decision was made to use ILP32 instead of L64P32?   The
> latter would seem to avoid lots of porting problems in particular.  And if
> porting difficulties are the major complaint about x32, is it really too
> late to switch?  Thanks - mdb

x32 is designed to replace ia32 where long is 32-bit, not x86-64.
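
For readers who want to check which model a given compiler targets, here is a minimal C sketch.  It relies only on the predefined symbols listed in the development.tex table of the patch (__x86_64__, _LP64, __LP64__, _ILP32, __ILP32__); the function names are hypothetical and the labels are just illustrative strings.

```c
/* Returns a label for the data model the compiler targets, based on the
   predefined symbols from the patch's pre-processor table. */
static const char *data_model_name(void)
{
#if defined(__x86_64__) && (defined(__ILP32__) || defined(_ILP32))
    return "x32 (x86-64 instruction set, ILP32)";
#elif defined(__LP64__) || defined(_LP64)
    return "LP64";
#elif defined(__ILP32__) || defined(_ILP32)
    return "ILP32";
#else
    return "unknown";
#endif
}

/* Nonzero when long and pointers have the same size.  This holds for
   both ILP32 and LP64; it would not hold for an L64P32 model. */
static int long_matches_pointer(void)
{
    return sizeof(long) == sizeof(void *);
}
```

Built with an x32-capable GCC and -mx32, data_model_name() should take the first branch; a plain x86-64 build should report LP64.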


-- 
H.J.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-06-26 19:48   ` H.J. Lu
@ 2012-06-26 19:53     ` H. Peter Anvin
       [not found]       ` <af4adaed-508a-439f-92db-21d4385d316e@googlegroups.com>
       [not found]     ` <69b1606d-6150-46eb-a426-93bfad19e7a2@googlegroups.com>
  1 sibling, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2012-06-26 19:53 UTC (permalink / raw)
  To: x32-abi; +Cc: H.J. Lu, discuss, GCC Development, Binutils, GNU C Library, GDB

On 06/26/2012 12:47 PM, H.J. Lu wrote:
>>
>> May I ask why the decision was made to use ILP32 instead of L64P32?   The
>> latter would seem to avoid lots of porting problems in particular.  And if
>> porting difficulties are the major complaint about x32, is it really too
>> late to switch?  Thanks - mdb
> 
> x32 is designed to replace ia32 where long is 32-bit, not x86-64.
> 

It's worth noting that there are *no* Linux platforms that are not ILP32
or LP64, so adding a third memory model is likely to cause even more
problems...

	-hpa

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
       [not found]     ` <69b1606d-6150-46eb-a426-93bfad19e7a2@googlegroups.com>
@ 2012-06-26 21:23       ` H.J. Lu
       [not found]         ` <bde2af16-b04e-4e17-a22e-3fe0941e2496@googlegroups.com>
  0 siblings, 1 reply; 13+ messages in thread
From: H.J. Lu @ 2012-06-26 21:23 UTC (permalink / raw)
  To: x32-abi; +Cc: discuss, GCC Development, Binutils, GNU C Library, GDB

On Tue, Jun 26, 2012 at 2:11 PM, Mark Butler <butlerm@middle.net> wrote:
>
>> x32 is designed to replace ia32 where long is 32-bit, not x86-64.
>>
> I understand, but wouldn't L64P32 be much better in the long run? In terms
> of compatibility with LP64, and an LP64 kernel in particular?  The structure
> layouts of any structure that did not contain pointers would be identical,
> for example.  struct timeval, struct timespec, struct stat, and on and on...

Linux/x32 uses the same layout for struct timeval, struct timespec, and
struct stat as Linux/x86-64. That choice is orthogonal to L64 vs L32.
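
As a rough illustration of why layout can be independent of the width of
long, compare a struct declared with long against one declared with
fixed-width fields. Both struct names below are hypothetical and are not
the real glibc or kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct whose fields follow the width of long. */
struct long_based_time {
    long sec;
    long usec;
};

/* Hypothetical struct whose fields are fixed at 64 bits. */
struct fixed_width_time {
    int64_t sec;
    int64_t usec;
};

/* Size that changes with the data model: 8 bytes if long is 32-bit,
   16 bytes if long is 64-bit. */
static size_t long_based_size(void)
{
    return sizeof(struct long_based_time);
}

/* Size that stays at 16 bytes no matter how wide long is. */
static size_t fixed_width_size(void)
{
    size_t size = sizeof(struct fixed_width_time);
    assert(size == 2 * sizeof(int64_t));
    return size;
}
```

Only the first struct changes size between a 32-bit-long and a
64-bit-long build; the second is the same in both.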

-- 
H.J.


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
       [not found]         ` <bde2af16-b04e-4e17-a22e-3fe0941e2496@googlegroups.com>
@ 2012-06-27 12:02           ` H.J. Lu
  2012-06-27 18:24             ` Magnus Fromreide
  0 siblings, 1 reply; 13+ messages in thread
From: H.J. Lu @ 2012-06-27 12:02 UTC (permalink / raw)
  To: x32-abi; +Cc: discuss, GCC Development, Binutils, GNU C Library, GDB

On Tue, Jun 26, 2012 at 10:56 PM, Mark Butler <butlerm@middle.net> wrote:
>
>
> On Tuesday, June 26, 2012 3:22:45 PM UTC-6, H.J. wrote:
>>
>> On Tue, Jun 26, 2012 at 2:11 PM, Mark Butler wrote:
>> >
>> >> x32 is designed to replace ia32 where long is 32-bit, not x86-64.
>> >>
>> > I understand, but wouldn't L64P32 be much better in the long run? In
>> > terms
>> > of compatibility with LP64, and an LP64 kernel in particular?  The
>> > structure
>> > layouts of any structure that did not contain pointers would be
>> > identical,
>> > for example.  struct timeval, struct timespec, struct stat, and on and
>> > on...
>>
>> Linux/x32 uses the same layout for struct timeval, struct timespec, struct
>> stat,
>> as Linux/x86-64. It is orthogonal to L64 vs L32.
>>
> If POSIX requires struct timespec to look like this:
>
> struct timespec {
>   time_t tv_sec;
>   long   tv_nsec;
> }
>
> then how can an ABI with 32 bit longs have the same struct timespec layout
> as an ABI with 64 bit longs?
>

We changed it to

struct timespec
  {
    __time_t tv_sec;		/* Seconds.  */
    __syscall_slong_t tv_nsec;	/* Nanoseconds.  */
  };
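
A minimal sketch of how such a typedef can keep tv_nsec 64 bits wide on
x32 follows. The sketch_ names are hypothetical stand-ins, not the glibc
definitions, and the struct assumes a 64-bit time_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __syscall_slong_t.  The compiler predefines
   __ILP32__ together with __x86_64__ when targeting x32. */
#if defined(__x86_64__) && defined(__ILP32__)
typedef long long sketch_syscall_slong_t;  /* x32: long is only 32-bit */
#else
typedef long sketch_syscall_slong_t;       /* elsewhere: plain long */
#endif

/* Hypothetical stand-in for struct timespec with a 64-bit tv_sec. */
struct sketch_timespec {
    int64_t tv_sec;                  /* Seconds.  */
    sketch_syscall_slong_t tv_nsec;  /* Nanoseconds.  */
};

/* Byte offset of tv_nsec; 8 on every ABI because tv_sec is 64-bit. */
static size_t sketch_nsec_offset(void)
{
    size_t off = offsetof(struct sketch_timespec, tv_nsec);
    assert(off == sizeof(int64_t));
    return off;
}
```

With these definitions an x32 build and an x86-64 build both get a
16-byte struct with tv_nsec at offset 8.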


-- 
H.J.


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-06-27 12:02           ` H.J. Lu
@ 2012-06-27 18:24             ` Magnus Fromreide
  2012-06-27 18:29               ` H.J. Lu
  0 siblings, 1 reply; 13+ messages in thread
From: Magnus Fromreide @ 2012-06-27 18:24 UTC (permalink / raw)
  To: H.J. Lu; +Cc: x32-abi, discuss, GCC Development, Binutils, GNU C Library, GDB

On Wed, 2012-06-27 at 05:01 -0700, H.J. Lu wrote:
> On Tue, Jun 26, 2012 at 10:56 PM, Mark Butler <butlerm@middle.net> wrote:
> >
> >
> > On Tuesday, June 26, 2012 3:22:45 PM UTC-6, H.J. wrote:
> >>
> >> On Tue, Jun 26, 2012 at 2:11 PM, Mark Butler wrote:
> >> >
> >> >> x32 is designed to replace ia32 where long is 32-bit, not x86-64.
> >> >>
> >> > I understand, but wouldn't L64P32 be much better in the long run? In
> >> > terms
> >> > of compatibility with LP64, and an LP64 kernel in particular?  The
> >> > structure
> >> > layouts of any structure that did not contain pointers would be
> >> > identical,
> >> > for example.  struct timeval, struct timespec, struct stat, and on and
> >> > on...
> >>
> >> Linux/x32 uses the same layout for struct timeval, struct timespec, struct
> >> stat,
> >> as Linux/x86-64. It is orthogonal to L64 vs L32.
> >>
> > If POSIX requires struct timespec to look like this:
> >
> > struct timespec {
> >   time_t tv_sec;
> >   long   tv_nsec;
> > }
> >
> > then how can an ABI with 32 bit longs have the same struct timespec layout
> > as an ABI with 64 bit longs?
> >
> 
> We changed it to
> 
> struct timespec
>   {
>     __time_t tv_sec;		/* Seconds.  */
>     __syscall_slong_t tv_nsec;	/* Nanoseconds.  */
>   };
> 

I think that means you fail to conform to POSIX unless
__syscall_slong_t is an alias for long.

If I understand the posix spec correctly then, in a conforming
implementation,

struct timespec ts;
if (sizeof(long) != sizeof(ts.tv_nsec))
  abort();

never calls abort.

For your purpose it would have been much better if tv_nsec had been
specified as a type with a required range of values, similar to how
suseconds_t, which is used for timeval.tv_usec, is specified.

I suppose this is something to bring up for posix-next.

/MF


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
  2012-06-27 18:24             ` Magnus Fromreide
@ 2012-06-27 18:29               ` H.J. Lu
  0 siblings, 0 replies; 13+ messages in thread
From: H.J. Lu @ 2012-06-27 18:29 UTC (permalink / raw)
  To: Magnus Fromreide
  Cc: x32-abi, discuss, GCC Development, Binutils, GNU C Library, GDB

On Wed, Jun 27, 2012 at 11:24 AM, Magnus Fromreide <magfr@lysator.liu.se> wrote:
> On Wed, 2012-06-27 at 05:01 -0700, H.J. Lu wrote:
>> On Tue, Jun 26, 2012 at 10:56 PM, Mark Butler <butlerm@middle.net> wrote:
>> >
>> >
>> > On Tuesday, June 26, 2012 3:22:45 PM UTC-6, H.J. wrote:
>> >>
>> >> On Tue, Jun 26, 2012 at 2:11 PM, Mark Butler wrote:
>> >> >
>> >> >> x32 is designed to replace ia32 where long is 32-bit, not x86-64.
>> >> >>
>> >> > I understand, but wouldn't L64P32 be much better in the long run? In
>> >> > terms
>> >> > of compatibility with LP64, and an LP64 kernel in particular?  The
>> >> > structure
>> >> > layouts of any structure that did not contain pointers would be
>> >> > identical,
>> >> > for example.  struct timeval, struct timespec, struct stat, and on and
>> >> > on...
>> >>
>> >> Linux/x32 uses the same layout for struct timeval, struct timespec, struct
>> >> stat,
>> >> as Linux/x86-64. It is orthogonal to L64 vs L32.
>> >>
>> > If POSIX requires struct timespec to look like this:
>> >
>> > struct timespec {
>> >   time_t tv_sec;
>> >   long   tv_nsec;
>> > }
>> >
>> > then how can an ABI with 32 bit longs have the same struct timespec layout
>> > as an ABI with 64 bit longs?
>> >
>>
>> We changed it to
>>
>> struct timespec
>>   {
>>     __time_t tv_sec;          /* Seconds.  */
>>     __syscall_slong_t tv_nsec;        /* Nanoseconds.  */
>>   };
>>
>
> I think that means you fail to conform to POSIX unless
> __syscall_slong_t is an alias for long.

That is true.

> If I understand the posix spec correctly then, in a conforming
> implementation,
>
> struct timespec ts;
> if (sizeof(long) != sizeof(ts.tv_nsec))
>  abort();
>
> never calls abort.

It will abort on x32.
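
The check can also be written so that it reports the result instead of
aborting. This is a sketch only: the function names are hypothetical, and
it relies on nothing beyond struct timespec from <time.h>:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Returns 1 when tv_nsec has the same size as long, 0 otherwise.
   Per this thread the result is 0 on x32, where long is 32-bit but
   tv_nsec is 64-bit. */
static int tv_nsec_is_long_sized(void)
{
    struct timespec ts;
    return sizeof(ts.tv_nsec) == sizeof(long);
}

/* Returns 1 when tv_nsec can store the largest valid nanosecond count.
   The range [0, 999999999] needs at least 32 bits. */
static int tv_nsec_holds_max_nanoseconds(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 999999999L };
    assert(sizeof(ts.tv_nsec) >= 4);
    return ts.tv_nsec == 999999999L;
}
```

The second function passes on every ABI discussed here, which is why a
value-range requirement would be enough for portable code.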

> For your purpose it would have been much better if tv_nsec had been
> specified as a type with a required range of values, similar to how
> suseconds_t, which is used for timeval.tv_usec, is specified.
>
> I suppose this is something to bring up for posix-next.
>

Yes, that is the intention.


-- 
H.J.


* Re: [x86-64 psABI] RFC: Extend x86-64 psABI to support x32
       [not found]       ` <af4adaed-508a-439f-92db-21d4385d316e@googlegroups.com>
@ 2012-06-28 21:06         ` H. Peter Anvin
  0 siblings, 0 replies; 13+ messages in thread
From: H. Peter Anvin @ 2012-06-28 21:06 UTC (permalink / raw)
  To: x32-abi
  Cc: Mark Butler, H.J. Lu, discuss, GCC Development, Binutils,
	GNU C Library, GDB

On 06/28/2012 02:03 PM, Mark Butler wrote:
> On Tuesday, June 26, 2012 1:53:01 PM UTC-6, H. Peter Anvin wrote:
>
>     It's worth noting that there are *no* Linux platforms that are not
>     ILP32
>     or LP64, so adding a third memory model is likely to cause even more
>     problems...
>
>
> Care to comment on what sort of things would be likely to cause a large
> number of problems porting to an L64P32 model?  I understand that L32P64
> (as in Windows 64 bit) causes lots of problems, because there is a lot
> of code that assumes that a pointer can be converted to a long and back.
>   That would not be a problem with L64P32 however, because there
> pointers would be smaller than longs rather than larger.

Every time you introduce a new model you will have problems, but in 
Linux it is a strong assumption that sizeof(long) == sizeof(void *).
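
A small sketch of what that assumption looks like in code, next to the
portable alternative of going through uintptr_t. The function names are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Reports whether long is wide enough to hold any pointer value.
   True for ILP32 and LP64, false for a model such as LLP64. */
static int long_can_hold_pointer(void)
{
    return sizeof(long) >= sizeof(void *);
}

/* Round-trips a pointer through uintptr_t, which the C standard
   defines as wide enough for this on any data model.  Returns 1 when
   the pointer survives the round trip. */
static int roundtrip_via_uintptr(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    void *back = (void *)bits;
    assert(sizeof(bits) >= sizeof(p));
    return back == p;
}
```

Code that stores pointers in long relies on the first property, while
code that uses uintptr_t or intptr_t does not depend on the width of
long at all.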

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.




end of thread, other threads:[~2012-06-28 21:06 UTC | newest]

Thread overview: 13+ messages
2012-05-14 17:31 [x86-64 psABI] RFC: Extend x86-64 psABI to support x32 H.J. Lu
2012-05-14 17:34 ` H. Peter Anvin
2012-05-14 17:44   ` H.J. Lu
2012-05-15 16:08     ` [discuss] " Michael Matz
2012-05-15 16:18       ` H.J. Lu
2012-05-17 19:50       ` H.J. Lu
     [not found] ` <ccd4a6ab-f279-477f-b48b-94b8f4afd37d@googlegroups.com>
2012-06-26 19:48   ` H.J. Lu
2012-06-26 19:53     ` H. Peter Anvin
     [not found]       ` <af4adaed-508a-439f-92db-21d4385d316e@googlegroups.com>
2012-06-28 21:06         ` H. Peter Anvin
     [not found]     ` <69b1606d-6150-46eb-a426-93bfad19e7a2@googlegroups.com>
2012-06-26 21:23       ` H.J. Lu
     [not found]         ` <bde2af16-b04e-4e17-a22e-3fe0941e2496@googlegroups.com>
2012-06-27 12:02           ` H.J. Lu
2012-06-27 18:24             ` Magnus Fromreide
2012-06-27 18:29               ` H.J. Lu
