From: Siddhesh Poyarekar <siddhesh@gotplt.org>
To: Florian Weimer <fweimer@redhat.com>, libc-alpha@sourceware.org
Cc: Siddhesh Poyarekar <siddhesh@redhat.com>
Subject: Re: [PATCH v4] scripts: Add glibcelf.py module
Date: Fri, 22 Apr 2022 14:15:37 +0530
Message-ID: <b6a9d88d-9950-c640-4553-4097d11044c5@gotplt.org>
In-Reply-To: <87levyt8ny.fsf@oldenburg.str.redhat.com>
On 22/04/2022 03:12, Florian Weimer via Libc-alpha wrote:
> Hopefully, this will lead to tests that are easier to maintain. The
> current approach of parsing readelf -W output using regular expressions
> is not necessarily easier than parsing the ELF data directly.
>
> This module is still somewhat incomplete (e.g., coverage of relocation
> types and versioning information is missing), but it is sufficient to
> perform basic symbol analysis or program header analysis.
>
> The EM_* mapping for architecture-specific constant classes (e.g.,
> SttX86_64) is not yet implemented. The classes are defined for the
> benefit of elf/tst-glibcelf.py.
>
> ---
> v2: Fix STB_WEAK value. Add a consistency check against <elf.h> and
> the required constants to make it pass.
LGTM.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
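
For readers following along, here is a minimal sketch of what "parsing the ELF data directly" can look like. It decodes the 16-byte e_ident block with the same '4s5B7s' struct format that Ident.unpack uses in the patch. The parse_ident helper and the hand-built sample header are assumptions made for illustration and are not part of the patch.

```python
import struct

# Hypothetical helper for illustration; not part of glibcelf.py.
def parse_ident(blob):
    """Decode the 16-byte e_ident block of an ELF header."""
    (ei_mag, ei_class, ei_data, ei_version,
     ei_osabi, ei_abiversion, ei_pad) = struct.unpack('4s5B7s', blob)
    if ei_mag != b'\x7fELF':
        raise ValueError('not an ELF image')
    return {'class': ei_class, 'data': ei_data, 'version': ei_version,
            'osabi': ei_osabi, 'abiversion': ei_abiversion}

# Hand-built sample header: ELFCLASS64 (2), ELFDATA2LSB (1), version 1.
sample = b'\x7fELF' + bytes([2, 1, 1, 0, 0]) + bytes(7)
ident = parse_ident(sample)
print(ident['class'], ident['data'])
```

Compared with matching readelf -W output, the field boundaries here come from the format string rather than from a regular expression over text.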
>
> elf/Makefile | 7 +
> elf/tst-glibcelf.py | 260 ++++++++++++
> scripts/glibcelf.py | 1135 +++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 1402 insertions(+)
>
> diff --git a/elf/Makefile b/elf/Makefile
> index d30d0ee917..3d79f40879 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -1115,6 +1115,13 @@ tests-special += $(objpfx)check-abi-ld.out
> update-abi: update-abi-ld
> update-all-abi: update-all-abi-ld
>
> +tests-special += $(objpfx)tst-glibcelf.out
> +$(objpfx)tst-glibcelf.out: tst-glibcelf.py elf.h $(..)/scripts/glibcelf.py \
> + $(..)/scripts/glibcextract.py
> + PYTHONPATH=$(..)scripts $(PYTHON) tst-glibcelf.py \
> + --cc="$(CC) $(patsubst -DMODULE_NAME=%,-DMODULE_NAME=testsuite,$(CPPFLAGS))" \
> + < /dev/null > $@ 2>&1; $(evaluate-test)
> +
OK.
> # The test requires shared _and_ PIE because the executable
> # unit test driver must be able to link with the shared object
> # that is going to eventually go into an installed DSO.
> diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py
> new file mode 100644
> index 0000000000..bf15a3bad4
> --- /dev/null
> +++ b/elf/tst-glibcelf.py
> @@ -0,0 +1,260 @@
> +#!/usr/bin/python3
> +# Verify scripts/glibcelf.py contents against elf/elf.h.
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +import argparse
> +import enum
> +import sys
> +
> +import glibcelf
> +import glibcextract
> +
> +errors_encountered = 0
> +
> +def error(message):
> + global errors_encountered
> + sys.stdout.write('error: {}\n'.format(message))
> + errors_encountered += 1
> +
> +# The enum constants in glibcelf are expected to have exactly these
> +# prefixes.
> +expected_constant_prefixes = tuple(
> + 'ELFCLASS ELFDATA EM_ ET_ DT_ PF_ PT_ SHF_ SHN_ SHT_ STB_ STT_'.split())
> +
> +def find_constant_prefix(name):
> + """Returns a matching prefix from expected_constant_prefixes or None."""
> + for prefix in expected_constant_prefixes:
> + if name.startswith(prefix):
> + return prefix
> + return None
> +
> +def find_enum_types():
> + """A generator for OpenIntEnum and IntFlag classes in glibcelf."""
> + for obj in vars(glibcelf).values():
> + if isinstance(obj, type) and obj.__bases__[0] in (
> + glibcelf._OpenIntEnum, enum.Enum, enum.IntFlag):
> + yield obj
> +
> +def check_duplicates():
> + """Verifies that enum types do not have duplicate values.
> +
> + Different types must have different member names, too.
> +
> + """
> + global_seen = {}
> + for typ in find_enum_types():
> + seen = {}
> + last = None
> + for (name, e) in typ.__members__.items():
> + if e.value in seen:
> + error('{} has {}={} and {}={}'.format(
> + typ, seen[e.value], e.value, name, e.value))
> + last = e
> + else:
> + seen[e.value] = name
> + if last is not None and last.value > e.value:
> + error('{} has {}={} after {}={}'.format(
> + typ, name, e.value, last.name, last.value))
> + if name in global_seen:
> + error('{} used in {} and {}'.format(
> + name, global_seen[name], typ))
> + else:
> + global_seen[name] = typ
> +
> +def check_constant_prefixes():
> + """Check that the constant prefixes match expected_constant_prefixes."""
> + seen = set()
> + for typ in find_enum_types():
> + typ_prefix = None
> + for val in typ:
> + prefix = find_constant_prefix(val.name)
> + if prefix is None:
> + error('constant {!r} for {} has unknown prefix'.format(
> + val, typ))
> + break
> + elif typ_prefix is None:
> + typ_prefix = prefix
> + seen.add(typ_prefix)
> + elif prefix != typ_prefix:
> + error('prefix {!r} for constant {!r}, expected {!r}'.format(
> + prefix, val, typ_prefix))
> + if typ_prefix is None:
> + error('empty enum type {}'.format(typ))
> +
> + for prefix in sorted(set(expected_constant_prefixes) - seen):
> + error('missing constant prefix {!r}'.format(prefix))
> + # Reverse difference is already covered inside the loop.
> +
> +def find_elf_h_constants(cc):
> + """Returns a dictionary of relevant constants from <elf.h>."""
> + return glibcextract.compute_macro_consts(
> + source_text='#include <elf.h>',
> + cc=cc,
> + macro_re='|'.join(
> + prefix + '.*' for prefix in expected_constant_prefixes))
> +
> +# The first part of the pair is a name of an <elf.h> constant that is
> +# dropped from glibcelf. The second part is the constant as it is
> +# used in <elf.h>.
> +glibcelf_skipped_aliases = (
> + ('EM_ARC_A5', 'EM_ARC_COMPACT'),
> + ('PF_PARISC_SBP', 'PF_HP_SBP')
> +)
> +
> +# Constants that provide little value and are not included in
> +# glibcelf: *LO*/*HI* range constants, *NUM constants counting the
> +# number of constants. Also includes the alias names from
> +# glibcelf_skipped_aliases.
> +glibcelf_skipped_constants = frozenset(
> + [e[0] for e in glibcelf_skipped_aliases]) | frozenset("""
> +DT_AARCH64_NUM
> +DT_ADDRNUM
> +DT_ADDRRNGHI
> +DT_ADDRRNGLO
> +DT_ALPHA_NUM
> +DT_ENCODING
> +DT_EXTRANUM
> +DT_HIOS
> +DT_HIPROC
> +DT_IA_64_NUM
> +DT_LOOS
> +DT_LOPROC
> +DT_MIPS_NUM
> +DT_NUM
> +DT_PPC64_NUM
> +DT_PPC_NUM
> +DT_PROCNUM
> +DT_SPARC_NUM
> +DT_VALNUM
> +DT_VALRNGHI
> +DT_VALRNGLO
> +DT_VERSIONTAGNUM
> +ELFCLASSNUM
> +ELFDATANUM
> +ET_HIOS
> +ET_HIPROC
> +ET_LOOS
> +ET_LOPROC
> +ET_NUM
> +PF_MASKOS
> +PF_MASKPROC
> +PT_HIOS
> +PT_HIPROC
> +PT_HISUNW
> +PT_LOOS
> +PT_LOPROC
> +PT_LOSUNW
> +SHF_MASKOS
> +SHF_MASKPROC
> +SHN_HIOS
> +SHN_HIPROC
> +SHN_HIRESERVE
> +SHN_LOOS
> +SHN_LOPROC
> +SHN_LORESERVE
> +SHT_HIOS
> +SHT_HIPROC
> +SHT_HIPROC
> +SHT_HISUNW
> +SHT_HIUSER
> +SHT_LOOS
> +SHT_LOPROC
> +SHT_LOSUNW
> +SHT_LOUSER
> +SHT_NUM
> +STB_HIOS
> +STB_HIPROC
> +STB_LOOS
> +STB_LOPROC
> +STB_NUM
> +STT_HIOS
> +STT_HIPROC
> +STT_LOOS
> +STT_LOPROC
> +STT_NUM
> +""".strip().split())
> +
> +def check_constant_values(cc):
> + """Checks the values of <elf.h> constants against glibcelf."""
> +
> + glibcelf_constants = {
> + e.name: e for typ in find_enum_types() for e in typ}
> + elf_h_constants = find_elf_h_constants(cc=cc)
> +
> + missing_in_glibcelf = (set(elf_h_constants) - set(glibcelf_constants)
> + - glibcelf_skipped_constants)
> + for name in sorted(missing_in_glibcelf):
> + error('constant {} is missing from glibcelf'.format(name))
> +
> + unexpected_in_glibcelf = \
> + set(glibcelf_constants) & glibcelf_skipped_constants
> + for name in sorted(unexpected_in_glibcelf):
> + error('constant {} is supposed to be filtered from glibcelf'.format(
> + name))
> +
> + missing_in_elf_h = set(glibcelf_constants) - set(elf_h_constants)
> + for name in sorted(missing_in_elf_h):
> + error('constant {} is missing from <elf.h>'.format(name))
> +
> + expected_in_elf_h = glibcelf_skipped_constants - set(elf_h_constants)
> + for name in expected_in_elf_h:
> + error('filtered constant {} is missing from <elf.h>'.format(name))
> +
> + for alias_name, name_in_glibcelf in glibcelf_skipped_aliases:
> + if name_in_glibcelf not in glibcelf_constants:
> + error('alias value {} for {} not in glibcelf'.format(
> + name_in_glibcelf, alias_name))
> + elif (int(elf_h_constants[alias_name])
> + != glibcelf_constants[name_in_glibcelf].value):
> + error('<elf.h> has {}={}, glibcelf has {}={}'.format(
> + alias_name, elf_h_constants[alias_name],
> + name_in_glibcelf, glibcelf_constants[name_in_glibcelf]))
> +
> + # Check for value mismatches:
> + for name in sorted(set(glibcelf_constants) & set(elf_h_constants)):
> + glibcelf_value = glibcelf_constants[name].value
> + elf_h_value = int(elf_h_constants[name])
> + # On 32-bit architectures <elf.h> has some constants that are
> + # parsed as signed, while they are unsigned in glibcelf. So
> + # far, this only affects some flag constants, so special-case
> + # them here.
> + if (glibcelf_value != elf_h_value
> + and not (isinstance(glibcelf_constants[name], enum.IntFlag)
> + and glibcelf_value == 1 << 31
> + and elf_h_value == -(1 << 31))):
> + error('{}: glibcelf has {!r}, <elf.h> has {!r}'.format(
> + name, glibcelf_value, elf_h_value))
> +
> +def main():
> + """The main entry point."""
> + parser = argparse.ArgumentParser(
> + description="Check glibcelf.py and elf.h against each other.")
> + parser.add_argument('--cc', metavar='CC',
> + help='C compiler (including options) to use')
> + args = parser.parse_args()
> +
> + check_duplicates()
> + check_constant_prefixes()
> + check_constant_values(cc=args.cc)
> +
> + if errors_encountered > 0:
> + print("note: errors encountered:", errors_encountered)
> + sys.exit(1)
> +
> +if __name__ == '__main__':
> + main()
OK.
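
As an aside, the 1 << 31 special case in check_constant_values could also be written as a general 32-bit normalization. The sketch below is an alternative shown only for illustration, not what the patch does, and same_u32 is a hypothetical name.

```python
# Hypothetical alternative to the special case in the patch.
def same_u32(a, b):
    """Compare two integers after reducing both to unsigned 32 bits."""
    mask = (1 << 32) - 1
    return (a & mask) == (b & mask)

# A flag such as 1 << 31 that a 32-bit compiler may report as negative.
high_bit_matches = same_u32(1 << 31, -(1 << 31))
# Values that genuinely differ still compare unequal.
other_matches = same_u32(1 << 30, -(1 << 30))
print(high_bit_matches, other_matches)
```

This is looser than the check in the patch, since any two values that agree modulo 2**32 would be accepted, so the narrower special case may well be the safer choice for a consistency test.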
> diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py
> new file mode 100644
> index 0000000000..8f7d0ca184
> --- /dev/null
> +++ b/scripts/glibcelf.py
> @@ -0,0 +1,1135 @@
> +#!/usr/bin/python3
> +# ELF support functionality for Python.
> +# Copyright (C) 2022 Free Software Foundation, Inc.
> +# This file is part of the GNU C Library.
> +#
> +# The GNU C Library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2.1 of the License, or (at your option) any later version.
> +#
> +# The GNU C Library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with the GNU C Library; if not, see
> +# <https://www.gnu.org/licenses/>.
> +
> +"""Basic ELF parser.
> +
> +Use Image.readfile(path) to read an ELF file into memory and begin
> +parsing it.
> +
> +"""
> +
> +import collections
> +import enum
> +import struct
> +
> +class _OpenIntEnum(enum.IntEnum):
> + """Integer enumeration that supports arbitrary int values."""
> + @classmethod
> + def _missing_(cls, value):
> + # See enum.IntFlag._create_pseudo_member_. This allows
> + # creating of enum constants with arbitrary integer values.
> + pseudo_member = int.__new__(cls, value)
> + pseudo_member._name_ = None
> + pseudo_member._value_ = value
> + return pseudo_member
> +
> + def __repr__(self):
> + name = self._name_
> + if name is not None:
> + # The names have prefixes like SHT_, implying their type.
> + return name
> + return '{}({})'.format(self.__class__.__name__, self._value_)
> +
> + def __str__(self):
> + name = self._name_
> + if name is not None:
> + return name
> + return str(self._value_)
> +
> +class ElfClass(_OpenIntEnum):
> + """ELF word size. Type of EI_CLASS values."""
> + ELFCLASSNONE = 0
> + ELFCLASS32 = 1
> + ELFCLASS64 = 2
> +
> +class ElfData(_OpenIntEnum):
> + """ELF endianness. Type of EI_DATA values."""
> + ELFDATANONE = 0
> + ELFDATA2LSB = 1
> + ELFDATA2MSB = 2
> +
> +class Machine(_OpenIntEnum):
> + """ELF machine type. Type of values in Ehdr.e_machine field."""
> + EM_NONE = 0
> + EM_M32 = 1
> + EM_SPARC = 2
> + EM_386 = 3
> + EM_68K = 4
> + EM_88K = 5
> + EM_IAMCU = 6
> + EM_860 = 7
> + EM_MIPS = 8
> + EM_S370 = 9
> + EM_MIPS_RS3_LE = 10
> + EM_PARISC = 15
> + EM_VPP500 = 17
> + EM_SPARC32PLUS = 18
> + EM_960 = 19
> + EM_PPC = 20
> + EM_PPC64 = 21
> + EM_S390 = 22
> + EM_SPU = 23
> + EM_V800 = 36
> + EM_FR20 = 37
> + EM_RH32 = 38
> + EM_RCE = 39
> + EM_ARM = 40
> + EM_FAKE_ALPHA = 41
> + EM_SH = 42
> + EM_SPARCV9 = 43
> + EM_TRICORE = 44
> + EM_ARC = 45
> + EM_H8_300 = 46
> + EM_H8_300H = 47
> + EM_H8S = 48
> + EM_H8_500 = 49
> + EM_IA_64 = 50
> + EM_MIPS_X = 51
> + EM_COLDFIRE = 52
> + EM_68HC12 = 53
> + EM_MMA = 54
> + EM_PCP = 55
> + EM_NCPU = 56
> + EM_NDR1 = 57
> + EM_STARCORE = 58
> + EM_ME16 = 59
> + EM_ST100 = 60
> + EM_TINYJ = 61
> + EM_X86_64 = 62
> + EM_PDSP = 63
> + EM_PDP10 = 64
> + EM_PDP11 = 65
> + EM_FX66 = 66
> + EM_ST9PLUS = 67
> + EM_ST7 = 68
> + EM_68HC16 = 69
> + EM_68HC11 = 70
> + EM_68HC08 = 71
> + EM_68HC05 = 72
> + EM_SVX = 73
> + EM_ST19 = 74
> + EM_VAX = 75
> + EM_CRIS = 76
> + EM_JAVELIN = 77
> + EM_FIREPATH = 78
> + EM_ZSP = 79
> + EM_MMIX = 80
> + EM_HUANY = 81
> + EM_PRISM = 82
> + EM_AVR = 83
> + EM_FR30 = 84
> + EM_D10V = 85
> + EM_D30V = 86
> + EM_V850 = 87
> + EM_M32R = 88
> + EM_MN10300 = 89
> + EM_MN10200 = 90
> + EM_PJ = 91
> + EM_OPENRISC = 92
> + EM_ARC_COMPACT = 93
> + EM_XTENSA = 94
> + EM_VIDEOCORE = 95
> + EM_TMM_GPP = 96
> + EM_NS32K = 97
> + EM_TPC = 98
> + EM_SNP1K = 99
> + EM_ST200 = 100
> + EM_IP2K = 101
> + EM_MAX = 102
> + EM_CR = 103
> + EM_F2MC16 = 104
> + EM_MSP430 = 105
> + EM_BLACKFIN = 106
> + EM_SE_C33 = 107
> + EM_SEP = 108
> + EM_ARCA = 109
> + EM_UNICORE = 110
> + EM_EXCESS = 111
> + EM_DXP = 112
> + EM_ALTERA_NIOS2 = 113
> + EM_CRX = 114
> + EM_XGATE = 115
> + EM_C166 = 116
> + EM_M16C = 117
> + EM_DSPIC30F = 118
> + EM_CE = 119
> + EM_M32C = 120
> + EM_TSK3000 = 131
> + EM_RS08 = 132
> + EM_SHARC = 133
> + EM_ECOG2 = 134
> + EM_SCORE7 = 135
> + EM_DSP24 = 136
> + EM_VIDEOCORE3 = 137
> + EM_LATTICEMICO32 = 138
> + EM_SE_C17 = 139
> + EM_TI_C6000 = 140
> + EM_TI_C2000 = 141
> + EM_TI_C5500 = 142
> + EM_TI_ARP32 = 143
> + EM_TI_PRU = 144
> + EM_MMDSP_PLUS = 160
> + EM_CYPRESS_M8C = 161
> + EM_R32C = 162
> + EM_TRIMEDIA = 163
> + EM_QDSP6 = 164
> + EM_8051 = 165
> + EM_STXP7X = 166
> + EM_NDS32 = 167
> + EM_ECOG1X = 168
> + EM_MAXQ30 = 169
> + EM_XIMO16 = 170
> + EM_MANIK = 171
> + EM_CRAYNV2 = 172
> + EM_RX = 173
> + EM_METAG = 174
> + EM_MCST_ELBRUS = 175
> + EM_ECOG16 = 176
> + EM_CR16 = 177
> + EM_ETPU = 178
> + EM_SLE9X = 179
> + EM_L10M = 180
> + EM_K10M = 181
> + EM_AARCH64 = 183
> + EM_AVR32 = 185
> + EM_STM8 = 186
> + EM_TILE64 = 187
> + EM_TILEPRO = 188
> + EM_MICROBLAZE = 189
> + EM_CUDA = 190
> + EM_TILEGX = 191
> + EM_CLOUDSHIELD = 192
> + EM_COREA_1ST = 193
> + EM_COREA_2ND = 194
> + EM_ARCV2 = 195
> + EM_OPEN8 = 196
> + EM_RL78 = 197
> + EM_VIDEOCORE5 = 198
> + EM_78KOR = 199
> + EM_56800EX = 200
> + EM_BA1 = 201
> + EM_BA2 = 202
> + EM_XCORE = 203
> + EM_MCHP_PIC = 204
> + EM_INTELGT = 205
> + EM_KM32 = 210
> + EM_KMX32 = 211
> + EM_EMX16 = 212
> + EM_EMX8 = 213
> + EM_KVARC = 214
> + EM_CDP = 215
> + EM_COGE = 216
> + EM_COOL = 217
> + EM_NORC = 218
> + EM_CSR_KALIMBA = 219
> + EM_Z80 = 220
> + EM_VISIUM = 221
> + EM_FT32 = 222
> + EM_MOXIE = 223
> + EM_AMDGPU = 224
> + EM_RISCV = 243
> + EM_BPF = 247
> + EM_CSKY = 252
> + EM_NUM = 253
> + EM_ALPHA = 0x9026
> +
> +class Et(_OpenIntEnum):
> + """ELF file type. Type of ET_* values and the Ehdr.e_type field."""
> + ET_NONE = 0
> + ET_REL = 1
> + ET_EXEC = 2
> + ET_DYN = 3
> + ET_CORE = 4
> +
> +class Shn(_OpenIntEnum):
> + """ELF reserved section indices."""
> + SHN_UNDEF = 0
> + SHN_BEFORE = 0xff00
> + SHN_AFTER = 0xff01
> + SHN_ABS = 0xfff1
> + SHN_COMMON = 0xfff2
> + SHN_XINDEX = 0xffff
> +
> +class ShnMIPS(enum.Enum):
> + """Supplemental SHN_* constants for EM_MIPS."""
> + SHN_MIPS_ACOMMON = 0xff00
> + SHN_MIPS_TEXT = 0xff01
> + SHN_MIPS_DATA = 0xff02
> + SHN_MIPS_SCOMMON = 0xff03
> + SHN_MIPS_SUNDEFINED = 0xff04
> +
> +class ShnPARISC(enum.Enum):
> + """Supplemental SHN_* constants for EM_PARISC."""
> + SHN_PARISC_ANSI_COMMON = 0xff00
> + SHN_PARISC_HUGE_COMMON = 0xff01
> +
> +class Sht(_OpenIntEnum):
> + """ELF section types. Type of SHT_* values."""
> + SHT_NULL = 0
> + SHT_PROGBITS = 1
> + SHT_SYMTAB = 2
> + SHT_STRTAB = 3
> + SHT_RELA = 4
> + SHT_HASH = 5
> + SHT_DYNAMIC = 6
> + SHT_NOTE = 7
> + SHT_NOBITS = 8
> + SHT_REL = 9
> + SHT_SHLIB = 10
> + SHT_DYNSYM = 11
> + SHT_INIT_ARRAY = 14
> + SHT_FINI_ARRAY = 15
> + SHT_PREINIT_ARRAY = 16
> + SHT_GROUP = 17
> + SHT_SYMTAB_SHNDX = 18
> + SHT_GNU_ATTRIBUTES = 0x6ffffff5
> + SHT_GNU_HASH = 0x6ffffff6
> + SHT_GNU_LIBLIST = 0x6ffffff7
> + SHT_CHECKSUM = 0x6ffffff8
> + SHT_SUNW_move = 0x6ffffffa
> + SHT_SUNW_COMDAT = 0x6ffffffb
> + SHT_SUNW_syminfo = 0x6ffffffc
> + SHT_GNU_verdef = 0x6ffffffd
> + SHT_GNU_verneed = 0x6ffffffe
> + SHT_GNU_versym = 0x6fffffff
> +
> +class ShtALPHA(enum.Enum):
> + """Supplemental SHT_* constants for EM_ALPHA."""
> + SHT_ALPHA_DEBUG = 0x70000001
> + SHT_ALPHA_REGINFO = 0x70000002
> +
> +class ShtARM(enum.Enum):
> + """Supplemental SHT_* constants for EM_ARM."""
> + SHT_ARM_EXIDX = 0x70000001
> + SHT_ARM_PREEMPTMAP = 0x70000002
> + SHT_ARM_ATTRIBUTES = 0x70000003
> +
> +class ShtCSKY(enum.Enum):
> + """Supplemental SHT_* constants for EM_CSKY."""
> + SHT_CSKY_ATTRIBUTES = 0x70000001
> +
> +class ShtIA_64(enum.Enum):
> + """Supplemental SHT_* constants for EM_IA_64."""
> + SHT_IA_64_EXT = 0x70000000
> + SHT_IA_64_UNWIND = 0x70000001
> +
> +class ShtMIPS(enum.Enum):
> + """Supplemental SHT_* constants for EM_MIPS."""
> + SHT_MIPS_LIBLIST = 0x70000000
> + SHT_MIPS_MSYM = 0x70000001
> + SHT_MIPS_CONFLICT = 0x70000002
> + SHT_MIPS_GPTAB = 0x70000003
> + SHT_MIPS_UCODE = 0x70000004
> + SHT_MIPS_DEBUG = 0x70000005
> + SHT_MIPS_REGINFO = 0x70000006
> + SHT_MIPS_PACKAGE = 0x70000007
> + SHT_MIPS_PACKSYM = 0x70000008
> + SHT_MIPS_RELD = 0x70000009
> + SHT_MIPS_IFACE = 0x7000000b
> + SHT_MIPS_CONTENT = 0x7000000c
> + SHT_MIPS_OPTIONS = 0x7000000d
> + SHT_MIPS_SHDR = 0x70000010
> + SHT_MIPS_FDESC = 0x70000011
> + SHT_MIPS_EXTSYM = 0x70000012
> + SHT_MIPS_DENSE = 0x70000013
> + SHT_MIPS_PDESC = 0x70000014
> + SHT_MIPS_LOCSYM = 0x70000015
> + SHT_MIPS_AUXSYM = 0x70000016
> + SHT_MIPS_OPTSYM = 0x70000017
> + SHT_MIPS_LOCSTR = 0x70000018
> + SHT_MIPS_LINE = 0x70000019
> + SHT_MIPS_RFDESC = 0x7000001a
> + SHT_MIPS_DELTASYM = 0x7000001b
> + SHT_MIPS_DELTAINST = 0x7000001c
> + SHT_MIPS_DELTACLASS = 0x7000001d
> + SHT_MIPS_DWARF = 0x7000001e
> + SHT_MIPS_DELTADECL = 0x7000001f
> + SHT_MIPS_SYMBOL_LIB = 0x70000020
> + SHT_MIPS_EVENTS = 0x70000021
> + SHT_MIPS_TRANSLATE = 0x70000022
> + SHT_MIPS_PIXIE = 0x70000023
> + SHT_MIPS_XLATE = 0x70000024
> + SHT_MIPS_XLATE_DEBUG = 0x70000025
> + SHT_MIPS_WHIRL = 0x70000026
> + SHT_MIPS_EH_REGION = 0x70000027
> + SHT_MIPS_XLATE_OLD = 0x70000028
> + SHT_MIPS_PDR_EXCEPTION = 0x70000029
> + SHT_MIPS_XHASH = 0x7000002b
> +
> +class ShtPARISC(enum.Enum):
> + """Supplemental SHT_* constants for EM_PARISC."""
> + SHT_PARISC_EXT = 0x70000000
> + SHT_PARISC_UNWIND = 0x70000001
> + SHT_PARISC_DOC = 0x70000002
> +
> +class Pf(enum.IntFlag):
> + """Program header flags. Type of Phdr.p_flags values."""
> + PF_X = 1
> + PF_W = 2
> + PF_R = 4
> +
> +class PfARM(enum.IntFlag):
> + """Supplemental PF_* flags for EM_ARM."""
> + PF_ARM_SB = 0x10000000
> + PF_ARM_PI = 0x20000000
> + PF_ARM_ABS = 0x40000000
> +
> +class PfPARISC(enum.IntFlag):
> + """Supplemental PF_* flags for EM_PARISC."""
> + PF_HP_PAGE_SIZE = 0x00100000
> + PF_HP_FAR_SHARED = 0x00200000
> + PF_HP_NEAR_SHARED = 0x00400000
> + PF_HP_CODE = 0x01000000
> + PF_HP_MODIFY = 0x02000000
> + PF_HP_LAZYSWAP = 0x04000000
> + PF_HP_SBP = 0x08000000
> +
> +class PfIA_64(enum.IntFlag):
> + """Supplemental PF_* flags for EM_IA_64."""
> + PF_IA_64_NORECOV = 0x80000000
> +
> +class PfMIPS(enum.IntFlag):
> + """Supplemental PF_* flags for EM_MIPS."""
> + PF_MIPS_LOCAL = 0x10000000
> +
> +class Shf(enum.IntFlag):
> + """Section flags. Type of Shdr.sh_flags values."""
> + SHF_WRITE = 1 << 0
> + SHF_ALLOC = 1 << 1
> + SHF_EXECINSTR = 1 << 2
> + SHF_MERGE = 1 << 4
> + SHF_STRINGS = 1 << 5
> + SHF_INFO_LINK = 1 << 6
> + SHF_LINK_ORDER = 1 << 7
> + SHF_OS_NONCONFORMING = 256
> + SHF_GROUP = 1 << 9
> + SHF_TLS = 1 << 10
> + SHF_COMPRESSED = 1 << 11
> + SHF_GNU_RETAIN = 1 << 21
> + SHF_ORDERED = 1 << 30
> + SHF_EXCLUDE = 1 << 31
> +
> +class ShfALPHA(enum.IntFlag):
> + """Supplemental SHF_* constants for EM_ALPHA."""
> + SHF_ALPHA_GPREL = 0x10000000
> +
> +class ShfARM(enum.IntFlag):
> + """Supplemental SHF_* constants for EM_ARM."""
> + SHF_ARM_ENTRYSECT = 0x10000000
> + SHF_ARM_COMDEF = 0x80000000
> +
> +class ShfIA_64(enum.IntFlag):
> + """Supplemental SHF_* constants for EM_IA_64."""
> + SHF_IA_64_SHORT = 0x10000000
> + SHF_IA_64_NORECOV = 0x20000000
> +
> +class ShfMIPS(enum.IntFlag):
> + """Supplemental SHF_* constants for EM_MIPS."""
> + SHF_MIPS_GPREL = 0x10000000
> + SHF_MIPS_MERGE = 0x20000000
> + SHF_MIPS_ADDR = 0x40000000
> + SHF_MIPS_STRINGS = 0x80000000
> + SHF_MIPS_NOSTRIP = 0x08000000
> + SHF_MIPS_LOCAL = 0x04000000
> + SHF_MIPS_NAMES = 0x02000000
> + SHF_MIPS_NODUPE = 0x01000000
> +
> +class ShfPARISC(enum.IntFlag):
> + """Supplemental SHF_* constants for EM_PARISC."""
> + SHF_PARISC_SHORT = 0x20000000
> + SHF_PARISC_HUGE = 0x40000000
> + SHF_PARISC_SBP = 0x80000000
> +
> +class Stb(_OpenIntEnum):
> + """ELF symbol binding type."""
> + STB_LOCAL = 0
> + STB_GLOBAL = 1
> + STB_WEAK = 2
> + STB_GNU_UNIQUE = 10
> + STB_MIPS_SPLIT_COMMON = 13
> +
> +class Stt(_OpenIntEnum):
> + """ELF symbol type."""
> + STT_NOTYPE = 0
> + STT_OBJECT = 1
> + STT_FUNC = 2
> + STT_SECTION = 3
> + STT_FILE = 4
> + STT_COMMON = 5
> + STT_TLS = 6
> + STT_GNU_IFUNC = 10
> +
> +class SttARM(enum.Enum):
> + """Supplemental STT_* constants for EM_ARM."""
> + STT_ARM_TFUNC = 13
> + STT_ARM_16BIT = 15
> +
> +class SttPARISC(enum.Enum):
> + """Supplemental STT_* constants for EM_PARISC."""
> + STT_HP_OPAQUE = 11
> + STT_HP_STUB = 12
> + STT_PARISC_MILLICODE = 13
> +
> +class SttSPARC(enum.Enum):
> + """Supplemental STT_* constants for EM_SPARC."""
> + STT_SPARC_REGISTER = 13
> +
> +class SttX86_64(enum.Enum):
> + """Supplemental STT_* constants for EM_X86_64."""
> + SHT_X86_64_UNWIND = 0x70000001
> +
> +class Pt(_OpenIntEnum):
> + """ELF program header types. Type of Phdr.p_type."""
> + PT_NULL = 0
> + PT_LOAD = 1
> + PT_DYNAMIC = 2
> + PT_INTERP = 3
> + PT_NOTE = 4
> + PT_SHLIB = 5
> + PT_PHDR = 6
> + PT_TLS = 7
> + PT_NUM = 8
> + PT_GNU_EH_FRAME = 0x6474e550
> + PT_GNU_STACK = 0x6474e551
> + PT_GNU_RELRO = 0x6474e552
> + PT_GNU_PROPERTY = 0x6474e553
> + PT_SUNWBSS = 0x6ffffffa
> + PT_SUNWSTACK = 0x6ffffffb
> +
> +class PtARM(enum.Enum):
> + """Supplemental PT_* constants for EM_ARM."""
> + PT_ARM_EXIDX = 0x70000001
> +
> +class PtIA_64(enum.Enum):
> + """Supplemental PT_* constants for EM_IA_64."""
> + PT_IA_64_HP_OPT_ANOT = 0x60000012
> + PT_IA_64_HP_HSL_ANOT = 0x60000013
> + PT_IA_64_HP_STACK = 0x60000014
> + PT_IA_64_ARCHEXT = 0x70000000
> + PT_IA_64_UNWIND = 0x70000001
> +
> +class PtMIPS(enum.Enum):
> + """Supplemental PT_* constants for EM_MIPS."""
> + PT_MIPS_REGINFO = 0x70000000
> + PT_MIPS_RTPROC = 0x70000001
> + PT_MIPS_OPTIONS = 0x70000002
> + PT_MIPS_ABIFLAGS = 0x70000003
> +
> +class PtPARISC(enum.Enum):
> + """Supplemental PT_* constants for EM_PARISC."""
> + PT_HP_TLS = 0x60000000
> + PT_HP_CORE_NONE = 0x60000001
> + PT_HP_CORE_VERSION = 0x60000002
> + PT_HP_CORE_KERNEL = 0x60000003
> + PT_HP_CORE_COMM = 0x60000004
> + PT_HP_CORE_PROC = 0x60000005
> + PT_HP_CORE_LOADABLE = 0x60000006
> + PT_HP_CORE_STACK = 0x60000007
> + PT_HP_CORE_SHM = 0x60000008
> + PT_HP_CORE_MMF = 0x60000009
> + PT_HP_PARALLEL = 0x60000010
> + PT_HP_FASTBIND = 0x60000011
> + PT_HP_OPT_ANNOT = 0x60000012
> + PT_HP_HSL_ANNOT = 0x60000013
> + PT_HP_STACK = 0x60000014
> + PT_PARISC_ARCHEXT = 0x70000000
> + PT_PARISC_UNWIND = 0x70000001
> +
> +class Dt(_OpenIntEnum):
> + """ELF dynamic segment tags. Type of Dyn.d_tag."""
> + DT_NULL = 0
> + DT_NEEDED = 1
> + DT_PLTRELSZ = 2
> + DT_PLTGOT = 3
> + DT_HASH = 4
> + DT_STRTAB = 5
> + DT_SYMTAB = 6
> + DT_RELA = 7
> + DT_RELASZ = 8
> + DT_RELAENT = 9
> + DT_STRSZ = 10
> + DT_SYMENT = 11
> + DT_INIT = 12
> + DT_FINI = 13
> + DT_SONAME = 14
> + DT_RPATH = 15
> + DT_SYMBOLIC = 16
> + DT_REL = 17
> + DT_RELSZ = 18
> + DT_RELENT = 19
> + DT_PLTREL = 20
> + DT_DEBUG = 21
> + DT_TEXTREL = 22
> + DT_JMPREL = 23
> + DT_BIND_NOW = 24
> + DT_INIT_ARRAY = 25
> + DT_FINI_ARRAY = 26
> + DT_INIT_ARRAYSZ = 27
> + DT_FINI_ARRAYSZ = 28
> + DT_RUNPATH = 29
> + DT_FLAGS = 30
> + DT_PREINIT_ARRAY = 32
> + DT_PREINIT_ARRAYSZ = 33
> + DT_SYMTAB_SHNDX = 34
> + DT_GNU_PRELINKED = 0x6ffffdf5
> + DT_GNU_CONFLICTSZ = 0x6ffffdf6
> + DT_GNU_LIBLISTSZ = 0x6ffffdf7
> + DT_CHECKSUM = 0x6ffffdf8
> + DT_PLTPADSZ = 0x6ffffdf9
> + DT_MOVEENT = 0x6ffffdfa
> + DT_MOVESZ = 0x6ffffdfb
> + DT_FEATURE_1 = 0x6ffffdfc
> + DT_POSFLAG_1 = 0x6ffffdfd
> + DT_SYMINSZ = 0x6ffffdfe
> + DT_SYMINENT = 0x6ffffdff
> + DT_GNU_HASH = 0x6ffffef5
> + DT_TLSDESC_PLT = 0x6ffffef6
> + DT_TLSDESC_GOT = 0x6ffffef7
> + DT_GNU_CONFLICT = 0x6ffffef8
> + DT_GNU_LIBLIST = 0x6ffffef9
> + DT_CONFIG = 0x6ffffefa
> + DT_DEPAUDIT = 0x6ffffefb
> + DT_AUDIT = 0x6ffffefc
> + DT_PLTPAD = 0x6ffffefd
> + DT_MOVETAB = 0x6ffffefe
> + DT_SYMINFO = 0x6ffffeff
> + DT_VERSYM = 0x6ffffff0
> + DT_RELACOUNT = 0x6ffffff9
> + DT_RELCOUNT = 0x6ffffffa
> + DT_FLAGS_1 = 0x6ffffffb
> + DT_VERDEF = 0x6ffffffc
> + DT_VERDEFNUM = 0x6ffffffd
> + DT_VERNEED = 0x6ffffffe
> + DT_VERNEEDNUM = 0x6fffffff
> + DT_AUXILIARY = 0x7ffffffd
> + DT_FILTER = 0x7fffffff
> +
> +class DtAARCH64(enum.Enum):
> + """Supplemental DT_* constants for EM_AARCH64."""
> + DT_AARCH64_BTI_PLT = 0x70000001
> + DT_AARCH64_PAC_PLT = 0x70000003
> + DT_AARCH64_VARIANT_PCS = 0x70000005
> +
> +class DtALPHA(enum.Enum):
> + """Supplemental DT_* constants for EM_ALPHA."""
> + DT_ALPHA_PLTRO = 0x70000000
> +
> +class DtALTERA_NIOS2(enum.Enum):
> + """Supplemental DT_* constants for EM_ALTERA_NIOS2."""
> + DT_NIOS2_GP = 0x70000002
> +
> +class DtIA_64(enum.Enum):
> + """Supplemental DT_* constants for EM_IA_64."""
> + DT_IA_64_PLT_RESERVE = 0x70000000
> +
> +class DtMIPS(enum.Enum):
> + """Supplemental DT_* constants for EM_MIPS."""
> + DT_MIPS_RLD_VERSION = 0x70000001
> + DT_MIPS_TIME_STAMP = 0x70000002
> + DT_MIPS_ICHECKSUM = 0x70000003
> + DT_MIPS_IVERSION = 0x70000004
> + DT_MIPS_FLAGS = 0x70000005
> + DT_MIPS_BASE_ADDRESS = 0x70000006
> + DT_MIPS_MSYM = 0x70000007
> + DT_MIPS_CONFLICT = 0x70000008
> + DT_MIPS_LIBLIST = 0x70000009
> + DT_MIPS_LOCAL_GOTNO = 0x7000000a
> + DT_MIPS_CONFLICTNO = 0x7000000b
> + DT_MIPS_LIBLISTNO = 0x70000010
> + DT_MIPS_SYMTABNO = 0x70000011
> + DT_MIPS_UNREFEXTNO = 0x70000012
> + DT_MIPS_GOTSYM = 0x70000013
> + DT_MIPS_HIPAGENO = 0x70000014
> + DT_MIPS_RLD_MAP = 0x70000016
> + DT_MIPS_DELTA_CLASS = 0x70000017
> + DT_MIPS_DELTA_CLASS_NO = 0x70000018
> + DT_MIPS_DELTA_INSTANCE = 0x70000019
> + DT_MIPS_DELTA_INSTANCE_NO = 0x7000001a
> + DT_MIPS_DELTA_RELOC = 0x7000001b
> + DT_MIPS_DELTA_RELOC_NO = 0x7000001c
> + DT_MIPS_DELTA_SYM = 0x7000001d
> + DT_MIPS_DELTA_SYM_NO = 0x7000001e
> + DT_MIPS_DELTA_CLASSSYM = 0x70000020
> + DT_MIPS_DELTA_CLASSSYM_NO = 0x70000021
> + DT_MIPS_CXX_FLAGS = 0x70000022
> + DT_MIPS_PIXIE_INIT = 0x70000023
> + DT_MIPS_SYMBOL_LIB = 0x70000024
> + DT_MIPS_LOCALPAGE_GOTIDX = 0x70000025
> + DT_MIPS_LOCAL_GOTIDX = 0x70000026
> + DT_MIPS_HIDDEN_GOTIDX = 0x70000027
> + DT_MIPS_PROTECTED_GOTIDX = 0x70000028
> + DT_MIPS_OPTIONS = 0x70000029
> + DT_MIPS_INTERFACE = 0x7000002a
> + DT_MIPS_DYNSTR_ALIGN = 0x7000002b
> + DT_MIPS_INTERFACE_SIZE = 0x7000002c
> + DT_MIPS_RLD_TEXT_RESOLVE_ADDR = 0x7000002d
> + DT_MIPS_PERF_SUFFIX = 0x7000002e
> + DT_MIPS_COMPACT_SIZE = 0x7000002f
> + DT_MIPS_GP_VALUE = 0x70000030
> + DT_MIPS_AUX_DYNAMIC = 0x70000031
> + DT_MIPS_PLTGOT = 0x70000032
> + DT_MIPS_RWPLT = 0x70000034
> + DT_MIPS_RLD_MAP_REL = 0x70000035
> + DT_MIPS_XHASH = 0x70000036
> +
> +class DtPPC(enum.Enum):
> + """Supplemental DT_* constants for EM_PPC."""
> + DT_PPC_GOT = 0x70000000
> + DT_PPC_OPT = 0x70000001
> +
> +class DtPPC64(enum.Enum):
> + """Supplemental DT_* constants for EM_PPC64."""
> + DT_PPC64_GLINK = 0x70000000
> + DT_PPC64_OPD = 0x70000001
> + DT_PPC64_OPDSZ = 0x70000002
> + DT_PPC64_OPT = 0x70000003
> +
> +class DtSPARC(enum.Enum):
> + """Supplemental DT_* constants for EM_SPARC."""
> + DT_SPARC_REGISTER = 0x70000001
> +
> +class StInfo:
> + """ELF symbol binding and type. Type of the Sym.st_info field."""
> + def __init__(self, arg0, arg1=None):
> + if isinstance(arg0, int) and arg1 is None:
> + self.bind = Stb(arg0 >> 4)
> + self.type = Stt(arg0 & 15)
> + else:
> + self.bind = Stb(arg0)
> + self.type = Stt(arg1)
> +
> + def value(self):
> + """Returns the raw value for the bind/type combination."""
> + return (self.bind.value << 4) | (self.type.value)
> +
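
For reference, st_info keeps the binding in the upper four bits and the symbol type in the lower four, which is what StInfo.__init__ decodes with arg0 >> 4 and arg0 & 15. A minimal round-trip sketch using plain integers follows. The pack_st_info and unpack_st_info names are hypothetical and not part of the patch.

```python
# Hypothetical helpers for illustration; not part of glibcelf.py.
def pack_st_info(bind, typ):
    """Combine a binding and a symbol type into one st_info byte."""
    return (bind << 4) | (typ & 0xf)

def unpack_st_info(info):
    """Split an st_info byte into (binding, type)."""
    return info >> 4, info & 0xf

STB_GLOBAL = 1  # values as listed in the Stb and Stt classes above
STT_FUNC = 2

info = pack_st_info(STB_GLOBAL, STT_FUNC)
bind, typ = unpack_st_info(info)
print(hex(info), bind, typ)
```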
> +# Type in an ELF file. Used for deserialization.
> +_Layout = collections.namedtuple('_Layout', 'unpack size')
> +
> +def _define_layouts(baseclass: type, layout32: str, layout64: str,
> + types=None, fields32=None):
> + """Assign variants dict to baseclass.
> +
> + The variants dict is indexed by (ElfClass, ElfData) pairs, and its
> + values are _Layout instances.
> +
> + """
> + struct32 = struct.Struct(layout32)
> + struct64 = struct.Struct(layout64)
> +
> + # Check that the struct formats yield the right number of components.
> + for s in (struct32, struct64):
> + example = s.unpack(b' ' * s.size)
> + if len(example) != len(baseclass._fields):
> + raise ValueError('{!r} yields wrong field count: {} != {}'.format(
> + s.format, len(example), len(baseclass._fields)))
> +
> + # Check that field names in types are correct.
> + if types is None:
> + types = ()
> + for n in types:
> + if n not in baseclass._fields:
> + raise ValueError('{} does not have field {!r}'.format(
> + baseclass.__name__, n))
> +
> + if fields32 is not None \
> + and set(fields32) != set(baseclass._fields):
> + raise ValueError('{!r} is not a permutation of the fields {!r}'.format(
> + fields32, baseclass._fields))
> +
> + def unique_name(name, used_names = (set((baseclass.__name__,))
> + | set(baseclass._fields)
> + | {n.__name__
> + for n in (types or {}).values()})):
> + """Find a name that is not used for a class or field name."""
> + candidate = name
> + n = 0
> + while candidate in used_names:
> + n += 1
> + candidate = '{}{}'.format(name, n)
> + used_names.add(candidate)
> + return candidate
> +
> + blob_name = unique_name('blob')
> + struct_unpack_name = unique_name('struct_unpack')
> + comps_name = unique_name('comps')
> +
> + layouts = {}
> + for (bits, elfclass, layout, fields) in (
> + (32, ElfClass.ELFCLASS32, layout32, fields32),
> + (64, ElfClass.ELFCLASS64, layout64, None),
> + ):
> + for (elfdata, structprefix, funcsuffix) in (
> + (ElfData.ELFDATA2LSB, '<', 'LE'),
> + (ElfData.ELFDATA2MSB, '>', 'BE'),
> + ):
> + env = {
> + baseclass.__name__: baseclass,
> + struct_unpack_name: struct.unpack,
> + }
> +
> + # Add the type converters.
> + if types:
> + for cls in types.values():
> + env[cls.__name__] = cls
> +
> + funcname = ''.join(
> + ('unpack_', baseclass.__name__, str(bits), funcsuffix))
> +
> + code = '''
> +def {funcname}({blob_name}):
> +'''.format(funcname=funcname, blob_name=blob_name)
> +
> + indent = ' ' * 4
> + unpack_call = '{}({!r}, {})'.format(
> + struct_unpack_name, structprefix + layout, blob_name)
> + field_names = ', '.join(baseclass._fields)
> + if types is None and fields is None:
> + code += '{}return {}({})\n'.format(
> + indent, baseclass.__name__, unpack_call)
> + else:
> + # Destructuring tuple assignment.
> + if fields is None:
> + code += '{}{} = {}\n'.format(
> + indent, field_names, unpack_call)
> + else:
> + # Use custom field order.
> + code += '{}{} = {}\n'.format(
> + indent, ', '.join(fields), unpack_call)
> +
> + # Perform the type conversions.
> + for n in baseclass._fields:
> + if n in types:
> + code += '{}{} = {}({})\n'.format(
> + indent, n, types[n].__name__, n)
> + # Create the named tuple.
> + code += '{}return {}({})\n'.format(
> + indent, baseclass.__name__, field_names)
> +
> + exec(code, env)
> + layouts[(elfclass, elfdata)] = _Layout(
> + env[funcname], struct.calcsize(layout))
> + baseclass.layouts = layouts
> +
> +
> +# Corresponds to EI_* indices into Elf*_Ehdr.e_ident.
> +class Ident(collections.namedtuple('Ident',
> + 'ei_mag ei_class ei_data ei_version ei_osabi ei_abiversion ei_pad')):
> +
> + def __new__(cls, *args):
> + """Construct an object from a blob or its constituent fields."""
> + if len(args) == 1:
> + return cls.unpack(args[0])
> + return cls.__base__.__new__(cls, *args)
> +
> + @staticmethod
> + def unpack(blob: memoryview) -> 'Ident':
> + """Parse raw data into a tuple."""
> + ei_mag, ei_class, ei_data, ei_version, ei_osabi, ei_abiversion, \
> + ei_pad = struct.unpack('4s5B7s', blob)
> + return Ident(ei_mag, ElfClass(ei_class), ElfData(ei_data),
> + ei_version, ei_osabi, ei_abiversion, ei_pad)
> + size = 16
> +
> +# Corresponds to Elf32_Ehdr and Elf64_Ehdr.
> +Ehdr = collections.namedtuple('Ehdr',
> + 'e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags'
> + + ' e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx')
> +_define_layouts(Ehdr,
> + layout32='16s2H5I6H',
> + layout64='16s2HI3QI6H',
> + types=dict(e_ident=Ident,
> + e_machine=Machine,
> + e_type=Et,
> + e_shstrndx=Shn))
> +
> +# Corresponds to Elf32_Phdr and Elf64_Phdr. Order follows the latter.
> +Phdr = collections.namedtuple('Phdr',
> + 'p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align')
> +_define_layouts(Phdr,
> + layout32='8I',
> + fields32=('p_type', 'p_offset', 'p_vaddr', 'p_paddr',
> + 'p_filesz', 'p_memsz', 'p_flags', 'p_align'),
> + layout64='2I6Q',
> + types=dict(p_type=Pt, p_flags=Pf))
> +
> +
> +# Corresponds to Elf32_Shdr and Elf64_Shdr.
> +class Shdr(collections.namedtuple('Shdr',
> + 'sh_name sh_type sh_flags sh_addr sh_offset sh_size sh_link sh_info'
> + + ' sh_addralign sh_entsize')):
> + def resolve(self, strtab: 'StringTable') -> 'Shdr':
> + """Resolve sh_name using a string table."""
> + return self.__class__(strtab.get(self[0]), *self[1:])
> +_define_layouts(Shdr,
> + layout32='10I',
> + layout64='2I4Q2I2Q',
> + types=dict(sh_type=Sht,
> + sh_flags=Shf,
> + sh_link=Shn))
> +
> +# Corresponds to Elf32_Dyn and Elf64_Dyn. The nesting through the
> +# d_un union is skipped, and d_ptr is missing (its representation in
> +# Python would be identical to d_val).
> +Dyn = collections.namedtuple('Dyn', 'd_tag d_val')
> +_define_layouts(Dyn,
> + layout32='2i',
> + layout64='2q',
> + types=dict(d_tag=Dt))
> +
> +# Corresponds to Elf32_Sym and Elf64_Sym.
> +class Sym(collections.namedtuple('Sym',
> + 'st_name st_info st_other st_shndx st_value st_size')):
> + def resolve(self, strtab: 'StringTable') -> 'Sym':
> + """Resolve st_name using a string table."""
> + return self.__class__(strtab.get(self[0]), *self[1:])
> +_define_layouts(Sym,
> + layout32='3I2BH',
> + layout64='I2BH2Q',
> + fields32=('st_name', 'st_value', 'st_size', 'st_info',
> + 'st_other', 'st_shndx'),
> + types=dict(st_shndx=Shn,
> + st_info=StInfo))
> +
> +# Corresponds to Elf32_Rel and Elf64_Rel.
> +Rel = collections.namedtuple('Rel', 'r_offset r_info')
> +_define_layouts(Rel,
> + layout32='2I',
> + layout64='2Q')
> +
> +# Corresponds to Elf32_Rela and Elf64_Rela.
> +Rela = collections.namedtuple('Rela', 'r_offset r_info r_addend')
> +_define_layouts(Rela,
> + layout32='2Ii',
> + layout64='2Qq')
> +
> +class StringTable:
> + """ELF string table."""
> + def __init__(self, blob):
> + """Create a new string table backed by the data in the blob.
> +
> + blob: a memoryview-like object
> +
> + """
> + self.blob = blob
> +
> + def get(self, index) -> bytes:
> + """Returns the null-terminated byte string at the index."""
> + blob = self.blob
> + endindex = index
> + while True:
> + if blob[endindex] == 0:
> + return bytes(blob[index:endindex])
> + endindex += 1
> +
> +class Image:
> + """ELF image parser."""
> + def __init__(self, image):
> + """Create an ELF image from binary image data.
> +
> + image: a memoryview-like object that supports efficient range
> + subscripting.
> +
> + """
> + self.image = image
> + ident = self.read(Ident, 0)
> + classdata = (ident.ei_class, ident.ei_data)
> + # Set self.Ehdr etc. to the subtypes with the right parsers.
> + for typ in (Ehdr, Phdr, Shdr, Dyn, Sym, Rel, Rela):
> + setattr(self, typ.__name__, typ.layouts.get(classdata, None))
> +
> + if self.Ehdr is not None:
> + self.ehdr = self.read(self.Ehdr, 0)
> + self._shdr_num = self._compute_shdr_num()
> + else:
> + self.ehdr = None
> + self._shdr_num = 0
> +
> + self._section = {}
> + self._stringtab = {}
> +
> + if self._shdr_num > 0:
> + self._shdr_strtab = self._find_shdr_strtab()
> + else:
> + self._shdr_strtab = None
> +
> + @staticmethod
> + def readfile(path: str) -> 'Image':
> + """Reads the ELF file at the specified path."""
> + with open(path, 'rb') as inp:
> + return Image(memoryview(inp.read()))
> +
> + def _compute_shdr_num(self) -> int:
> + """Computes the actual number of section headers."""
> + shnum = self.ehdr.e_shnum
> + if shnum == 0:
> + if self.ehdr.e_shoff == 0 or self.ehdr.e_shentsize == 0:
> + # No section headers.
> + return 0
> + # Otherwise the extension mechanism is used (which may be
> + # needed because e_shnum is just 16 bits).
> + return self.read(self.Shdr, self.ehdr.e_shoff).sh_size
> + return shnum
> +
> + def _find_shdr_strtab(self) -> StringTable:
> + """Finds the section header string table (maybe via extensions)."""
> + shstrndx = self.ehdr.e_shstrndx
> + if shstrndx == Shn.SHN_XINDEX:
> + shstrndx = self.read(self.Shdr, self.ehdr.e_shoff).sh_link
> + return self._find_stringtab(shstrndx)
> +
> + def read(self, typ: type, offset: int):
> + """Reads an object at a specific offset.
> +
> + The type must have been enhanced using _define_layouts.
> +
> + """
> + return typ.unpack(self.image[offset: offset + typ.size])
> +
> + def phdrs(self) -> Phdr:
> + """Generator iterating over the program headers."""
> + if self.ehdr is None:
> + return
> + size = self.ehdr.e_phentsize
> + if size != self.Phdr.size:
> + raise ValueError('Unexpected Phdr size in ELF header: {} != {}'
> + .format(size, self.Phdr.size))
> +
> + offset = self.ehdr.e_phoff
> + for _ in range(self.ehdr.e_phnum):
> + yield self.read(self.Phdr, offset)
> + offset += size
> +
> + def shdrs(self, resolve: bool=True) -> Shdr:
> + """Generator iterating over the section headers.
> +
> + If resolve, section names are automatically translated
> + using the section header string table.
> +
> + """
> + if self._shdr_num == 0:
> + return
> +
> + size = self.ehdr.e_shentsize
> + if size != self.Shdr.size:
> + raise ValueError('Unexpected Shdr size in ELF header: {} != {}'
> + .format(size, self.Shdr.size))
> +
> + offset = self.ehdr.e_shoff
> + for _ in range(self._shdr_num):
> + shdr = self.read(self.Shdr, offset)
> + if resolve:
> + shdr = shdr.resolve(self._shdr_strtab)
> + yield shdr
> + offset += size
> +
> + def dynamic(self) -> Dyn:
> + """Generator iterating over the dynamic segment."""
> + for phdr in self.phdrs():
> + if phdr.p_type == Pt.PT_DYNAMIC:
> + # Pick the first dynamic segment, like the loader.
> + if phdr.p_filesz == 0:
> + # Probably separated debuginfo.
> + return
> + offset = phdr.p_offset
> + end = offset + phdr.p_memsz
> + size = self.Dyn.size
> + while True:
> + next_offset = offset + size
> + if next_offset > end:
> + raise ValueError(
> + 'Dynamic segment size {} is not a multiple of Dyn size {}'.format(
> + phdr.p_memsz, size))
> + yield self.read(self.Dyn, offset)
> + if next_offset == end:
> + return
> + offset = next_offset
> +
> + def syms(self, shdr: Shdr, resolve: bool=True) -> Sym:
> + """A generator iterating over a symbol table.
> +
> + If resolve, symbol names are automatically translated using
> + the string table for the symbol table.
> +
> + """
> + assert shdr.sh_type == Sht.SHT_SYMTAB
> + size = shdr.sh_entsize
> + if size != self.Sym.size:
> + raise ValueError('Invalid symbol table entry size {}'.format(size))
> + offset = shdr.sh_offset
> + end = shdr.sh_offset + shdr.sh_size
> + if resolve:
> + strtab = self._find_stringtab(shdr.sh_link)
> + while offset < end:
> + sym = self.read(self.Sym, offset)
> + if resolve:
> + sym = sym.resolve(strtab)
> + yield sym
> + offset += size
> + if offset != end:
> + raise ValueError('Symbol table is not a multiple of entry size')
> +
> + def lookup_string(self, strtab_index: int, strtab_offset: int) -> bytes:
> + """Looks up a string in a string table identified by its link index."""
> + try:
> + strtab = self._stringtab[strtab_index]
> + except KeyError:
> + strtab = self._find_stringtab(strtab_index)
> + return strtab.get(strtab_offset)
> +
> + def find_section(self, shndx: Shn) -> Shdr:
> + """Returns the section header for the indexed section.
> +
> + The section name is not resolved.
> + """
> + try:
> + return self._section[shndx]
> + except KeyError:
> + pass
> + if shndx in Shn:
> + raise ValueError('Reserved section index {}'.format(shndx))
> + idx = shndx.value
> + if idx < 0 or idx >= self._shdr_num:
> + raise ValueError('Section index {} out of range [0, {})'.format(
> + idx, self._shdr_num))
> + shdr = self.read(
> + self.Shdr, self.ehdr.e_shoff + idx * self.Shdr.size)
> + self._section[shndx] = shdr
> + return shdr
> +
> + def _find_stringtab(self, sh_link: int) -> StringTable:
> + if sh_link in self._stringtab:
> + return self._stringtab[sh_link]
> + if sh_link < 0 or sh_link >= self._shdr_num:
> + raise ValueError('Section index {} out of range [0, {})'.format(
> + sh_link, self._shdr_num))
> + shdr = self.read(
> + self.Shdr, self.ehdr.e_shoff + sh_link * self.Shdr.size)
> + if shdr.sh_type != Sht.SHT_STRTAB:
> + raise ValueError(
> + 'Section {} is not a string table: {}'.format(
> + sh_link, shdr.sh_type))
> + strtab = StringTable(
> + self.image[shdr.sh_offset:shdr.sh_offset + shdr.sh_size])
> + # This could retain essentially arbitrary amounts of data,
> + # but caching string tables seems important for performance.
> + self._stringtab[sh_link] = strtab
> + return strtab
> +
> +
> +__all__ = [name for name in dir() if name[0].isupper()]
>
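
For readers following the fields32 handling in _define_layouts: the
32-bit and 64-bit program headers store the same fields in different
orders, so the 32-bit variant is unpacked in file order and then mapped
onto the Elf64-ordered named tuple.  Below is a minimal sketch of that
idea without the exec-based code generation the patch uses.  The helper
name unpack_phdr and the constant FIELDS32 are hypothetical and not part
of glibcelf.py; the struct formats and field orders are the ones from
the patch.

```python
import collections
import struct

# Field order follows Elf64_Phdr, as in the patch.
Phdr = collections.namedtuple(
    'Phdr',
    'p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align')

# On-disk field order of Elf32_Phdr (p_flags comes after p_memsz).
FIELDS32 = ('p_type', 'p_offset', 'p_vaddr', 'p_paddr',
            'p_filesz', 'p_memsz', 'p_flags', 'p_align')


def unpack_phdr(blob, is_64bit, little_endian=True):
    """Hypothetical helper: parse one program header into a Phdr."""
    prefix = '<' if little_endian else '>'
    if is_64bit:
        # Elf64_Phdr already matches the named tuple field order.
        return Phdr(*struct.unpack(prefix + '2I6Q', blob))
    # Elf32_Phdr: unpack in file order, then assign by field name so
    # the values land in the Elf64-ordered named tuple.
    values = struct.unpack(prefix + '8I', blob)
    return Phdr(**dict(zip(FIELDS32, values)))


# The same made-up PT_LOAD (1) segment with flags PF_R|PF_X (5),
# encoded once per ELF class.
raw32 = struct.pack('<8I', 1, 0x1000, 0x8048000, 0x8048000,
                    0x200, 0x200, 5, 0x1000)
raw64 = struct.pack('<2I6Q', 1, 5, 0x1000, 0x8048000, 0x8048000,
                    0x200, 0x200, 0x1000)

phdr32 = unpack_phdr(raw32, is_64bit=False)
phdr64 = unpack_phdr(raw64, is_64bit=True)

# Both encodings decode to the same named tuple.
assert phdr32 == phdr64
```

The patch generates a specialized function per (ElfClass, ElfData) pair
instead, which avoids building a dict for every header that is read.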
Thread overview: 5+ messages
2022-04-21 21:42 Florian Weimer
2022-04-22 8:45 ` Siddhesh Poyarekar [this message]
2022-04-22 16:26 ` Joseph Myers
2022-04-22 16:37 ` Florian Weimer
2022-04-22 17:25 ` Florian Weimer