烦恼一般都是想太多了。

0%

ELF文件格式

ELF 是一种在计算机中用来描述 可执行文件,编译对象代码,共享库,core dumps 的一种标准格式。其在很多系统上都有进行使用。

每个 ELF 文件都由一个 ELF 头 加上文件数据组成。文件数据可能包含以下三部分:

  • Program header table(程序头表),描述了 0 个或多个内存段。其显示了在运行时会使用到的段内容。

  • Section header table( 节头别),描述了 0 个或多个节。其列出了二进制文件的节信息,在进行链接和重定位的时候会有用。

  • 被 Programe header table 或者 section header table 中的项所引用的数据。

    img

文件中的每个字节只会被一个节所拥有,而也有可能不被任何节拥有。

有一张比较完整的大图,来自于维基百科:

img

ELF Header

https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.eheader.html#elfid

ELF 头定义了是使用 32 还是 64 位的地址。就长度是 52(32bit)或者 64(64bit)。其在 C 内的结构表示如下:

#define EI_NIDENT 16

typedef struct {
unsigned char e_ident[EI_NIDENT];
Elf32_Half e_type;
Elf32_Half e_machine;
Elf32_Word e_version;
Elf32_Addr e_entry;
Elf32_Off e_phoff;
Elf32_Off e_shoff;
Elf32_Word e_flags;
Elf32_Half e_ehsize;
Elf32_Half e_phentsize;
Elf32_Half e_phnum;
Elf32_Half e_shentsize;
Elf32_Half e_shnum;
Elf32_Half e_shstrndx;
} Elf32_Ehdr;

typedef struct {
unsigned char e_ident[EI_NIDENT];
Elf64_Half e_type;
Elf64_Half e_machine;
Elf64_Word e_version;
Elf64_Addr e_entry;
Elf64_Off e_phoff;
Elf64_Off e_shoff;
Elf64_Word e_flags;
Elf64_Half e_ehsize;
Elf64_Half e_phentsize;
Elf64_Half e_phnum;
Elf64_Half e_shentsize;
Elf64_Half e_shnum;
Elf64_Half e_shstrndx;
} Elf64_Ehdr;

e_indent

ELF Identification 的意思。其是一个 16字节的数组。描述了16个属性。这个字段标记了文件本身,同时提供了与机器无关的数据以用来解析文件内容。

Name Value Purpose
EI_MAG0 0 0x7f
EI_MAG1 1 ‘E’
EI_MAG2 2 ‘L’
EI_MAG3 3 ‘F’
EI_CLASS 4 32/64bit ELFCLASSNONE 无效 ELFCLASS32/ELFCLASS3264
EI_DATA 5 Data encoding
EI_VERSION 6 ELF 头版本号
EI_OSABI 7 Operating system/ABI identification如 AIX/Linux/Solaris等
EI_ABIVERSION 8 ABI version
EI_PAD 9 Start of padding bytes。暂时没用
EI_NIDENT 16 Size of e_ident[]

e_type

文件类型。

Name Value Meaning
ET_NONE 0 No file type
ET_REL 1 Relocatable file
ET_EXEC 2 Executable file
ET_DYN 3 Shared object file
ET_CORE 4 Core file
ET_LOOS 0xfe00 Operating system-specific
ET_HIOS 0xfeff Operating system-specific
ET_LOPROC 0xff00 Processor-specific
ET_HIPROC 0xffff Processor-specific

e_machine

CPU 架构

Name Value Meaning
EM_NONE 0 No machine
EM_M32 1 AT&T WE 32100
EM_SPARC 2 SPARC
EM_386 3 Intel 80386
EM_68K 4 Motorola 68000
EM_88K 5 Motorola 88000
reserved 6 Reserved for future use (was EM_486)
EM_860 7 Intel 80860
EM_MIPS 8 MIPS I Architecture
EM_S370 9 IBM System/370 Processor
EM_MIPS_RS3_LE 10 MIPS RS3000 Little-endian
reserved 11-14 Reserved for future use
EM_PARISC 15 Hewlett-Packard PA-RISC
reserved 16 Reserved for future use
EM_VPP500 17 Fujitsu VPP500
EM_SPARC32PLUS 18 Enhanced instruction set SPARC
EM_960 19 Intel 80960
EM_PPC 20 PowerPC
EM_PPC64 21 64-bit PowerPC
EM_S390 22 IBM System/390 Processor
reserved 23-35 Reserved for future use
EM_V800 36 NEC V800
EM_FR20 37 Fujitsu FR20
EM_RH32 38 TRW RH-32
EM_RCE 39 Motorola RCE
EM_ARM 40 Advanced RISC Machines ARM
EM_ALPHA 41 Digital Alpha
EM_SH 42 Hitachi SH
EM_SPARCV9 43 SPARC Version 9
EM_TRICORE 44 Siemens TriCore embedded processor
EM_ARC 45 Argonaut RISC Core, Argonaut Technologies Inc.
EM_H8_300 46 Hitachi H8/300
EM_H8_300H 47 Hitachi H8/300H
EM_H8S 48 Hitachi H8S
EM_H8_500 49 Hitachi H8/500
EM_IA_64 50 Intel IA-64 processor architecture
EM_MIPS_X 51 Stanford MIPS-X
EM_COLDFIRE 52 Motorola ColdFire
EM_68HC12 53 Motorola M68HC12
EM_MMA 54 Fujitsu MMA Multimedia Accelerator
EM_PCP 55 Siemens PCP
EM_NCPU 56 Sony nCPU embedded RISC processor
EM_NDR1 57 Denso NDR1 microprocessor
EM_STARCORE 58 Motorola Star*Core processor
EM_ME16 59 Toyota ME16 processor
EM_ST100 60 STMicroelectronics ST100 processor
EM_TINYJ 61 Advanced Logic Corp. TinyJ embedded processor family
EM_X86_64 62 AMD x86-64 architecture
EM_PDSP 63 Sony DSP Processor
EM_PDP10 64 Digital Equipment Corp. PDP-10
EM_PDP11 65 Digital Equipment Corp. PDP-11
EM_FX66 66 Siemens FX66 microcontroller
EM_ST9PLUS 67 STMicroelectronics ST9+ 8/16 bit microcontroller
EM_ST7 68 STMicroelectronics ST7 8-bit microcontroller
EM_68HC16 69 Motorola MC68HC16 Microcontroller
EM_68HC11 70 Motorola MC68HC11 Microcontroller
EM_68HC08 71 Motorola MC68HC08 Microcontroller
EM_68HC05 72 Motorola MC68HC05 Microcontroller
EM_SVX 73 Silicon Graphics SVx
EM_ST19 74 STMicroelectronics ST19 8-bit microcontroller
EM_VAX 75 Digital VAX
EM_CRIS 76 Axis Communications 32-bit embedded processor
EM_JAVELIN 77 Infineon Technologies 32-bit embedded processor
EM_FIREPATH 78 Element 14 64-bit DSP Processor
EM_ZSP 79 LSI Logic 16-bit DSP Processor
EM_MMIX 80 Donald Knuth’s educational 64-bit processor
EM_HUANY 81 Harvard University machine-independent object files
EM_PRISM 82 SiTera Prism
EM_AVR 83 Atmel AVR 8-bit microcontroller
EM_FR30 84 Fujitsu FR30
EM_D10V 85 Mitsubishi D10V
EM_D30V 86 Mitsubishi D30V
EM_V850 87 NEC v850
EM_M32R 88 Mitsubishi M32R
EM_MN10300 89 Matsushita MN10300
EM_MN10200 90 Matsushita MN10200
EM_PJ 91 picoJava
EM_OPENRISC 92 OpenRISC 32-bit embedded processor
EM_ARC_A5 93 ARC Cores Tangent-A5
EM_XTENSA 94 Tensilica Xtensa Architecture
EM_VIDEOCORE 95 Alphamosaic VideoCore processor
EM_TMM_GPP 96 Thompson Multimedia General Purpose Processor
EM_NS32K 97 National Semiconductor 32000 series
EM_TPC 98 Tenor Network TPC processor
EM_SNP1K 99 Trebia SNP 1000 processor
EM_ST200 100 STMicroelectronics (www.st.com) ST200 microcontroller

e_version

文件版本

Name Value Meaning
EV_NONE 0 Invalid version
EV_CURRENT 1 Current version

e_entry

入口地址。如果没有,那么就是0。

e_phoff

Program header table 偏移值。

e_shoff

Section header table 偏移值。

e_flags

这个需要根据 e_machine 来解析。

e_ehsize

ELF header 大小。

e_phentsize

Program header table 中每项的大小,全部项大小都是一致的。

e_phnum

Program header table 内的项目数量。

e_shentsize

每个 Section header table 的大小

e_shnum

Section header table 中每项的大小。 e_shnum e_shentsize 就给出了Section header table 的大小

If the number of sections is greater than or equal to SHN_LORESERVE (0xff00), this member has the value zero and the actual number of section header table entries is contained in the sh_size field of the section header at index 0. (Otherwise, the sh_size member of the initial entry contains 0.)

e_shstrndx

每个 section 内,包含 section 名称的section header table 项目索引。

例子

readelf  -h a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400440
Start of program headers: 64 (bytes into file)
Start of section headers: 4504 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 27

Program header

具体参考一下 https://refspecs.linuxfoundation.org/elf/gabi4+/ch5.pheader.html

Objdump

objdump  --help
Usage: objdump <option(s)> <file(s)>
Display information from object <file(s)>.
At least one of the following switches must be given:
-a, --archive-headers Display archive header information
-f, --file-headers Display the contents of the overall file header
-p, --private-headers Display object format specific file header contents
-P, --private=OPT,OPT... Display object format specific contents
-h, --[section-]headers Display the contents of the section headers
-x, --all-headers Display the contents of all headers
-d, --disassemble Display assembler contents of executable sections
-D, --disassemble-all Display assembler contents of all sections
-S, --source Intermix source code with disassembly
-s, --full-contents Display the full contents of all sections requested
-g, --debugging Display debug information in object file
-e, --debugging-tags Display debug information using ctags style
-G, --stabs Display (in raw form) any STABS info in the file
-W[lLiaprmfFsoRt] or
--dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
=frames-interp,=str,=loc,=Ranges,=pubtypes,
=gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
=addr,=cu_index]
Display DWARF info in the file
-t, --syms Display the contents of the symbol table(s)
-T, --dynamic-syms Display the contents of the dynamic symbol table
-r, --reloc Display the relocation entries in the file
-R, --dynamic-reloc Display the dynamic relocation entries in the file
@<file> Read options from <file>
-v, --version Display this program's version number
-i, --info List object formats and architectures supported
-H, --help Display this information

The following switches are optional:
-b, --target=BFDNAME Specify the target object format as BFDNAME
-m, --architecture=MACHINE Specify the target architecture as MACHINE
-j, --section=NAME Only display information for section NAME
-M, --disassembler-options=OPT Pass text OPT on to the disassembler
-EB --endian=big Assume big endian format when disassembling
-EL --endian=little Assume little endian format when disassembling
--file-start-context Include context from start of file (with -S)
-I, --include=DIR Add DIR to search list for source files
-l, --line-numbers Include line numbers and filenames in output
-F, --file-offsets Include file offsets when displaying information
-C, --demangle[=STYLE] Decode mangled/processed symbol names
The STYLE, if specified, can be `auto', `gnu',
`lucid', `arm', `hp', `edg', `gnu-v3', `java'
or `gnat'
-w, --wide Format output for more than 80 columns
-z, --disassemble-zeroes Do not skip blocks of zeroes when disassembling
--start-address=ADDR Only process data whose address is >= ADDR
--stop-address=ADDR Only process data whose address is <= ADDR
--prefix-addresses Print complete address alongside disassembly
--[no-]show-raw-insn Display hex alongside symbolic disassembly
--insn-width=WIDTH Display WIDTH bytes on a single line for -d
--adjust-vma=OFFSET Add OFFSET to all displayed section addresses
--special-syms Include special symbols in symbol dumps
--prefix=PREFIX Add PREFIX to absolute paths for -S
--prefix-strip=LEVEL Strip initial directory names for -S
--dwarf-depth=N Do not display DIEs at depth N or greater
--dwarf-start=N Display DIEs starting with N, at the same depth
or deeper
--dwarf-check Make additional dwarf internal consistency checks.

objdump: supported targets: elf64-x86-64 elf32-i386 elf32-x86-64 a.out-i386-linux pei-i386 pei-x86-64 elf64-l1om elf64-k1om elf64-little elf64-big elf32-little elf32-big plugin srec symbolsrec verilog tekhex binary ihex
objdump: supported architectures: i386 i386:x86-64 i386:x64-32 i8086 i386:intel i386:x86-64:intel i386:x64-32:intel l1om l1om:intel k1om k1om:intel plugin

The following i386/x86-64 specific disassembler options are supported for use
with the -M switch (multiple options should be separated by commas):
x86-64 Disassemble in 64bit mode
i386 Disassemble in 32bit mode
i8086 Disassemble in 16bit mode
att Display instruction in AT&T syntax
intel Display instruction in Intel syntax
att-mnemonic
Display instruction in AT&T mnemonic
intel-mnemonic
Display instruction in Intel mnemonic
addr64 Assume 64bit address size
addr32 Assume 32bit address size
addr16 Assume 16bit address size
data32 Assume 32bit data size
data16 Assume 16bit data size
suffix Always display instruction suffix in AT&T syntax
Report bugs to <http://bugzilla.redhat.com/bugzilla/>.

重要段信息

可执行文件和共享库文件都代表了代表一个程序。为了执行这些程序, 系统必须利用这个文件来建立动态程序表示或者说是进程映像。一个进程映像会在虚拟内存空间进行分段来保存其 代码,数据,栈等。

  • Program Header 描述了一个对象文件和程序直接相关的信息,其定位了文件中的段映像及其他用来创建内存映像必须的信息。
  • Program Loading 给出了一个程序运行必须加载的 对象文件。
  • Dynamic Linking 在系统加载完程序后,其必须完成解析完组成进程的对象文件中的符号引用。