CTF-All-In-One/doc/1.5.3_elf.md
2018-08-05 17:43:10 +08:00

618 lines
27 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# 1.5.3 Linux ELF
- [一个实例](#一个实例)
- [elfdemo.o](#elfdemoo)
- [ELF 文件结构](#elf-文件结构)
- [参考资料](#参考资料)
## 一个实例
*1.5.1节 C语言基础* 中我们看到了从源代码到可执行文件的全过程,现在我们来看一个更复杂的例子。
```c
#include<stdio.h>
int global_init_var = 10;
int global_uninit_var;
void func(int sum) {
printf("%d\n", sum);
}
void main(void) {
static int local_static_init_var = 20;
static int local_static_uninit_var;
int local_init_val = 30;
int local_uninit_var;
func(global_init_var + local_init_val +
local_static_init_var );
}
```
然后分别执行下列命令生成三个文件:
```text
gcc -m32 -c elfDemo.c -o elfDemo.o
gcc -m32 elfDemo.c -o elfDemo.out
gcc -m32 -static elfDemo.c -o elfDemo_static.out
```
使用 ldd 命令打印所依赖的共享库:
```text
$ ldd elfDemo.out
linux-gate.so.1 (0xf77b1000)
libc.so.6 => /usr/lib32/libc.so.6 (0xf7597000)
/lib/ld-linux.so.2 => /usr/lib/ld-linux.so.2 (0xf77b3000)
$ ldd elfDemo_static.out
not a dynamic executable
```
elfDemo_static.out 采用了静态链接的方式。
使用 file 命令查看相应的文件格式:
```text
$ file elfDemo.o
elfDemo.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
$ file elfDemo.out
elfDemo.out: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=50036015393a99344897cbf34099256c3793e172, not stripped
$ file elfDemo_static.out
elfDemo_static.out: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, BuildID[sha1]=276c839c20b4c187e4b486cf96d82a90c40f4dae, not stripped
$ file -L /usr/lib32/libc.so.6
/usr/lib32/libc.so.6: ELF 32-bit LSB shared object, Intel 80386, version 1 (GNU/Linux), dynamically linked, interpreter /usr/lib32/ld-linux.so.2, BuildID[sha1]=ee88d1b2aa81f104ab5645d407e190b244203a52, for GNU/Linux 3.2.0, not stripped
```
于是我们得到了 Linux 可执行文件格式 ELF Executable Linkable Format文件的三种类型
- 可重定位文件Relocatable file
- 包含了代码和数据,可以和其他目标文件链接生成一个可执行文件或共享目标文件。
- elfDemo.o
- 可执行文件Executable File
- 包含了可以直接执行的文件。
- elfDemo_static.out
- 共享目标文件Shared Object File
- 包含了用于链接的代码和数据,分两种情况。一种是链接器将其与其他的可重定位文件和共享目标文件链接起来,生产新的目标文件。另一种是动态链接器将多个共享目标文件与可执行文件结合,作为进程映像的一部分。
- elfDemo.out
- `libc-2.25.so`
此时他们的结构如图:
![img](../pic/1.5.3_elfdemo.png)
可以看到,在这个简化的 ELF 文件中,开头是一个“文件头”,之后分别是代码段、数据段和.bss段。程序源代码编译后执行语句变成机器指令保存在`.text`段;已初始化的全局变量和局部静态变量都保存在`.data`段;未初始化的全局变量和局部静态变量则放在`.bss`段。
把程序指令和程序数据分开存放有许多好处,从安全的角度讲,当程序被加载后,数据和指令分别被映射到两个虚拟区域。由于数据区域对于进程来说是可读写的,而指令区域对于进程来说是只读的,所以这两个虚存区域的权限可以被分别设置成可读写和只读,可以防止程序的指令被改写和利用。
### elfDemo.o
接下来,我们更深入地探索目标文件,使用 objdump 来查看目标文件的内部结构:
```text
$ objdump -h elfDemo.o
elfDemo.o: file format elf32-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 .group 00000008 00000000 00000000 00000034 2**2
CONTENTS, READONLY, GROUP, LINK_ONCE_DISCARD
1 .text 00000078 00000000 00000000 0000003c 2**0
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
2 .data 00000008 00000000 00000000 000000b4 2**2
CONTENTS, ALLOC, LOAD, DATA
3 .bss 00000004 00000000 00000000 000000bc 2**2
ALLOC
4 .rodata 00000004 00000000 00000000 000000bc 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .text.__x86.get_pc_thunk.ax 00000004 00000000 00000000 000000c0 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
6 .comment 00000012 00000000 00000000 000000c4 2**0
CONTENTS, READONLY
7 .note.GNU-stack 00000000 00000000 00000000 000000d6 2**0
CONTENTS, READONLY
8 .eh_frame 0000007c 00000000 00000000 000000d8 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
```
可以看到目标文件中除了最基本的代码段、数据段和 BSS 段以外,还有一些别的段。注意到 .bss 段没有 `CONTENTS` 属性,表示它实际上并不存在,.bss 段只是为为未初始化的全局变量和局部静态变量预留了位置而已。
### 代码段
```text
$ objdump -x -s -d elfDemo.o
......
Sections:
Idx Name Size VMA LMA File off Algn
......
1 .text 00000078 00000000 00000000 0000003c 2**0
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
......
Contents of section .text:
0000 5589e553 83ec04e8 fcffffff 05010000 U..S............
0010 0083ec08 ff75088d 90000000 005289c3 .....u.......R..
0020 e8fcffff ff83c410 908b5dfc c9c38d4c ..........]....L
0030 240483e4 f0ff71fc 5589e551 83ec14e8 $.....q.U..Q....
0040 fcffffff 05010000 00c745f4 1e000000 ..........E.....
0050 8b880000 00008b55 f401ca8b 80040000 .......U........
0060 0001d083 ec0c50e8 fcffffff 83c41090 ......P.........
0070 8b4dfcc9 8d61fcc3 .M...a..
......
Disassembly of section .text:
00000000 <func>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
4: 83 ec 04 sub $0x4,%esp
7: e8 fc ff ff ff call 8 <func+0x8>
8: R_386_PC32 __x86.get_pc_thunk.ax
c: 05 01 00 00 00 add $0x1,%eax
d: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
11: 83 ec 08 sub $0x8,%esp
14: ff 75 08 pushl 0x8(%ebp)
17: 8d 90 00 00 00 00 lea 0x0(%eax),%edx
19: R_386_GOTOFF .rodata
1d: 52 push %edx
1e: 89 c3 mov %eax,%ebx
20: e8 fc ff ff ff call 21 <func+0x21>
21: R_386_PLT32 printf
25: 83 c4 10 add $0x10,%esp
28: 90 nop
29: 8b 5d fc mov -0x4(%ebp),%ebx
2c: c9 leave
2d: c3 ret
0000002e <main>:
2e: 8d 4c 24 04 lea 0x4(%esp),%ecx
32: 83 e4 f0 and $0xfffffff0,%esp
35: ff 71 fc pushl -0x4(%ecx)
38: 55 push %ebp
39: 89 e5 mov %esp,%ebp
3b: 51 push %ecx
3c: 83 ec 14 sub $0x14,%esp
3f: e8 fc ff ff ff call 40 <main+0x12>
40: R_386_PC32 __x86.get_pc_thunk.ax
44: 05 01 00 00 00 add $0x1,%eax
45: R_386_GOTPC _GLOBAL_OFFSET_TABLE_
49: c7 45 f4 1e 00 00 00 movl $0x1e,-0xc(%ebp)
50: 8b 88 00 00 00 00 mov 0x0(%eax),%ecx
52: R_386_GOTOFF global_init_var
56: 8b 55 f4 mov -0xc(%ebp),%edx
59: 01 ca add %ecx,%edx
5b: 8b 80 04 00 00 00 mov 0x4(%eax),%eax
5d: R_386_GOTOFF .data
61: 01 d0 add %edx,%eax
63: 83 ec 0c sub $0xc,%esp
66: 50 push %eax
67: e8 fc ff ff ff call 68 <main+0x3a>
68: R_386_PC32 func
6c: 83 c4 10 add $0x10,%esp
6f: 90 nop
70: 8b 4d fc mov -0x4(%ebp),%ecx
73: c9 leave
74: 8d 61 fc lea -0x4(%ecx),%esp
77: c3 ret
```
`Contents of section .text``.text` 的数据的十六进制形式,总共 0x78 个字节,最左边一列是偏移量,中间 4 列是内容,最右边一列是 ASCII 码形式。下面的 `Disassembly of section .text` 是反汇编结果。
### 数据段和只读数据段
```text
......
Sections:
Idx Name Size VMA LMA File off Algn
2 .data 00000008 00000000 00000000 000000b4 2**2
CONTENTS, ALLOC, LOAD, DATA
4 .rodata 00000004 00000000 00000000 000000bc 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
......
Contents of section .data:
0000 0a000000 14000000 ........
Contents of section .rodata:
0000 25640a00 %d..
.......
```
`.data` 段保存已经初始化了的全局变量和局部静态变量。`elfDemo.c` 中共有两个这样的变量,`global_init_var` 和 `local_static_init_var`,每个变量 4 个字节,一共 8 个字节。由于小端序的原因,`0a000000` 表示 `global_init_var` 值(`10`)的十六进制 `0x0a``14000000` 表示 `local_static_init_var` 值(`20`)的十六进制 `0x14`
`.rodata` 段保存只读数据,包括只读变量和字符串常量。`elfDemo.c` 中调用 `printf` 的时候,用到了一个字符串变量 `%d\n`,它是一种只读数据,保存在 `.rodata` 段中,可以从输出结果看到字符串常量的 ASCII 形式,以 `\0` 结尾。
### BSS段
```text
Sections:
Idx Name Size VMA LMA File off Algn
3 .bss 00000004 00000000 00000000 000000bc 2**2
ALLOC
```
`.bss` 段保存未初始化的全局变量和局部静态变量。
## ELF 文件结构
对象文件参与程序链接构建程序和程序执行运行程序。ELF 结构几相关信息在 `/usr/include/elf.h` 文件中。
![img](../pic/1.5.3_format.png)
- **ELF 文件头ELF Header** 在目标文件格式的最前面,包含了描述整个文件的基本属性。
- **程序头表Program Header Table** 是可选的,它告诉系统怎样创建一个进程映像。可执行文件必须有程序头表,而重定位文件不需要。
- **段Section** 包含了链接视图中大量的目标文件信息。
- **段表Section Header Table** 包含了描述文件中所有段的信息。
### 32位数据类型
| 名称 | 长度 | 对其 | 描述 | 原始类型 |
| --- | --- | --- | --- | --- |
| Elf32_Addr | 4 | 4 | 无符号程序地址 | uint32_t |
| Elf32_Half | 2 | 2 | 无符号短整型 | uint16_t |
| Elf32_Off | 4 | 4 | 无符号偏移地址 | uint32_t |
| Elf32_Sword | 4 | 4 | 有符号整型 | int32_t |
| Elf32_Word | 4 | 4 | 无符号整型 | uint32_t |
### 文件头
ELF 文件头必然存在于 ELF 文件的开头,表明这是一个 ELF 文件。定义如下:
```C
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf32_Half e_type; /* Object file type */
Elf32_Half e_machine; /* Architecture */
Elf32_Word e_version; /* Object file version */
Elf32_Addr e_entry; /* Entry point virtual address */
Elf32_Off e_phoff; /* Program header table file offset */
Elf32_Off e_shoff; /* Section header table file offset */
Elf32_Word e_flags; /* Processor-specific flags */
Elf32_Half e_ehsize; /* ELF header size in bytes */
Elf32_Half e_phentsize; /* Program header table entry size */
Elf32_Half e_phnum; /* Program header table entry count */
Elf32_Half e_shentsize; /* Section header table entry size */
Elf32_Half e_shnum; /* Section header table entry count */
Elf32_Half e_shstrndx; /* Section header string table index */
} Elf32_Ehdr;
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf64_Half e_type; /* Object file type */
Elf64_Half e_machine; /* Architecture */
Elf64_Word e_version; /* Object file version */
Elf64_Addr e_entry; /* Entry point virtual address */
Elf64_Off e_phoff; /* Program header table file offset */
Elf64_Off e_shoff; /* Section header table file offset */
Elf64_Word e_flags; /* Processor-specific flags */
Elf64_Half e_ehsize; /* ELF header size in bytes */
Elf64_Half e_phentsize; /* Program header table entry size */
Elf64_Half e_phnum; /* Program header table entry count */
Elf64_Half e_shentsize; /* Section header table entry size */
Elf64_Half e_shnum; /* Section header table entry count */
Elf64_Half e_shstrndx; /* Section header string table index */
} Elf64_Ehdr;
```
`e_ident` 保存着 ELF 的幻数和其他信息,最前面四个字节是幻数,用字符串表示为 `\177ELF`,其后的字节如果是 32 位则是 ELFCLASS32 (1),如果是 64 位则是 ELFCLASS64 (2),再其后的字节表示端序,小端序为 ELFDATA2LSB (1),大端序为 ELFDATA2LSB (2)。最后一个字节则表示 ELF 的版本。
现在我们使用 readelf 命令来查看 elfDome.out 的文件头:
```text
$ readelf -h elfDemo.out
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x3e0
Start of program headers: 52 (bytes into file)
Start of section headers: 6288 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 30
Section header string table index: 29
```
### 程序头
程序头表是由 ELF 头的 `e_phoff` 指定的偏移量和 `e_phentsize`、`e_phnum` 共同确定大小的表格组成。`e_phentsize` 表示表格中程序头的大小,`e_phnum` 表示表格中程序头的数量。
程序头的定义如下:
```C
typedef struct
{
Elf32_Word p_type; /* Segment type */
Elf32_Off p_offset; /* Segment file offset */
Elf32_Addr p_vaddr; /* Segment virtual address */
Elf32_Addr p_paddr; /* Segment physical address */
Elf32_Word p_filesz; /* Segment size in file */
Elf32_Word p_memsz; /* Segment size in memory */
Elf32_Word p_flags; /* Segment flags */
Elf32_Word p_align; /* Segment alignment */
} Elf32_Phdr;
typedef struct
{
Elf64_Word p_type; /* Segment type */
Elf64_Word p_flags; /* Segment flags */
Elf64_Off p_offset; /* Segment file offset */
Elf64_Addr p_vaddr; /* Segment virtual address */
Elf64_Addr p_paddr; /* Segment physical address */
Elf64_Xword p_filesz; /* Segment size in file */
Elf64_Xword p_memsz; /* Segment size in memory */
Elf64_Xword p_align; /* Segment alignment */
} Elf64_Phdr;
```
使用 readelf 来查看程序头:
```text
$ readelf -l elfDemo.out
Elf file type is DYN (Shared object file)
Entry point 0x3e0
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00000034 0x00000034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x00000154 0x00000154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x00000000 0x00000000 0x00780 0x00780 R E 0x1000
LOAD 0x000ef4 0x00001ef4 0x00001ef4 0x00130 0x0013c RW 0x1000
DYNAMIC 0x000efc 0x00001efc 0x00001efc 0x000f0 0x000f0 RW 0x4
NOTE 0x000168 0x00000168 0x00000168 0x00044 0x00044 R 0x4
GNU_EH_FRAME 0x000624 0x00000624 0x00000624 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
GNU_RELRO 0x000ef4 0x00001ef4 0x00001ef4 0x0010c 0x0010c R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .dynamic .got
```
### 段
段表Section Header Table是一个以 `Elf32_Shdr` 结构体为元素的数组每个结构体对应一个段它描述了各个段的信息。ELF 文件头的 `e_shoff` 成员给出了段表在 ELF 中的偏移,`e_shnum` 成员给出了段描述符的数量,`e_shentsize` 给出了每个段描述符的大小。
```C
typedef struct
{
Elf32_Word sh_name; /* Section name (string tbl index) */
Elf32_Word sh_type; /* Section type */
Elf32_Word sh_flags; /* Section flags */
Elf32_Addr sh_addr; /* Section virtual addr at execution */
Elf32_Off sh_offset; /* Section file offset */
Elf32_Word sh_size; /* Section size in bytes */
Elf32_Word sh_link; /* Link to another section */
Elf32_Word sh_info; /* Additional section information */
Elf32_Word sh_addralign; /* Section alignment */
Elf32_Word sh_entsize; /* Entry size if section holds table */
} Elf32_Shdr;
typedef struct
{
Elf64_Word sh_name; /* Section name (string tbl index) */
Elf64_Word sh_type; /* Section type */
Elf64_Xword sh_flags; /* Section flags */
Elf64_Addr sh_addr; /* Section virtual addr at execution */
Elf64_Off sh_offset; /* Section file offset */
Elf64_Xword sh_size; /* Section size in bytes */
Elf64_Word sh_link; /* Link to another section */
Elf64_Word sh_info; /* Additional section information */
Elf64_Xword sh_addralign; /* Section alignment */
Elf64_Xword sh_entsize; /* Entry size if section holds table */
} Elf64_Shdr;
```
使用 readelf 命令查看目标文件中完整的段:
```text
$ readelf -S elfDemo.o
There are 15 section headers, starting at offset 0x41c:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .group GROUP 00000000 000034 000008 04 12 16 4
[ 2] .text PROGBITS 00000000 00003c 000078 00 AX 0 0 1
[ 3] .rel.text REL 00000000 000338 000048 08 I 12 2 4
[ 4] .data PROGBITS 00000000 0000b4 000008 00 WA 0 0 4
[ 5] .bss NOBITS 00000000 0000bc 000004 00 WA 0 0 4
[ 6] .rodata PROGBITS 00000000 0000bc 000004 00 A 0 0 1
[ 7] .text.__x86.get_p PROGBITS 00000000 0000c0 000004 00 AXG 0 0 1
[ 8] .comment PROGBITS 00000000 0000c4 000012 01 MS 0 0 1
[ 9] .note.GNU-stack PROGBITS 00000000 0000d6 000000 00 0 0 1
[10] .eh_frame PROGBITS 00000000 0000d8 00007c 00 A 0 0 4
[11] .rel.eh_frame REL 00000000 000380 000018 08 I 12 10 4
[12] .symtab SYMTAB 00000000 000154 000140 10 13 13 4
[13] .strtab STRTAB 00000000 000294 0000a2 00 0 0 1
[14] .shstrtab STRTAB 00000000 000398 000082 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
p (processor specific)
```
注意ELF 段表的第一个元素是被保留的,类型为 NULL。
### 字符串表
字符串表以段的形式存在,包含了以 null 结尾的字符序列。对象文件使用这些字符串来表示符号和段名称引用字符串时只需给出在表中的偏移即可。字符串表的第一个字符和最后一个字符为空字符以确保所有字符串的开始和终止。通常段名为 `.strtab` 的字符串表是 **字符串表Strings Table**,段名为 `.shstrtab` 的是段表字符串表Section Header String Table
| 偏移 | +0 | +1 | +2 | +3 | +4 | +5 | +6 | +7 | +8 | +9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **+0** | \0 | h | e | l | l | o | \0 | w | o | r |
| **+10** | l | d | \0 | h | e | l | l | o | w | o |
| **+20** | r | l | d | \0 |
| 偏移 | 字符串 |
| --- | --- |
| 0 | 空字符串 |
| 1 | hello |
| 7 | world |
| 13 | helloworld |
| 18 | world |
可以使用 readelf 读取这两个表:
```text
$ readelf -x .strtab elfDemo.o
Hex dump of section '.strtab':
0x00000000 00656c66 44656d6f 2e63006c 6f63616c .elfDemo.c.local
0x00000010 5f737461 7469635f 696e6974 5f766172 _static_init_var
0x00000020 2e323139 35006c6f 63616c5f 73746174 .2195.local_stat
0x00000030 69635f75 6e696e69 745f7661 722e3231 ic_uninit_var.21
0x00000040 39360067 6c6f6261 6c5f696e 69745f76 96.global_init_v
0x00000050 61720067 6c6f6261 6c5f756e 696e6974 ar.global_uninit
0x00000060 5f766172 0066756e 63005f5f 7838362e _var.func.__x86.
0x00000070 6765745f 70635f74 68756e6b 2e617800 get_pc_thunk.ax.
0x00000080 5f474c4f 42414c5f 4f464653 45545f54 _GLOBAL_OFFSET_T
0x00000090 41424c45 5f007072 696e7466 006d6169 ABLE_.printf.mai
0x000000a0 6e00
$ readelf -x .shstrtab elfDemo.o
Hex dump of section '.shstrtab':
0x00000000 002e7379 6d746162 002e7374 72746162 ..symtab..strtab
0x00000010 002e7368 73747274 6162002e 72656c2e ..shstrtab..rel.
0x00000020 74657874 002e6461 7461002e 62737300 text..data..bss.
0x00000030 2e726f64 61746100 2e746578 742e5f5f .rodata..text.__
0x00000040 7838362e 6765745f 70635f74 68756e6b x86.get_pc_thunk
0x00000050 2e617800 2e636f6d 6d656e74 002e6e6f .ax..comment..no
0x00000060 74652e47 4e552d73 7461636b 002e7265 te.GNU-stack..re
0x00000070 6c2e6568 5f667261 6d65002e 67726f75 l.eh_frame..grou
0x00000080 7000
```
### 符号表
目标文件的符号表保存了定位和重定位程序的符号定义和引用所需的信息。符号表索引是这个数组的下标。索引0指向表中的第一个条目,作为未定义的符号索引。
```C
typedef struct
{
Elf32_Word st_name; /* Symbol name (string tbl index) */
Elf32_Addr st_value; /* Symbol value */
Elf32_Word st_size; /* Symbol size */
unsigned char st_info; /* Symbol type and binding */
unsigned char st_other; /* Symbol visibility */
Elf32_Section st_shndx; /* Section index */
} Elf32_Sym;
typedef struct
{
Elf64_Word st_name; /* Symbol name (string tbl index) */
unsigned char st_info; /* Symbol type and binding */
unsigned char st_other; /* Symbol visibility */
Elf64_Section st_shndx; /* Section index */
Elf64_Addr st_value; /* Symbol value */
Elf64_Xword st_size; /* Symbol size */
} Elf64_Sym;
```
查看符号表:
```text
$ readelf -s elfDemo.o
Symbol table '.symtab' contains 20 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 FILE LOCAL DEFAULT ABS elfDemo.c
2: 00000000 0 SECTION LOCAL DEFAULT 2
3: 00000000 0 SECTION LOCAL DEFAULT 4
4: 00000000 0 SECTION LOCAL DEFAULT 5
5: 00000000 0 SECTION LOCAL DEFAULT 6
6: 00000004 4 OBJECT LOCAL DEFAULT 4 local_static_init_var.219
7: 00000000 4 OBJECT LOCAL DEFAULT 5 local_static_uninit_var.2
8: 00000000 0 SECTION LOCAL DEFAULT 7
9: 00000000 0 SECTION LOCAL DEFAULT 9
10: 00000000 0 SECTION LOCAL DEFAULT 10
11: 00000000 0 SECTION LOCAL DEFAULT 8
12: 00000000 0 SECTION LOCAL DEFAULT 1
13: 00000000 4 OBJECT GLOBAL DEFAULT 4 global_init_var
14: 00000004 4 OBJECT GLOBAL DEFAULT COM global_uninit_var
15: 00000000 46 FUNC GLOBAL DEFAULT 2 func
16: 00000000 0 FUNC GLOBAL HIDDEN 7 __x86.get_pc_thunk.ax
17: 00000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
18: 00000000 0 NOTYPE GLOBAL DEFAULT UND printf
19: 0000002e 74 FUNC GLOBAL DEFAULT 2 main
```
### 重定位
重定位是连接符号定义与符号引用的过程。可重定位文件必须具有描述如何修改段内容的信息,从而运行可执行文件和共享对象文件保存进程程序映像的正确信息。
```C
typedef struct
{
Elf32_Addr r_offset; /* Address */
Elf32_Word r_info; /* Relocation type and symbol index */
} Elf32_Rel;
typedef struct
{
Elf64_Addr r_offset; /* Address */
Elf64_Xword r_info; /* Relocation type and symbol index */
Elf64_Sxword r_addend; /* Addend */
} Elf64_Rela;
```
查看重定位表:
```text
$ readelf -r elfDemo.o
Relocation section '.rel.text' at offset 0x338 contains 9 entries:
Offset Info Type Sym.Value Sym. Name
00000008 00001002 R_386_PC32 00000000 __x86.get_pc_thunk.ax
0000000d 0000110a R_386_GOTPC 00000000 _GLOBAL_OFFSET_TABLE_
00000019 00000509 R_386_GOTOFF 00000000 .rodata
00000021 00001204 R_386_PLT32 00000000 printf
00000040 00001002 R_386_PC32 00000000 __x86.get_pc_thunk.ax
00000045 0000110a R_386_GOTPC 00000000 _GLOBAL_OFFSET_TABLE_
00000052 00000d09 R_386_GOTOFF 00000000 global_init_var
0000005d 00000309 R_386_GOTOFF 00000000 .data
00000068 00000f02 R_386_PC32 00000000 func
Relocation section '.rel.eh_frame' at offset 0x380 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
00000020 00000202 R_386_PC32 00000000 .text
00000044 00000202 R_386_PC32 00000000 .text
00000070 00000802 R_386_PC32 00000000 .text.__x86.get_pc_thu
```
## 参考资料
- `$ man elf`
- [Acronyms relevant to Executable and Linkable Format (ELF)](https://www.cs.stevens.edu/~jschauma/631/elf.html)