2017-11-30 18:54:38 +07:00
|
|
|
|
# 5.6 LLVM
|
2018-03-12 18:04:31 +07:00
|
|
|
|
|
|
|
|
|
- [简介](#简介)
|
|
|
|
|
- [初步使用](#初步使用)
|
|
|
|
|
- [参考资料](#参考资料)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## 简介
|
|
|
|
|
LLVM 是当今炙手可热的编译器基础框架。它从一开始就采用了模块化设计的思想,使得每一个编译阶段都被独立出来,形成了一系列的库。LLVM 使用面向对象的 C++ 语言开发,为编译器开发人员提供了易用而丰富的编程接口和 API。
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## 初步使用
|
|
|
|
|
首先我们通过著名的 helloWorld 来熟悉下 LLVM 的使用。
|
|
|
|
|
```c
|
|
|
|
|
#include <stdio.h>
|
|
|
|
|
int main()
|
|
|
|
|
{
|
|
|
|
|
printf("hello, world\n");
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
将 C 源码转换成 LLVM 汇编码:
|
|
|
|
|
```
|
|
|
|
|
$ clang -emit-llvm -S hello.c -o hello.ll
|
|
|
|
|
```
|
|
|
|
|
生成的 LLVM IR 如下:
|
|
|
|
|
```
|
|
|
|
|
; ModuleID = 'hello.c'
|
|
|
|
|
source_filename = "hello.c"
|
|
|
|
|
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
|
|
|
|
|
target triple = "x86_64-unknown-linux-gnu"
|
|
|
|
|
|
|
|
|
|
@.str = private unnamed_addr constant [14 x i8] c"hello, world\0A\00", align 1
|
|
|
|
|
|
|
|
|
|
; Function Attrs: noinline nounwind optnone sspstrong uwtable
|
|
|
|
|
define i32 @main() #0 {
|
|
|
|
|
%1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i32 0, i32 0))
|
|
|
|
|
ret i32 0
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
declare i32 @printf(i8*, ...) #1
|
|
|
|
|
|
|
|
|
|
attributes #0 = { noinline nounwind optnone sspstrong uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
|
|
|
|
|
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
|
|
|
|
|
|
|
|
|
|
!llvm.module.flags = !{!0, !1, !2}
|
|
|
|
|
!llvm.ident = !{!3}
|
|
|
|
|
|
|
|
|
|
!0 = !{i32 1, !"wchar_size", i32 4}
|
|
|
|
|
!1 = !{i32 7, !"PIC Level", i32 2}
|
|
|
|
|
!2 = !{i32 7, !"PIE Level", i32 2}
|
|
|
|
|
!3 = !{!"clang version 5.0.1 (tags/RELEASE_501/final)"}
|
|
|
|
|
```
|
|
|
|
|
该过程从词法分析开始,将 C 源码分解成 token 流,然后传递给语法分析器,语法分析器在 CFG(上下文无关文法)的指导下将 token 流组织成 AST(抽象语法树),接下来进行语义分析,检查语义正确性,最后生成 IR。
|
|
|
|
|
|
|
|
|
|
LLVM bitcode 有两部分组成:位流,以及将 LLVM IR 编码成位流的编码格式。使用汇编器 llvm-as 将 LLVM IR 转换成 bitcode:
|
|
|
|
|
```
|
|
|
|
|
$ llvm-as hello.ll -o hello.bc
|
|
|
|
|
```
|
|
|
|
|
结果如下:
|
|
|
|
|
```
|
|
|
|
|
$ file hello.bc
|
|
|
|
|
hello.bc: LLVM IR bitcode
|
|
|
|
|
$ xxd -g1 hello.bc | head -n5
|
|
|
|
|
00000000: 42 43 c0 de 35 14 00 00 05 00 00 00 62 0c 30 24 BC..5.......b.0$
|
|
|
|
|
00000010: 49 59 be 66 ee d3 7e 2d 44 01 32 05 00 00 00 00 IY.f..~-D.2.....
|
|
|
|
|
00000020: 21 0c 00 00 4d 02 00 00 0b 02 21 00 02 00 00 00 !...M.....!.....
|
|
|
|
|
00000030: 13 00 00 00 07 81 23 91 41 c8 04 49 06 10 32 39 ......#.A..I..29
|
|
|
|
|
00000040: 92 01 84 0c 25 05 08 19 1e 04 8b 62 80 10 45 02 ....%......b..E.
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
反过来将 bitcode 转回 LLVM IR 也是可以的,使用反汇编器 llvm-dis:
|
|
|
|
|
```
|
|
|
|
|
$ llvm-dis hello.bc -o hello.ll
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
其实 LLVM 可以利用工具 lli 的即时编译器(JIT)直接执行 bitcode 格式的程序:
|
|
|
|
|
```
|
|
|
|
|
$ lli hello.bc
|
|
|
|
|
hello, world
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
接下来使用静态编译器 llc 命令可以将 bitcode 编译为特定架构的汇编语言:
|
|
|
|
|
```
|
|
|
|
|
$ llc -march=x86-64 hello.bc -o hello.s
|
|
|
|
|
```
|
|
|
|
|
也可以使用 clang 来生成,结果是一样的:
|
|
|
|
|
```
|
|
|
|
|
$ clang -S hello.bc -o hello.s -fomit-frame-pointer
|
|
|
|
|
```
|
|
|
|
|
结果如下:
|
|
|
|
|
```asm
|
|
|
|
|
.text
|
|
|
|
|
.file "hello.c"
|
|
|
|
|
.globl main # -- Begin function main
|
|
|
|
|
.p2align 4, 0x90
|
|
|
|
|
.type main,@function
|
|
|
|
|
main: # @main
|
|
|
|
|
.cfi_startproc
|
|
|
|
|
# BB#0:
|
|
|
|
|
pushq %rbp
|
|
|
|
|
.Lcfi0:
|
|
|
|
|
.cfi_def_cfa_offset 16
|
|
|
|
|
.Lcfi1:
|
|
|
|
|
.cfi_offset %rbp, -16
|
|
|
|
|
movq %rsp, %rbp
|
|
|
|
|
.Lcfi2:
|
|
|
|
|
.cfi_def_cfa_register %rbp
|
|
|
|
|
movabsq $.L.str, %rdi
|
|
|
|
|
movb $0, %al
|
|
|
|
|
callq printf
|
|
|
|
|
xorl %eax, %eax
|
|
|
|
|
popq %rbp
|
|
|
|
|
retq
|
|
|
|
|
.Lfunc_end0:
|
|
|
|
|
.size main, .Lfunc_end0-main
|
|
|
|
|
.cfi_endproc
|
|
|
|
|
# -- End function
|
|
|
|
|
.type .L.str,@object # @.str
|
|
|
|
|
.section .rodata.str1.1,"aMS",@progbits,1
|
|
|
|
|
.L.str:
|
|
|
|
|
.asciz "hello, world\n"
|
|
|
|
|
.size .L.str, 14
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
.ident "clang version 5.0.1 (tags/RELEASE_501/final)"
|
|
|
|
|
.section ".note.GNU-stack","",@progbits
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## 参考资料
|
|
|
|
|
- [llvm documentation](http://llvm.org/docs/index.html)
|