Adapted from SEED Labs: A Hands-on Lab for Security Education.
A buffer overflow is the act of writing data beyond the boundary of an allocated memory region (e.g., a buffer). A malicious user can use this vulnerability to alter the control flow of the program, leading to the execution of malicious code. The objective of this lab is for students to gain practical insight into this type of vulnerability and to learn how to exploit it.
In this lab, students will be given a program with a buffer-overflow vulnerability; their task is to develop a scheme to exploit the vulnerability, and ultimately gain root privileges on the system. In addition to the attacks we study, students will be guided through several protection schemes that have been implemented in the operating system as countermeasures to buffer-overflow attacks. Students will evaluate how the schemes work as well as their potential limitations.
This lab covers the following topics:
This lab is an adaptation of the SEED Labs “Buffer Overflow Attack Lab”. (Specifically, the Set-UID version.)
The files for this lab are in the 03_buffer_overflow/ directory of our class’s GitHub repository.
Modern operating systems have implemented several security mechanisms to make the buffer-overflow attack difficult. To simplify our attacks, we need to disable them first. Later on, we will enable them and see whether our attack can still be successful or not.
Ubuntu and several other Linux-based systems use ASLR (address space layout randomization) to randomize the starting addresses of the heap and stack. This makes guessing exact addresses difficult, and guessing addresses is one of the critical steps of a buffer-overflow attack. This feature can be disabled using the following command:
$ sudo sysctl -w kernel.randomize_va_space=0
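To confirm the current setting before and after running the command, one option is to read the same kernel parameter back from /proc. The sketch below assumes a Linux system that exposes /proc/sys/kernel/randomize_va_space; the helper names (describe_aslr, read_aslr_setting) are illustrative and are not part of the lab materials.

```python
from pathlib import Path

# Meanings of kernel.randomize_va_space on Linux.
ASLR_MODES = {
    0: "disabled (no randomization)",
    1: "partial (stack, mmap base, and shared libraries randomized)",
    2: "full (partial mode plus heap randomization)",
}

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")


def describe_aslr(value: int) -> str:
    """Translate a randomize_va_space value into a readable description."""
    return ASLR_MODES.get(value, f"unknown value {value}")


def read_aslr_setting(path: Path = ASLR_PATH) -> int:
    """Read the current setting; reading does not require root privileges."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    if ASLR_PATH.exists():
        current = read_aslr_setting()
        print(f"randomize_va_space = {current}: {describe_aslr(current)}")
    else:
        print("This system does not expose randomize_va_space under /proc.")
```

For the first tasks in this lab the value should be 0; the later ASLR task sets it to 2.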
/bin/sh
In recent versions of Ubuntu, the /bin/sh symbolic link points to the /bin/dash shell.
The dash and bash shells implement a security countermeasure that prevents them from running with elevated privileges in a set-uid process.
If they detect that they are being executed in a set-uid process, they immediately change the effective user ID to the process’s real user ID, effectively dropping any elevated privileges.
The victim of many of our attacks in this lab is a set-uid program, and our attack relies on running /bin/sh; the countermeasure in /bin/dash makes our attack more difficult.
Therefore, we will link /bin/sh to another shell that does not have this countermeasure.
(In later tasks, we will show that with a little more effort, the countermeasure in /bin/dash can be easily defeated!)
We have installed a shell program called zsh in our Ubuntu 20.04 VM.
The following command can be used to link /bin/sh to /bin/zsh:
$ sudo ln -sf /bin/zsh /bin/sh
You can verify how /bin/sh is configured at any time:
$ ls -l /bin/sh /bin/zsh /bin/dash
The following program has a buffer-overflow vulnerability.
Your main objective throughout parts of this lab will be to exploit this vulnerability and get a shell with root privileges.
The program first reads input from a file called badfile, and ultimately passes this input to another buffer in the function bof().
The original input can have a maximum length of 517 bytes, but the buffer in bof() is only BUF_SIZE bytes long, which is less than 517.
Because strcpy() does not check boundaries, a buffer overflow can occur.
In this lab, this program will be compiled and run as a root-owned set-uid program; if a normal user can exploit this buffer-overflow vulnerability, the user might be able to get a root shell.
Note that the program gets its input from a file called badfile.
The contents of this file are specified by an untrusted user (you!).
Thus, your objective is to create the badfile with the necessary contents such that, when the vulnerable program copies the contents into its buffer, a root shell gets spawned.
When compiling the above program for this task, we must not forget to turn off the StackGuard (-fno-stack-protector) and the non-executable stack (-z execstack) countermeasures.
After the compilation, we need to make the program a root-owned set-uid program.
We can achieve this by first changing the ownership of the program to root, and then changing the permissions for the executable to 4755, which enables the set-uid bit.
Note that changing ownership must be done before enabling the set-uid bit, because changing ownership causes the set-uid bit to be turned off.
In summary, a command sequence such as this will yield the desired setup:
$ gcc -DBUF_SIZE=100 -m32 -o stack -z execstack -fno-stack-protector stack.c
$ sudo chown root stack # change owner to root
$ sudo chmod 4755 stack # flip the set-uid bit
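For readers unfamiliar with the octal mode, the following sketch breaks 4755 into its parts using Python's standard stat module. It only decodes the number and does not touch any file; the variable names are illustrative.

```python
import stat

MODE = 0o4755  # the mode passed to chmod above

# The leading 4 is the set-uid bit; 755 is rwxr-xr-x.
has_setuid = bool(MODE & stat.S_ISUID)
permission_bits = MODE & 0o777

# stat.filemode() renders a mode the way ls -l does; S_IFREG marks a regular file.
listing = stat.filemode(stat.S_IFREG | MODE)

print(f"set-uid bit set: {has_setuid}")
print(f"permission bits: {oct(permission_bits)}")
print(f"ls -l style:     {listing}")
```

The s in the owner's execute position of the rendered string is the same marker that ls -l shows for a set-uid executable.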
The compilation and setup commands are already included in the Makefile, so you just need to type make in this directory to execute the needed commands.
Note that BUF_SIZE=100 here is just an example. In the Makefile there are variables L1, ..., L4, which are used during the compilation. This program and Makefile can actually be configured in different ways, mostly aimed at varying the buffer size used in the program. I have configured these values for you to ensure that you compile the program for various tasks with the correct buffer size. DO NOT CHANGE THESE VALUES!
When you compile with the Makefile, it will generate 3 different executables: stack-L1, stack-L2, and stack-L3. You will ONLY use stack-L1 in this lab. When I was creating this lab, I removed the tasks that used Level 2 and Level 3.
This lab has been tested on the pre-built SEED VM (Ubuntu 20.04 VM).
The ultimate goal of the buffer-overflow attacks we’ll study in this lab is to inject malicious code into the target program, so the code can be executed using the target program’s privileges (yes, we’ll target root-owned set-uid programs as in labs past!). Shellcode is widely used in most code-injection attacks. In this task we will spend some time getting familiar with shellcode.
In class, we walked through how the 32-bit shellcode works. The 64-bit shellcode is quite similar, except that the register names are different and the registers used by the execve() system call are also different.
In this task, you will examine different versions of shellcode.
Specifically, the code above includes two copies of shellcode: one is the 32-bit shellcode and the other is the 64-bit shellcode.
When we compile the program using the -m32 flag, the 32-bit version will be used; without this flag, the 64-bit version will be used.
Using the provided Makefile, you can compile the code by typing make in that directory.
The Makefile will produce two binaries: a32.out (32-bit shellcode) and a64.out (64-bit shellcode).
Please compile and run both executables, and describe your observations.
Please briefly describe what this program is doing; i.e., what does the code in main() actually do?
NOTE: If you look at the Makefile you can see that we use the execstack option when compiling the programs, which allows code to be executed from the stack; without this option, the program will fail.
In this task, you need to compile the vulnerable program into a 32-bit binary called stack-L1.
The compilation and setup commands are already included in the Makefile.
For each “level” we explore, our Makefile will also compile a version of the executable suitable for debugging (e.g., stack-L1-dbg).
To exploit the buffer-overflow vulnerability in the target program, the most important thing to know is the distance between the buffer’s starting position and the place where the return address is stored.
We will use a debugging method to determine this value.
Since we have the source code of the target program, we can compile it with the debugging flag (-g) turned on, which makes debugging a lot more convenient.
(You should be using our provided Makefile. If you are, when you run make, the debugging version should be created automatically.)
We will use gdb (gdb cheatsheet!) to debug stack-L1-dbg.
Before running the program under gdb, we need to create a file called badfile.
Now, use gdb as follows to determine the buffer/ebp offset, which you can use to determine where the return address should be in memory.
$ touch badfile # <= Create an empty badfile
$ gdb stack-L1-dbg
gdb-peda$ b bof # <= Set a break point at function bof()
Breakpoint 1 at 0x124d: file stack.c, line 18.
gdb-peda$ run # <= Start executing the program
...
Breakpoint 1, bof (str=0xffffcf57 ...) at stack.c:18
18 {
gdb-peda$ next # <= See the note below
...
22 strcpy(buffer, str);
gdb-peda$ p $ebp # <= Get the ebp value
$1 = (void *) ADDR1
gdb-peda$ p &buffer # <= Get the buffer's address
$2 = (char (*)[100]) ADDR2
gdb-peda$ p/d ADDR1 - ADDR2
$3 = ???
gdb-peda$ quit # <= exit
Note: Getting bof()’s ebp
When gdb stops inside the bof() function, it stops before the ebp register is set to point to the current stack frame, so if we print out the value of ebp here, we will get the caller’s ebp value. Thus, we need to use next to execute a few instructions and stop after the ebp register is modified to point to the stack frame of the bof() function; i.e., we need to get past the function prologue to ensure that ebp is set to the callee’s frame pointer, not the caller’s frame pointer.
Note: The Woes of Using A Debugger
It should be noted that the frame pointer value obtained when using gdb is different from that during the actual execution (without using gdb). This is because gdb has pushed some environment data onto the stack before running the debugged program. When the program runs directly without using gdb, the stack does not have that data, so the actual frame pointer value will be “larger” (i.e., higher in memory). You should keep this in mind when constructing your payload.
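The arithmetic behind the gdb session can be sketched as follows. The two addresses are made-up placeholders standing in for ADDR1 and ADDR2, not values you should expect to see. The only fact relied on is that, in a 32-bit frame that uses ebp as the frame pointer, the saved return address sits one 4-byte word above the address held in ebp. The function name is illustrative.

```python
# Hypothetical values; substitute what gdb prints for $ebp and &buffer.
EBP_ADDR = 0xFFFF1040      # stand-in for ADDR1
BUFFER_ADDR = 0xFFFF1000   # stand-in for ADDR2

WORD_SIZE = 4  # bytes per stack slot in a 32-bit (-m32) binary


def return_address_offset(ebp: int, buffer_start: int, word_size: int = WORD_SIZE) -> int:
    """Distance in bytes from the start of the buffer to the saved return address."""
    if ebp < buffer_start:
        raise ValueError("expected the buffer to sit below ebp on the stack")
    # [ebp] holds the caller's saved ebp; the return address is one word above it.
    return (ebp - buffer_start) + word_size


ebp_distance = EBP_ADDR - BUFFER_ADDR
offset = return_address_offset(EBP_ADDR, BUFFER_ADDR)
print(f"ebp - &buffer         = {ebp_distance}")
print(f"return address offset = {offset}")
```

Because the offset is a difference between two addresses in the same frame, it should stay the same inside and outside gdb even though the absolute addresses shift.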
To exploit the buffer-overflow vulnerability in the target program, you need to prepare a payload and save it inside badfile.
One could do this manually (sounds tedious…) or use another program, such as a Python script, to help us make our badfile (yay, Python!).
For this lab, we provide a skeleton program called exploit.py.
Note that the code is incomplete, however, and students need to replace some of the essential values in the code to generate a suitable badfile.
After you finish the above program, run it.
This will generate the contents for your badfile.
Then run the vulnerable program for this task.
If your exploit is implemented correctly, you should be able to get a root shell!
$ ./exploit.py # create the badfile
$ ./stack-L1 # launch the attack by running the vulnerable program
# <---------------- Bingo! Root shell! (You can also verify with commands like `id`)
In your lab report, in addition to providing screenshots and/or code/command snippets to demonstrate your investigation (Task 2.1) and attack (Task 2.2), you also need to explain how the values used in your exploit.py were decided.
Since we provide the skeleton code (exploit.py), these values really are the most important part of the attack; a detailed explanation verifies that you understand what is going on here.
To be clear, only demonstrating a successful attack without explaining why/how the attack works will not receive full credit.
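To make the role of each value concrete, here is a minimal sketch of how a payload of this kind is commonly laid out. This is not the provided exploit.py. The shellcode is a placeholder, and offset, ret_addr, and start are hypothetical numbers chosen only so the example runs; the real values have to come from your own investigation.

```python
PAYLOAD_SIZE = 517          # maximum input length read from badfile
NOP = 0x90                  # x86 no-op instruction

# Placeholder bytes only; a real attack would use working shellcode here.
shellcode = b"\xCC" * 24

# Hypothetical values: each must be derived from your own investigation.
offset = 68                 # bytes from buffer start to the saved return address
ret_addr = 0xFFFF1144       # an address expected to land inside the NOP sled;
                            # no byte may be zero, because strcpy() stops at a zero byte
start = PAYLOAD_SIZE - len(shellcode)   # place the shellcode at the very end


def build_payload(shellcode: bytes, start: int, offset: int, ret_addr: int) -> bytes:
    """Lay out a NOP sled plus shellcode and overwrite the return-address slot."""
    content = bytearray([NOP] * PAYLOAD_SIZE)
    content[start:start + len(shellcode)] = shellcode
    # x86 is little-endian, so the 4-byte address is stored least significant byte first.
    content[offset:offset + 4] = ret_addr.to_bytes(4, byteorder="little")
    return bytes(content)


payload = build_payload(shellcode, start, offset, ret_addr)
print(len(payload), payload[offset:offset + 4].hex())
```

Writing the result to badfile would be a single binary write (open("badfile", "wb")). Filling the unused space with NOPs means the return address only has to land somewhere before the shellcode, which leaves some slack for the gdb address shift noted earlier.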
dash’s Countermeasure
The dash shell in the Ubuntu OS drops privileges when it detects that the effective UID is not equal to the real UID (i.e., EUID != RUID), which is the case in a set-uid program.
This is achieved by changing the effective UID back to the real UID, essentially dropping any elevated privilege.
In previous tasks, we let /bin/sh point to another shell called zsh, which does not implement this countermeasure.
In this task, we will change our shell back to dash, and see how we can defeat this countermeasure.
First, set your shell back to dash:
$ sudo ln -sf /bin/dash /bin/sh
To defeat the countermeasure in our buffer-overflow attacks, all we need to do is change the real UID so that it equals the effective UID.
When a root-owned set-uid program runs, the effective UID is zero, so before we invoke the shell program, we just need to change the real UID to zero (which we can do… because at the time that we do this we are effectively running as root!).
We can achieve this by invoking setuid(0) before executing execve() in the shellcode.
The assembly code to do this is already inside the call_shellcode.c code (it is commented out at the top of the file).
You just need to add it to the beginning of the shellcode.
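The reasoning can be summarized with a small model. The function below is a simplified illustration of the check described above, not dash's actual source code, and the UID 1000 for the normal user is an assumed example value.

```python
def uid_after_shell_start(real_uid: int, effective_uid: int) -> int:
    """Model of the countermeasure: return the effective UID the shell ends up with."""
    if effective_uid != real_uid:
        # Set-uid situation detected: fall back to the real UID.
        return real_uid
    return effective_uid


NORMAL_USER = 1000  # assumed UID of the unprivileged user
ROOT = 0

# Without setuid(0): RUID=1000 and EUID=0, so the shell drops back to 1000.
without_setuid = uid_after_shell_start(NORMAL_USER, ROOT)

# With setuid(0) run first: as root, this sets the real UID to 0 as well.
with_setuid = uid_after_shell_start(ROOT, ROOT)

print(f"effective UID without setuid(0): {without_setuid}")
print(f"effective UID with setuid(0):    {with_setuid}")
```

In the second case the two UIDs already match, so the shell has nothing to drop.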
Compile call_shellcode.c into a root-owned binary. The Makefile in the shellcode/ folder on GitHub has a target that you can use by running: make setuid
Run both the a32.out and a64.out shellcode programs with and without the assembly that makes the setuid(0) system call.
Please describe your observations and provide supporting evidence.
Now, using the updated shellcode from the previous task, we can attempt the attack again on the vulnerable program, and this time, with the shell’s countermeasure turned on. Repeat your attack on the Level 1 executable (Task 2), and see whether you can get a root shell. (Hint: you should be able to!)
After getting a root shell, please run the following commands to prove that (1) you are using a shell with the countermeasure, and (2) you are running in a shell as root.
# ls -l /bin/sh /bin/zsh /bin/dash
# id
Repeating the attacks on Level 2 and beyond is not required, but please do feel free to do that and see whether those attacks work!
On 32-bit Linux machines, stacks only have 19 bits of entropy, which means the base address for the stack can have \(2^{19} = 524,288\) possibilities. This number is not that high and can be exhausted easily with a brute-force approach. In this task, we use such an approach to defeat the ASLR countermeasure on our 32-bit VM.
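To get a feel for what "exhausted easily" means, the sketch below works through the numbers. It assumes a simplified model in which every run independently picks one of the 2^19 stack bases with equal probability and exactly one of them makes the attack succeed; in practice a NOP sled may let more than one base work, so these figures are rough upper estimates. The function name is illustrative.

```python
import math

ENTROPY_BITS = 19
possibilities = 2 ** ENTROPY_BITS          # 524288 possible stack bases
p_success = 1 / possibilities              # chance that a single run guesses right


def tries_for_confidence(confidence: float, p: float = p_success) -> int:
    """Smallest number of independent runs whose success probability reaches `confidence`."""
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


expected_tries = possibilities             # mean of a geometric distribution is 1/p
print(f"possible stack bases: {possibilities}")
print(f"expected tries:       {expected_tries}")
print(f"tries for 50% chance: {tries_for_confidence(0.5)}")
print(f"tries for 95% chance: {tries_for_confidence(0.95)}")
```

Multiplying a try count by the time a single run takes on your VM gives a rough estimate of how long the brute-force attack needs.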
First, turn on ASLR using the following command:
$ sudo /sbin/sysctl -w kernel.randomize_va_space=2
Then, run the same kind of attack as before against stack-L1.
Please describe your observations and provide supporting evidence.
Now, we can use a brute-force approach to attack the vulnerable program repeatedly, hoping that the address we put in the badfile will eventually be correct…
For this task, you can use the following shell script to invoke the vulnerable program repeatedly (i.e., in an infinite loop!).
If your attack succeeds, the script will stop; otherwise, it will keep running.
Please describe your observations and provide supporting evidence.
Please be patient, as this may take a few minutes… but if it turns out that you are very unlucky, it may take longer… If this is the case, please don’t invite me out to the casinos with you… :-)
You only need to try this on stack-L1, which is a 32-bit program. A brute-force attack on 64-bit programs is much harder, because the entropy is much larger. Although this is not required, feel free to try it just for fun. Maybe let it run overnight? Who knows, you may get lucky!
In this task we will explore some of the other countermeasures that exist to defend against buffer overflow attacks.
Many compilers, such as gcc, implement a security mechanism called StackGuard to prevent buffer overflows.
In the presence of this protection, the buffer overflow attacks we’ve studied in this lab will not work.
In our previous tasks, we disabled the StackGuard protection mechanism when compiling the programs.
In this task, we will turn it back on and see what happens.
Before diving into this task, remember to turn off the address randomization if it is still enabled!
First, repeat the Level-1 attack (Task 2) with StackGuard off, and make sure that the attack is still successful.
Then, turn on the StackGuard protection by recompiling the vulnerable stack.c program without the -fno-stack-protector flag.
(In gcc version 4.3.3 and above, StackGuard is enabled by default.)
Now, conduct your attack again.
Please describe your observations and provide supporting evidence.
In the past, operating systems did allow executable stacks, but this is not common today.
In the Ubuntu OS, the binary images of programs (and shared libraries) must declare whether they require executable stacks or not; i.e., they need to mark a field in the program header of the ELF binary.
The kernel and dynamic linker can use this information to decide whether to make the stack of the running program executable or non-executable.
Specifying this information is done automatically by our compiler, gcc, which by default makes the stack non-executable.
While a non-executable stack is the default setting these days, we can explicitly request one using the -z noexecstack flag when we compile the program.
In our previous tasks, we used -z execstack to make stacks executable.
In this task, we will make the stack non-executable.
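The field mentioned above is the flags word of the PT_GNU_STACK program-header entry. As a rough illustration, the sketch below reads that entry directly from an ELF image using only the standard ELF header layout. The function name stack_is_executable is illustrative, and the script is a teaching aid rather than a replacement for proper tooling.

```python
import struct
import sys
from typing import Optional

PT_GNU_STACK = 0x6474E551   # program-header type that carries the stack permissions
PF_X = 0x1                  # "executable" bit in p_flags


def stack_is_executable(elf: bytes) -> Optional[bool]:
    """Return True/False from PT_GNU_STACK, or None if the entry is absent."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64 = elf[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if elf[5] == 1 else ">"     # EI_DATA: 1 = little-endian

    if is_64:
        phoff = struct.unpack_from(endian + "Q", elf, 0x20)[0]
        phentsize, phnum = struct.unpack_from(endian + "HH", elf, 0x36)
    else:
        phoff = struct.unpack_from(endian + "I", elf, 0x1C)[0]
        phentsize, phnum = struct.unpack_from(endian + "HH", elf, 0x2A)

    for i in range(phnum):
        base = phoff + i * phentsize
        p_type = struct.unpack_from(endian + "I", elf, base)[0]
        if p_type == PT_GNU_STACK:
            # p_flags sits at offset 4 in a 64-bit entry and offset 24 in a 32-bit one.
            flags_off = base + (4 if is_64 else 24)
            p_flags = struct.unpack_from(endian + "I", elf, flags_off)[0]
            return bool(p_flags & PF_X)
    # No PT_GNU_STACK entry: the outcome then depends on the architecture and kernel.
    return None


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, stack_is_executable(f.read()))
```

Passing the paths of a32.out and a64.out as arguments, before and after recompiling, should show the flag change. The same information appears in the GNU_STACK line of readelf -l output, where RWE indicates an executable stack and RW a non-executable one.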
If you recall from Task 1, the call_shellcode program puts a copy of shellcode on the stack, and then executes the code from the stack.
Please recompile call_shellcode.c into a32.out and a64.out without the -z execstack option.
Please run them both and describe your observations. Also provide supporting evidence.
Defeating the non-executable stack countermeasure. While we will not study this idea in this lab, it should be noted that a non-executable stack only makes it impossible to run shellcode on the stack; it does not prevent buffer-overflow attacks. There are in fact other ways to run malicious code after exploiting a buffer-overflow vulnerability. (Think about it! How could this work?!) The return-to-libc attack is one such example. If you are interested, there is another SEED Lab (Return-to-libc Attack Lab) covering this topic; I encourage you to check it out!
Submit your assignment as a single PDF to the appropriate D2L dropbox.