session:extra:fuzzing

Differences

This shows you the differences between two versions of the page.

session:extra:fuzzing [2017/07/20 12:11]
Lucian Mogosanu [Intro]
session:extra:fuzzing [2020/07/19 12:49] (current)
Line 1: Line 1:
 ====== 0x0C. Fuzzing ======
  
-== Slides
+===== Slides =====
  
-Slides are available {{:session:session_12.pdf|here}}.
+Slides are available {{:session:s12-slides-fuzzing.pdf|here}}.
  
-== Tutorials
+===== Tutorials =====
-=== Introduction
+==== Introduction ====
  
 Fuzzing is a technique for testing certain kinds of software by feeding the target thousands of randomly generated inputs. From now on, **the target** is the software program that we test using the fuzzer. Fuzzing is used by companies to test their internally developed software, and by security companies to analyze interesting pieces of software (delivered as binaries).
Line 12: Line 12:
 Fuzzers are divided into two main categories based on the form of the target: **source code** or **binary file**. Fuzzers that work with source code can use compiler features to instrument the binary with coverage handlers and sanitizers, so the fuzzer can use information from the target itself. Fuzzers in the second class, which work with binary files, must run each instruction in an environment that allows the fuzzer to collect execution feedback. In the following tutorial, we will consider only the latter class.
  
-=== Basic Blocks
+==== Basic Blocks ====
 To understand what fuzzers try to achieve, we must understand the flow graph of a program. The **flow graph** is the layout of the target program, representing all the paths that can be traversed during execution. It is a graph whose nodes are basic blocks. A directed edge between two basic blocks means that a jump instruction at the end of the first block directs execution to the second block. The first basic block to be executed is also called the **entry block**.
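As a rough illustration of the paragraph above, a flow graph can be modelled as a mapping from each basic block to its successors. The block names below are hypothetical and do not come from any of the task binaries; the sketch only shows how the blocks reachable from the entry block can be enumerated.

```python
from collections import deque

# Hypothetical flow graph: each key is a basic block, each value lists
# the blocks that the jump at its end can lead to.
FLOW_GRAPH = {
    "entry": ["check"],
    "check": ["ok", "fail"],
    "ok": ["exit"],
    "fail": ["exit"],
    "exit": [],
    "dead": ["exit"],  # no edge leads here, so it is never executed
}


def reachable_blocks(graph, entry="entry"):
    """Return the set of blocks reachable from the entry block (BFS)."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        for successor in graph[block]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


if __name__ == "__main__":
    print(sorted(reachable_blocks(FLOW_GRAPH)))
```

A fuzzer can never exercise a block such as ''dead'' here, because no path from the entry block reaches it.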
  
Line 68: Line 68:
   * syscall/sysenter instruction
  
-=== How do fuzzers work
+==== How do fuzzers work ====
  
 Briefly, a fuzzer is a program that generates input randomly and feeds it to the target program. The target program is an x86/x86_64 binary. We will now look at how the input is generated, how the target program receives it and, most importantly, what execution feedback is and how the fuzzer makes use of it.
Line 131: Line 131:
 An input is **interesting** if running the target program on it **exercises** one or more basic blocks that no earlier input exercised. **Coverage-based fuzzers** focus on increasing the number of basic blocks discovered rather than on finding a specific issue in an isolated area of the code.
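The definition of an interesting input can be sketched as a small feedback loop. This is a minimal illustration, not how AFL or Driller is implemented: ''toy_target'', ''mutate'' and ''fuzz'' are hypothetical names, and the target is an ordinary Python function that reports which of its (made-up) basic blocks it executed.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical target: returns the names of the basic blocks it ran."""
    blocks = {"entry"}
    if len(data) > 0 and data[0] == ord("S"):
        blocks.add("first_ok")
        if len(data) > 1 and data[1] == ord("S"):
            blocks.add("second_ok")
    return blocks


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, rounds: int, rng: random.Random):
    """Keep every input that exercises a block no earlier input reached."""
    corpus = [seed]
    covered = set(toy_target(seed))
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        new_blocks = toy_target(candidate) - covered
        if new_blocks:  # the input is "interesting"
            covered |= new_blocks
            corpus.append(candidate)
    return corpus, covered


if __name__ == "__main__":
    corpus, covered = fuzz(b"AAAA", 20000, random.Random())
    print(len(corpus), sorted(covered))
```

Inputs that reach no new block are discarded, so the corpus grows only when coverage grows.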
  
-=== Fuzzing with a view
+==== Fuzzing with a view ====
 {{:session:s12-sss_-_fuzzing_-_for_feedback.png}}
  
Line 149: Line 149:
 </code> </code>
  
-=== Driller - augmenting fuzzing with symbolic execution
+==== Driller - augmenting fuzzing with symbolic execution ====
  
 Driller is a software verification system for detecting vulnerabilities in small binaries. Driller is built on top of [[http://lcamtuf.coredump.cx/afl/|afl]] and [[http://angr.io/|angr]]. AFL is a popular fuzzing engine that has discovered previously unknown [[http://lcamtuf.coredump.cx/afl/#bugs|critical issues]] in software like qemu, libtiff, sqlite and many more. When AFL gets stuck at certain compare instructions, it may take between two minutes and two days to generate an input that satisfies the condition. Or worse, it can take forever …
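To get a feeling for why a compare instruction can stall a fuzzer, consider an idealized model in which the compared bytes are guessed uniformly at random. This is a back-of-the-envelope sketch, not a description of AFL's actual mutation strategies, and the execution rate used below is an assumed figure.

```python
def expected_attempts(n_bytes: int) -> int:
    """Expected number of uniform random guesses to match n_bytes exactly."""
    return 256 ** n_bytes


def expected_seconds(n_bytes: int, execs_per_second: int) -> float:
    """Expected time to hit the match at an assumed execution rate."""
    return expected_attempts(n_bytes) / execs_per_second


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, expected_attempts(n), expected_seconds(n, 1000))
```

The cost grows exponentially with the number of compared bytes, whereas a symbolic execution engine treats the comparison as a constraint and solves for the bytes directly.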
Line 178: Line 178:
 angr is the concolic execution engine integrated in driller. Now that we have an idea about how it works, let’s see some blood.
  
-=== Installing Driller
+==== Installing Driller ====
 + 
 +<note important> 
 +You can skip this part of the tutorial if you're using the [[https://security.cs.pub.ro/summer-school/wiki/session/infrastructure/vm#ubuntu-1404-64bit|Ubuntu 64bit]] virtual machine. 
 +</note>
  
 Driller is a pretty complex framework comprising a bunch of components: a fuzzer (AFL), a symbolic execution engine (angr) and some scaffolding to use the two together. If you're running the lab VMs, they will most likely come with Driller preinstalled. Otherwise, follow the steps here to (hopefully) get it up and running.
Line 243: Line 247:
  
 Now we should be all done! If the first task works for you, then you should be all set.
-== Tasks
+===== Tasks =====
-=== Intro
+The archive can be downloaded from {{:session:s12-tasks.tar.gz | here}}.
 + 
 +<note important> 
 +To run Driller scripts in the Ubuntu 64bit VM, make sure that you're in the ''driller'' Python virtualenv: 
 + 
 +<code> 
 +$ workon driller 
 +(driller) $ 
 +</code> 
 +</note> 
 +==== Task 1: Intro ====
 The first task can be found in ''task-01/fst''. Let's analyze the binary.
 <code bash>
Line 263: Line 277:
 </code>
  
-The first parameter passed to driller is the **binary path**. The second parameter is a reference input (it should be long enough to crash the program). The third parameter is the bitmap with no discovered transitions.
+The first parameter passed to driller is the **binary path**. The second parameter is a **reference input** (it should be long enough to crash the program). The third parameter is the fuzzer bitmap with no discovered transitions.
 Run the above code. The result contains a set of inputs discovered by driller. We can see that all inputs start with "SSS". If we disassemble ''fst'' again, we can find the following instructions:
 <code>
Line 272: Line 286:
  8048516:       e8 95 fe ff ff          call   80483b0 <strncmp@plt>
 </code>
-At address ''0x8048620'' we will find "\x53\x53\x53\x00" which translated to ascii, is "SSS". So the first three bytes must be "SSS". Now feed the driller input to ''fst'' and follow the instructions to complete this task.
+At address ''0x8048620'' we will find "\x53\x53\x53\x00" which translated to ascii, is "SSS". So the first three bytes must be "SSS".
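The translation from the raw bytes to ASCII can be checked quickly in Python. This is only a convenience sketch; the byte string is copied from the text above, and the variable names are arbitrary.

```python
# Bytes found at the address passed to strncmp (copied from the text above).
raw = b"\x53\x53\x53\x00"

# A C string ends at the first NUL byte; decode what precedes it.
prefix = raw.split(b"\x00", 1)[0].decode("ascii")

print(prefix)
```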
  
-=== Objdump or Driller? Which one is better?
+<note important>
 +Now feed the driller input to ''fst'' and follow the instructions to complete this task. 
 +<code bash> 
 +cat input | ./fst 
 +</code> 
 +</note> 
 + 
 +==== Task 2: Objdump or Driller? Which one is better? ====
 The executable for the second task can be found in ''task-02/snd''. Analyze the binary using ''file'', ''objdump'' and ''IDA''. Do not spend more than 5 minutes. If you cannot find the proper input, give Driller a shot.
  
Line 284: Line 305:
  
  
-=== Codegate - postbox
+==== Task 3: Codegate - postbox ====
  
 The executable can be found in ''task-03/postbox''. This task was part of the Codegate CTF contest.
Line 294: Line 315:
 </code> </code>
  
-Analyze the binary using ''Ida 64'', subroutine ''sub_401F20''. Because we do not have enough time to dig into each menu and submenu of postbox, we will build a small fuzzer that will feed various inputs to the binary.
+Analyze the binary using ''IDA64'', subroutine ''sub_401F20''. Because we do not have enough time to dig into each menu and submenu of postbox, we will build a small fuzzer that will feed various inputs to the binary.
  
 Open the script file ''task-03/solve.py''. The menu implies that there are many buffers that could be overflowed. Our job is to find the buffer that gives us control of the ''rip'' register. The script fills the buffers with a payload of 1024 bytes. Your job is to send the payload or the proper option at each step.
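One common way to build such a payload is to use a non-repeating pattern instead of a single repeated character, so that any 8 bytes recovered after a crash identify their position in the buffer. The sketch below is an assumption about how this could be done and is not taken from ''solve.py''; ''make_pattern'' and ''find_offset'' are hypothetical helper names.

```python
import string


def make_pattern(length: int) -> bytes:
    """Build a non-repeating pattern such as Aa0Aa1Aa2... of the given length."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode("ascii")
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested length exceeds the pattern space")


def find_offset(pattern: bytes, window: bytes) -> int:
    """Return the offset of window inside pattern, or -1 if it is absent."""
    return pattern.find(window)


# 1024 bytes, matching the payload size mentioned in the text.
payload = make_pattern(1024)

if __name__ == "__main__":
    print(payload[:12])
    print(find_offset(payload, payload[200:208]))
```

If a crash shows which 8 bytes of the payload overwrote the saved return address, ''find_offset'' gives their position inside the buffer.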
Line 303: Line 324:
 $ dmesg | tail
 </code> </code>
 +
 +
session/extra/fuzzing.1500541907.txt.gz · Last modified: 2017/07/20 12:11 by Lucian Mogosanu