The slides can be found here.
“To fuzz or not to fuzz?”
Fuzzing is a software testing technique, often automated, that involves providing invalid, unexpected or random data as input to a program. From a security perspective, it is a key technique for discovering new vulnerabilities, both in applications whose source code you cannot access and in applications whose source code you have. The phases of the fuzzing process are presented in the following figure.
The first step in the fuzzing process is to identify the target you would like to fuzz. While this is generally an application running on a system, it may expose several data input vectors, such as:
Based on these input vectors, you can have different types of fuzzers like:
For a non-exhaustive list of fuzzing tools, please visit Fuzzing.org
The next logical step is to understand the structure of the input data in order to decide which fields should be fuzzed and with what values. Following this, we generate the fuzzed data. Depending on how you derive the fuzzing data, you can have:
When generating fuzzing data, we try to pick interesting values for the fields being fuzzed. For example:
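Typical candidates are boundary integers, overly long strings and format-string tokens. The Python sketch below shows a small illustrative selection; the function names and the exact values are assumptions made for this example, not the set any particular fuzzer ships with.

```python
def interesting_integers(width_bits=32):
    """Boundary values that tend to expose off-by-one and signedness bugs."""
    max_unsigned = (1 << width_bits) - 1
    max_signed = (1 << (width_bits - 1)) - 1
    return [0, 1, max_signed, max_signed + 1, max_unsigned - 1, max_unsigned]


def interesting_strings():
    """Long strings, format specifiers and a path traversal sequence."""
    return [
        "",                         # empty field
        "A" * 256,                  # just past many small buffers
        "A" * 5000,                 # far past most fixed-size buffers
        "%s%n%x" * 10,              # format-string tokens
        "../" * 20 + "etc/passwd",  # path traversal
    ]
```

For a 16-bit field, `interesting_integers(16)` yields the values around the signed and unsigned limits (32767, 32768, 65534, 65535) in addition to 0 and 1.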
Following this, the generated fuzz data is sent to the target application, which we monitor for exceptions after each test case. After each fuzz test, the application is killed and restored to its initial state. Once a crash or hang has been found, we determine whether the underlying vulnerability is exploitable.
The preceding section hinted at restoring the application to its initial state after each fuzz test. In stateless fuzzing, where we fuzz from a fixed state, this can be as simple as restarting the application. In stateful fuzzing, we must first perform the operations needed to bring the application into a specific state before running the fuzz test. This session covers only stateless fuzzing; the next session builds on it and goes into stateful fuzzing.
A fuzzing framework is mostly concerned with generating the fuzzing data, but it can also serve the following additional functions:
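The generate, execute and monitor cycle can be sketched in a few lines of Python. This is a toy model under stated assumptions: `toy_target` is a hypothetical stand-in for the real application, a raised exception plays the role of a crash, and the single-byte mutation is just one simple way of producing test cases.

```python
import random


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte of the seed with a random value."""
    data = bytearray(seed)
    index = rng.randrange(len(data))
    data[index] = rng.randrange(256)
    return bytes(data)


def toy_target(data: bytes) -> None:
    """Hypothetical stand-in for the application: fails on any 0xFF byte."""
    if 0xFF in data:
        raise ValueError("parser crashed")


def fuzz(target, seed: bytes, iterations: int, rng_seed: int = 0):
    """Generate a case, run it, and record every input that raised."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        case = mutate(seed, rng)
        try:
            target(case)  # every call starts from the same clean state
        except Exception as exc:
            crashes.append((case, repr(exc)))
    return crashes
```

Calling `fuzz(toy_target, b"HELP\r\n", 5000)` returns the list of inputs that triggered the failure, which is the point where manual exploitability analysis would begin.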
Most fuzzing frameworks are designed only for the data generation stage, leaving the test execution to the user. For this particular lab we will be using the Sulley Fuzzing Framework which, as we'll see, covers all of the above functionality.
Sulley is a fuzzer development and fuzz testing framework consisting of multiple extensible components. It is written in Python, so it can run on most platforms.
Overall usage of Sulley breaks down to the following:
The first step in installing Sulley is to clone its GitHub repository:
git clone https://github.com/OpenRCE/sulley.git
Sulley depends upon the following libraries:
apt-get install python-pcapy python-impacket
If everything is set up appropriately you should be able to run the following commands:
sulley$ sudo python network_monitor.py
ERR> USAGE: network_monitor.py
    <-d|--device DEVICE #>    device to sniff on (see list below)
    [-f|--filter PCAP FILTER] BPF filter string
    [-P|--log_path PATH]      log directory to store pcaps to
    [-l|--log_level LEVEL]    log level (default 1), increase for more verbosity
    [--port PORT]             TCP port to bind this agent to

Network Device List:
    [0] eth0
    [1] virbr0
    [2] usbmon1
    [3] usbmon2
    [4] any
    [5] lo
sudo python process_monitor_unix.py
ERR> USAGE: process_monitor_unix.py
    -c|--crash_bin             File to record crash info too
    [-P|--port PORT]           TCP port to bind this agent too
    [-l|--log_level LEVEL]     log level (default 1), increase for more verbosity
Without going into all the details of the user manual, there are some basics we need to know. When fuzzing with Sulley, we write a Python script that defines all the objects Sulley needs in order to fuzz a specific target. Those objects are:
Let's start with a concrete fuzzing example. Suppose we have a network service listening on port 9999 that supports the following commands:
telnet 127.0.0.1 9999
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Welcome to Vulnerable Server! Enter HELP for help.
HELP
Valid Commands:
HELP
STATS [stat_value]
RTIME [rtime_value]
LTIME [ltime_value]
SRUN [srun_value]
TRUN [trun_value]
GMON [gmon_value]
GDOG [gdog_value]
KSTET [kstet_value]
GTER [gter_value]
HTER [hter_value]
LTER [lter_value]
KSTAN [lstan_value]
EXIT
The fuzzer for it would look as follows (sulley/vulnserver.py):
#!/usr/bin/python
from sulley import *
import sys
import time

""" Receive banner when connecting to server. """
def banner(sock):
    sock.recv(1024)

""" Define data model. """
s_initialize("VulnserverDATA")
s_group("commands", values=['HELP', 'STATS', 'RTIME', 'LTIME', 'SRUN', 'TRUN', 'GMON', 'GDOG', 'KSTET', 'GTER', 'HTER', 'LTER', 'KSTAN', 'EXIT'])
s_block_start("CommandBlock", group="commands")
s_delim(' ')
s_string('fuzz')
s_static('\r\n')
s_block_end()

""" Keep session information if we want to resume at a later point. """
s = sessions.session(session_filename="audits/vulnserver.session")

""" Define state model. """
s.connect(s_get("VulnserverDATA"))

""" Define the target to fuzz. """
target = sessions.target("127.0.0.1", 9999)
target.netmon = pedrpc.client("127.0.0.1", 26001)
target.procmon = pedrpc.client("127.0.0.1", 26002)
target.procmon_options = {
    "proc_name" : "vulnserver",
    "stop_commands" : ['pkill -9 vulnserver'],
    "start_commands" : ['/home/dragos/vulnserver'],
}

""" grab the banner from the server """
s.pre_send = banner

""" start fuzzing - define target and data """
s.add_target(target)
s.fuzz()
First we define a function that receives the banner on each connection to the vulnerable server. This is needed because the server sends a welcome message as soon as the connection is established, and we must read it before sending any data. After the banner function, we define a data model, which is how Sulley describes the input data to be fuzzed. s_initialize starts a new request and gives it a name. s_group then declares a group whose values parameter lists all the command names, and the command block follows. s_block_start and s_block_end mark the start and end of the command body. The command body consists of each command in turn (as listed in the values parameter of s_group), followed by the space delimiter, the string to fuzz and the terminating CRLF.
With s_group, we don't have to define each command separately; we list only the command names, followed by a space delimiter and the argument to be fuzzed. The data model on its own doesn't do anything yet: it only tells Sulley how to generate the data sent to Vulnserver when fuzzing each command. After the data model, we define a new session, which lets us resume fuzzing at a later time.
What follows is the state model, which describes the states Sulley must go through when fuzzing. Our simple example needs only one state, which uses the previously defined data model and fuzzes its commands. Fuzzing an FTP server, for example, would be a different story, because we would first need to go through the USER state and then the PASS state in order to authenticate. Only after authentication can we enter a state where the fuzzing can really begin.
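To make the effect of the group and block concrete, the sketch below imitates in plain Python what gets rendered: every command is combined with every mutation of the argument. It does not use Sulley; `render_cases` and the sample mutation list are hypothetical names invented for this example, and unlike Sulley the sketch leaves the delimiter untouched.

```python
COMMANDS = ["HELP", "STATS", "TRUN"]           # subset of the server's commands
FUZZ_STRINGS = ["fuzz", "A" * 64, "%n%n%n%n"]  # illustrative mutations only


def render_cases(commands, fuzz_strings, delim=" ", terminator="\r\n"):
    """Yield one request per (command, mutation) pair, like a grouped block."""
    for command in commands:
        for value in fuzz_strings:
            yield (command + delim + value + terminator).encode("ascii")
```

Iterating over `render_cases(COMMANDS, FUZZ_STRINGS)` produces nine requests, starting with `b"HELP fuzz\r\n"`; with the full command list and a real mutation library the number of test cases grows multiplicatively.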
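The idea of a state model can be illustrated with a toy graph in plain Python. This is not Sulley's session class: `StateGraph` is a hypothetical helper written for this example, and the CWD and RETR nodes are added only to give the post-login state something to fuzz. For every request it reports which earlier requests must be replayed before that request can be fuzzed.

```python
class StateGraph:
    """Toy session graph: nodes are request names, edges are transitions."""

    ROOT = "__root__"

    def __init__(self):
        self.edges = {}

    def connect(self, src, dst=None):
        """connect(a) hangs a off the root; connect(a, b) adds edge a -> b."""
        if dst is None:
            src, dst = self.ROOT, src
        self.edges.setdefault(src, []).append(dst)

    def walk(self, node=ROOT, prefix=()):
        """Yield (requests to replay first, request to fuzz) for every node."""
        for child in self.edges.get(node, []):
            yield prefix, child
            yield from self.walk(child, prefix + (child,))


ftp = StateGraph()
ftp.connect("user")
ftp.connect("user", "pass")
ftp.connect("pass", "cwd")
ftp.connect("pass", "retr")
```

Walking the `ftp` graph shows that fuzzing `cwd` or `retr` requires replaying `user` and `pass` first, while the single-node graph of the Vulnserver example needs no replay at all.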
What follows is the target declaration. Let's take a look at that piece of code again:
""" Define the target to fuzz. """
target = sessions.target("127.0.0.1", 9999)
target.netmon = pedrpc.client("127.0.0.1", 26001)
target.procmon = pedrpc.client("127.0.0.1", 26002)
target.procmon_options = {
    "proc_name" : "vulnserver",
    "stop_commands" : ['pkill -9 vulnserver'],
    "start_commands" : ['/home/dragos/vulnserver'],
}
Those lines declare the target being fuzzed. The target machine should have the network and process monitors running; here they listen on ports 26001 and 26002 respectively. The two pedrpc.client calls declare the IP and port on which each monitor is reachable. The arguments to sessions.target specify where to send the fuzzed data, and procmon_options specifies the commands used to start and stop the vulnerable server.
At the end of vulnserver.py we register the banner callback, add the target to the session and start fuzzing.
""" start fuzzing - define target and data """
s.add_target(target)
s.fuzz()
mkdir -p audits/vulnserver
python network_monitor.py -d 5 -f "src or dst port 9999" -P audits/vulnserver
python process_monitor_unix.py -c audits/vulnserver.crashbin
python vulnserver.py
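Stripped of monitoring and bookkeeping, each test case the fuzzer sends amounts to roughly the exchange below. This is a plain-socket sketch rather than Sulley code; `send_case` is a hypothetical helper, and the commented usage assumes the example service is listening locally on port 9999.

```python
import socket


def send_case(sock: socket.socket, payload: bytes, banner_size: int = 1024) -> bytes:
    """Read the welcome banner, then send one fuzzed request."""
    banner = sock.recv(banner_size)  # same job as the pre_send callback
    sock.sendall(payload)
    return banner


# Example use against the service from this section (not executed here):
#
#   with socket.create_connection(("127.0.0.1", 9999), timeout=5) as conn:
#       send_case(conn, b"TRUN " + b"A" * 5000 + b"\r\n")
```

If the banner were not consumed first, the reply to the fuzzed request could not be told apart from the welcome message, which is why the callback runs before every send.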
Now that we've had our first taste of how a fuzzer can be defined using Sulley, let's take a closer look at the options available for defining the data model and what they can do.
s_initialize("new request")
Static primitives are never mutated.
# these are all equivalent:
s_static("pedram\x00was\x01here\x02")
s_raw("pedram\x00was\x01here\x02")
s_dunno("pedram\x00was\x01here\x02")
s_unknown("pedram\x00was\x01here\x02")
# yeah, it can handle all these formats.
s_binary("0xde 0xad be ef \xca fe 00 01 02 0xba0xdd f0 0d", name="complex")
s_random(min_length, max_length, num_mutations, fuzzable, name)
# fuzzes the string: <BODY bgcolor="black">
s_delim("<")
s_string("BODY")
s_delim(" ")
s_string("bgcolor")
s_delim("=")
s_delim("\"")
s_string("black")
s_delim("\"")
s_delim(">")
# import all of Sulley's functionality.
from sulley import *

# this request is for fuzzing: {GET,HEAD,POST,TRACE} /index.html HTTP/1.1

# define a new block named "HTTP BASIC".
s_initialize("HTTP BASIC")

# define a group primitive listing the various HTTP verbs we wish to fuzz.
s_group("verbs", values=["GET", "HEAD", "POST", "TRACE"])

# define a new block named "body" and associate with the above group.
if s_block_start("body", group="verbs"):
    # break the remainder of the HTTP request into individual primitives.
    s_delim(" ")
    s_delim("/")
    s_string("index.html")
    s_delim(" ")
    s_string("HTTP")
    s_delim("/")
    s_string("1")
    s_delim(".")
    s_string("1")
    # end the request with the mandatory static sequence.
    s_static("\r\n\r\n")
# close the open block, the name argument is optional here.
s_block_end("body")
s_short("opcode", full_range=True)

# opcode 10 expects an authentication sequence.
if s_block_start("auth", dep="opcode", dep_value=10):
    s_string("USER")
    s_delim(" ")
    s_string("pedram")
    s_static("\r\n")
    s_string("PASS")
    s_delim(" ")
    s_delim("fuzzywuzzy")
s_block_end()

# opcodes 15 and 16 expect a single string hostname.
if s_block_start("hostname", dep="opcode", dep_values=[15, 16]):
    s_string("pedram.openrce.org")
s_block_end()

# the rest of the opcodes take a string prefixed with two underscores.
if s_block_start("something", dep="opcode", dep_values=[10, 15, 16], dep_compare="!="):
    s_static("__")
    s_string("some string")
s_block_end()
# table entry: [type][len][string][checksum]
if s_block_start("table entry"):
    # we don't know what the valid types are, so we'll fill this in with random data.
    s_random("\x00\x00", 2, 2)

    # next, we insert a sizer of length 2 for the string field to follow.
    s_size("string field", length=2)

    # block helpers only apply to blocks, so encapsulate the string primitive in one.
    if s_block_start("string field"):
        # the default string will simply be a short sequence of C's.
        s_string("C" * 10)
    s_block_end()

    # append the CRC-32 checksum of the string to the table entry.
    s_checksum("string field")
s_block_end()

# repeat the table entry from 100 to 1,000 reps stepping 50 elements on each iteration.
s_repeat("table entry", min_reps=100, max_reps=1000, step=50)
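The effect of the dependency arguments can be pictured as a plain conditional: a block is rendered only when the field it depends on has a matching value. The sketch below is an interpretation in ordinary Python, not Sulley code; `render_request` is a hypothetical name, it renders only the default (unfuzzed) values, and the little-endian 16-bit encoding of the opcode is an assumption.

```python
import struct


def render_request(opcode: int) -> bytes:
    """Emit the opcode, then only the block whose dependency matches it."""
    out = struct.pack("<H", opcode)  # little-endian short: an assumption
    if opcode == 10:                 # dep_value=10
        out += b"USER pedram\r\nPASS fuzzywuzzy"
    elif opcode in (15, 16):         # dep_values=[15, 16]
        out += b"pedram.openrce.org"
    else:                            # dep_compare="!=" on [10, 15, 16]
        out += b"__some string"
    return out
```

So `render_request(10)` carries the authentication sequence, opcodes 15 and 16 carry the hostname, and every other opcode falls through to the underscore-prefixed string.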
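The bytes that the sizer, checksum and repeat helpers produce for an unfuzzed table entry can be reproduced with the standard library. This is a sketch for intuition only: `table_entry` and `repeated_table` are hypothetical helpers, and the little-endian byte order, the 4-byte CRC-32 field and the inclusive upper bound of the repetition range are assumptions.

```python
import struct
import zlib


def table_entry(entry_type: bytes, string: bytes) -> bytes:
    """Build [type][len][string][checksum] with a 2-byte length and CRC-32."""
    if len(entry_type) != 2:
        raise ValueError("type field is two bytes wide")
    length = struct.pack("<H", len(string))  # sizer with length=2
    checksum = struct.pack("<L", zlib.crc32(string) & 0xFFFFFFFF)
    return entry_type + length + string + checksum


def repeated_table(entry: bytes, min_reps=100, max_reps=1000, step=50):
    """Yield the entry repeated min_reps, min_reps + step, ... times."""
    for reps in range(min_reps, max_reps + 1, step):
        yield entry * reps
```

The point of the helpers is that length and checksum are recomputed whenever the string is mutated, so the parser under test accepts the entry and actually reaches the code that handles the oversized string.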
A file format is quite similar to a protocol, in that it can be treated as a standardized means of communication. As such, applications should be able to deal with anomalies in the files they parse by implementing:
When these controls aren't in place, the following things might happen:
As always, we can have the same two approaches when generating files for fuzzing, meaning mutation-based fuzzers (notSPIKEfile) and generation-based fuzzers (SPIKEfile).
SPIKEfile is a Linux generation-based file fuzzer based on SPIKE. Fuzzers are defined for it in SPIKE's block-based script notation, which closely resembles the one we've seen with Sulley.
wget http://www.fuzzing.org/wp-content/SPIKEfile.tgz
tar xzvf SPIKEfile.tgz
cd SPIKEfile/
./make.sh
FLAGS = $(INCLUDE) -m32 -O3 -ggdb
Modify the SPIKEfile Makefile:
CFLAGS=-L$(LIBDISASM) -L. -m32
$(LD) -shared -melf_i386 -soname libdlrpc.so -o libdlrpc.so $(OBJ)
export LD_LIBRARY_PATH=path/to/libdisasm:path/to/libdlrpc
./SPIKEfile
fileSPIKE
./SPIKEfile [options] <spike_file.spk> <startvar> <startstr> <command>
Options:
    -t Timeout value (default=5)
    -k Do not kill the process after timeout
    -h Print this message
    -f Name for created files (sometimes important, don't overlook this)
Command:
    Quoted command to execute to process the generated file.
    Use %FILENAME% as a symbol to be replaced with the filename.
Example:
    ./SPIKEfile -t 3 -f fuzz.gif gif89a.spk 0 0 "/usr/X11R6/bin/display -debug %FILENAME%"
s_blocksize_string("fileformat", 4);
s_block_start("fileformat");
s_string(" ");
s_string_variable("abc");
s_block_end("fileformat");
notSPIKEfile is a Linux-based file format fuzzing tool. It was designed to automate launching applications and detecting the exceptions caused by fuzzed files. It is a mutation-based fuzzing tool, meaning that, starting from a valid input file, it modifies the file byte by byte (or following some other pattern) in order to produce test cases.
wget http://www.fuzzing.org/wp-content/notSPIKEfile.tgz
tar xzvf notSPIKEfile.tgz
cd notSPIKEfile
./make.sh
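Byte-by-byte mutation can be sketched in Python as follows. This illustrates the general technique rather than notSPIKEfile's actual implementation: `byte_mutations` and the four default fuzz values are assumptions chosen for the example.

```python
def byte_mutations(base: bytes, fuzz_values=(0x00, 0x7F, 0x80, 0xFF),
                   start=0, end=None):
    """Yield (offset, value, mutated_file) for each byte in [start, end)."""
    end = len(base) if end is None else min(end, len(base))
    for offset in range(start, end):
        for value in fuzz_values:
            if base[offset] == value:
                continue  # identical to the base file, nothing to test
            mutated = bytearray(base)
            mutated[offset] = value
            yield offset, value, bytes(mutated)
```

Each yielded file differs from the valid base file in exactly one byte, so a crash can be traced straight back to the offset and value that caused it. Restricting `start` and `end` keeps the number of cases manageable for large files.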
FLAGS = $(INCLUDE) -m32 -O3 -ggdb
and remove quickdis from compiling.
In the notSPIKEfile Makefile, also change the following:
CFLAGS=-m32
and
all $(OBJ): $(CC) $(OBJ) $(CFLAGS) -o $(EXE) $(LIBS)
export LD_LIBRARY_PATH=path/to/libdisasm/shared/library
./notSPIKEfile
notSPIKEfile
notSPIKEfile [options] <base file> <command>
Required Options:
    -o Output file name base for fuzzed files
Additional Options:
    -t Timeout value (default=2)
    -k Do not kill the process after timeout
    -s Send the specified signal to kill the process. Default is SIGTERM, some apps need SIGKILL
    -h Print this message
    -m Maximum concurrent processes (default=1)
    -r Fuzz this range of bytes in stead of trying the whole file (format low-high)
    -f Fuzz this range of fuzz values in stead of using all known fuzz values (format low-high)
    -B Blob mode (replace with blobs)
    -S String mode (replace with fancy strings)
    -d Delay between kill and re-exec (default=1)
Command:
    Quoted command to execute to process the generated file.
    Use %FILENAME% as a symbol to be replaced with the filename.
    If %FILENAME% is absent in the command string, filename will be appended automatically
Example:
    notSPIKEfile -t 3 -d 1 -m 6 -r 30- -s SIGKILL -o FUZZY.gif test.gif "/usr/bin/display -debug %FILENAME%"
Download the following archive containing a Linux binary: server.tgz. Unpack it and run the program to see what it does. Try to discover at least one vulnerability by sending manual input to it through the network socket it listens on.
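The launch-and-watch part of such a tool can be approximated with the standard library. The sketch below is a simplification under stated assumptions: `run_case` is a hypothetical helper, it works on POSIX systems only, and it classifies a run by exit status and timeout, whereas a dedicated file fuzzer may attach to the process to observe exceptions directly.

```python
import shlex
import subprocess


def run_case(command: str, filename: str, timeout: float = 2.0) -> str:
    """Run the target on one file and classify the outcome."""
    if "%FILENAME%" not in command:
        command += " %FILENAME%"  # mirrors the documented fallback
    argv = [arg.replace("%FILENAME%", filename)
            for arg in shlex.split(command)]
    try:
        result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang"  # subprocess.run kills the child on timeout
    if result.returncode < 0:
        # On POSIX a negative return code means death by that signal.
        return "crash (signal %d)" % -result.returncode
    return "ok"
```

A call such as `run_case("/usr/bin/display -debug %FILENAME%", "FUZZY.gif", timeout=3)` would then return `"ok"`, `"hang"` or a crash description for each generated file; files in the last two categories are the ones worth keeping for analysis.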
Once you've discovered something, go ahead and analyze the vulnerability with GDB to see if it's indeed exploitable. You don't need to create an exploit, just get an idea of how you could exploit it.
Clone the following GitHub repository:
git clone https://github.com/madmaze/HTCPCP.git
Compile the code by executing
./make.em
Explore the structure of the Hyper Text Coffee Pot Control Protocol (HTCPCP) in RFC 2324 or in the source code provided. Create a Sulley fuzzer for the HTCPCP server implementation and commence fuzzing.
Clone the following GitHub repository:
git clone https://github.com/ejohnst/exiftags.git
and compile it:
make
The application reads and parses JPEG EXIF tags from different camera vendors. The EXIF tag format is discussed here. An example JPEG file containing EXIF tags can be found here: dsc_0441.jpg. Fuzz it using notSPIKEfile in order to discover vulnerabilities, and afterwards using SPIKEfile. Compare and contrast the two approaches and their findings.