session:04

Differences

This shows you the differences between two versions of the page.

session:04 [2018/07/16 11:39]
Andrei-Robert BARONESCU (66895)
session:04 [2020/07/19 12:49] (current)
Line 1: Line 1:
-0x04. Static Analysis+====== 0x03. Static Analysis ======
  
-== Resources+===== Resources =====
  
-{{https://security.cs.pub.ro/summer-school/res/slides/04-static-analysis.pdf|Slides}}+[[https://security.cs.pub.ro/summer-school/res/slides/04-static-analysis.pdf|Session 3 slides]]
  
-{{:session:tasks_04.tgz|Tasks archive}}+/*[[https://security.cs.pub.ro/summer-school/res/arc/04-static-analysis-skel.zip|Session's tutorials and challenges archive]]*/
  
-== Setup+/*[[https://security.cs.pub.ro/summer-school/res/arc/04-static-analysis-full.zip|Session's solutions]]*/ 
 + 
 +Get the tasks by cloning [[https://github.com/hexcellents/sss-exploit|Public GitHub Repository]]. 
 + 
 +===== Setup =====
  
 For the tutorial and tasks below we will use GDB with [[https://github.com/longld/peda|PEDA]] (//Python Exploit Development Assistance//) and [[https://www.hex-rays.com/products/ida/index.shtml|IDA]].
Line 13: Line 17:
 Configure GDB locally with [[https://github.com/longld/peda|PEDA]] by following the instructions on the [[https://github.com/longld/peda#installation|GitHub page]].
  
-Download [[https://www.hex-rays.com/products/ida/support/download_demo.shtml|IDA evaluation version for Linux]].+Download [[https://www.hex-rays.com/products/ida/support/download_freeware.shtml|IDA Freeware Version]].
  
-== Initial info+===== Initial Info =====
  
 After obtaining a malicious binary, it is unwise to run it without first inspecting what harm it can do. You therefore have to extract as much information as possible about the program without actually executing it. This technique is called //binary static analysis//; it can gather many useful details about what the program does and reveal potential vulnerabilities. This session describes how a binary executable can be statically reverse engineered: disassembling it, then analyzing and annotating the assembly code in order to partially restore its source code.
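As a small illustration of extracting information without executing anything, the sketch below reads a few fields from an ELF header using only Python's ''struct'' module. It is a minimal sketch, not part of the session's tooling: the function name ''parse_elf_header'' and the sample bytes are made up for this example, and only the fixed leading part of the header is decoded.

```python
import struct


def parse_elf_header(data: bytes) -> dict:
    """Read a few identifying fields from an ELF header without running the file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    endian = "<" if ei_data == 1 else ">"
    # e_type, e_machine and e_version follow the 16-byte e_ident array
    e_type, e_machine, _e_version = struct.unpack_from(endian + "HHI", data, 16)
    # e_entry is 4 bytes wide in ELF32 and 8 bytes wide in ELF64
    entry_format = "I" if ei_class == 1 else "Q"
    (e_entry,) = struct.unpack_from(endian + entry_format, data, 24)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "little_endian": ei_data == 1,
        "type": e_type,        # 2 = ET_EXEC
        "machine": e_machine,  # 3 = EM_386
        "entry": e_entry,
    }


# Synthetic 32-bit little-endian header (made-up values, not a real binary):
# magic, class/data/version bytes, padding, then e_type, e_machine, e_version, e_entry
sample = (b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
          + struct.pack("<HHII", 2, 3, 1, 0x8048310))
info = parse_elf_header(sample)
print(info["bits"], hex(info["entry"]))
```

To try it on a real file, pass the first 64 bytes of the file (read in binary mode) to ''parse_elf_header''.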
Line 34: Line 38:
 This is very useful information because it lets the ''objdump'' tool structure its output by function. The compiler takes these names directly from the source code and stores them inside the binary as symbols (the ELF symbol table and, when the program is built with debugging enabled, DWARF debugging information).
  
-=== 01. Disassemble methods+==== 01. Disassemble methods ====
  
 Although machine code maps one-to-one to assembly, binary disassembly is not always an easy task. Several tools try to generate accurate assembly code starting from a binary. Based on the method they use, we can distinguish two approaches: //linear sweep// and //recursive traversal//.
Line 41: Line 45:
 A detailed description of disassembly methods is presented in [[http://resources.infosecinstitute.com/linear-sweep-vs-recursive-disassembling-algorithm/|this article]].
 </note> </note>
-==== Linear Sweep+=== Linear Sweep ===
  
 Let's strip the debugging symbols and see how ''objdump'' behaves:
Line 115: Line 119:
 It is now clear that //linear sweep// does not always meet our needs, so let's look at a different approach.
  
-==== Recursive Traversal+=== Recursive Traversal ===
  
 The recursive disassembling algorithm combines //linear sweep// with control flow analysis. It starts from the entry point of the program and disassembles instructions linearly until it finds an instruction that changes the flow of the program (e.g. a branch or a function call). At this point it stops disassembling the following instructions and follows the address targeted by the branch instruction, starting a new linear sweep from there. When it reaches a return instruction, it goes back and resumes from the point where it previously stopped. Because disassembly continues from the address each ''jmp'' instruction points to, we can extract exactly the instructions that are executed and decode them properly.
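The difference between the two approaches can be sketched on a toy example. Everything below is hypothetical: the five-opcode instruction set ''TOY_ISA'' is invented for illustration (it is not x86), branch operands are absolute addresses, and the traversal uses a worklist of pending targets rather than literally returning to the call site, which visits the same reachable code.

```python
# Hypothetical toy instruction set: opcode -> (mnemonic, length in bytes)
TOY_ISA = {
    0x01: ("nop", 1),
    0x02: ("push", 2),   # push imm8
    0x03: ("jmp", 2),    # jmp absolute address
    0x04: ("call", 2),   # call absolute address
    0x05: ("ret", 1),
}


def decode(code, addr):
    """Decode one toy instruction at addr; return (mnemonic, operand, length)."""
    mnemonic, length = TOY_ISA.get(code[addr], ("(bad)", 1))
    if addr + length > len(code):  # instruction truncated by end of buffer
        return "(bad)", None, 1
    operand = code[addr + 1] if length == 2 else None
    return mnemonic, operand, length


def linear_sweep(code):
    """Decode bytes one after another, ignoring control flow."""
    listing, addr = {}, 0
    while addr < len(code):
        mnemonic, operand, length = decode(code, addr)
        listing[addr] = (mnemonic, operand)
        addr += length
    return listing


def recursive_traversal(code, entry=0):
    """Follow control flow from the entry point, decoding only reachable code."""
    listing, worklist = {}, [entry]
    while worklist:
        addr = worklist.pop()
        while addr < len(code) and addr not in listing:
            mnemonic, operand, length = decode(code, addr)
            listing[addr] = (mnemonic, operand)
            if mnemonic in ("jmp", "call"):
                worklist.append(operand)  # branch target is decoded later
            if mnemonic in ("jmp", "ret", "(bad)"):
                break                     # execution does not fall through
            addr += length
    return dict(sorted(listing.items()))


# jmp 3, then a junk byte (0x02) that looks like the start of a push
code = bytes([0x03, 0x03, 0x02, 0x01, 0x05])
print(linear_sweep(code))         # decodes the junk byte and swallows the nop
print(recursive_traversal(code))  # skips the junk byte and finds the nop at 3
```

The junk byte at address 2 is never executed, yet linear sweep decodes it as a ''push'' and consumes the real instruction at address 3 as its operand. The traversal follows the ''jmp'' and decodes address 3 correctly.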
Line 121: Line 125:
 Now let's see how [[https://www.hex-rays.com/products/ida/|IDA]], a disassembler based on recursive traversal, generates output for our ''wrong'' executable.
  
-* Open ''idaq'' -> New -> Open ''wrong'' executable -> OK (load it as elf 386 executable) +  * Open ''idaq'' -> New -> Open ''wrong'' executable -> OK (load it as elf 386 executable) 
-* Go to address (Jump -> Jump to address / press G) ''0x80483fb''+  * Go to address (Jump -> Jump to address / press G) ''0x80483fb''
  
 You should see something like this:
Line 133: Line 137:
 Although recursive traversal is less susceptible to overlapping instructions, it still has flaws. One of the biggest problems is indirect branching: jumps to addresses that cannot be computed statically and are only known at runtime. Neither recursive traversal nor linear sweep can predict those addresses, so the code they lead to can only be decoded using dynamic analysis.
  
-=== 02. Stop, IDA time!+==== 02. Stop, IDA time! ====
  
 <note important> <note important>
Line 233: Line 237:
   * If you want to write a comment next to an instruction or a function, press '':''
  
-=== 03. C++ executables+==== 03. C++ executables ====
  
 Many binaries now come from C++ source code. It is important to understand the concepts and the paradigm shift that come with C++ binaries.
Line 298: Line 302:
 </note> </note>
  
 +/*
  
-== Tasks +==== warm-up: stripped ====
- +
-For the practical tasks of this session you have to download the {{:session:tasks_04.tgz}} archive. +
- +
-=== warm-up: stripped+
  
 Someone has given us a stripped binary called ''stripped''. Let's run it and take a brief look:
Line 330: Line 331:
 </note> </note>
  
 +*/
  
-=== hyp3rs3rv3r+==== 04. crypto_crackme ====
  
-  Investigate the hyp3rs3rv3r binary and find out where the backdoor function is. Note that since it's not directly called, IDA doesn't think of it as a procedure so it won't come up on the left pane. Figure out a way around this. When you find that code block you can press ''p'' on the first instruction to help IDA see it as a procedure.+The ''crypto_crackme'' binary is an application that asks for a secret and uses it to decrypt a message. In order to solve this task, you have to retrieve the message. 
 + 
 +  * Open the binary using IDA and determine the program control flow. What is it doing after fetching the secret? It seems to be consuming a lot of CPU cycles. If possible, use IDA to patch the program and reduce the execution time of the application. Use ''Edit -> Patch program -> Change byte...'' 
 +  * Next, it looks like the program tries to verify if the secret provided is correct. Where is the secret stored? Is it stored in plain text? Find out what the validation algorithm is. 
 +  * Now break it and retrieve the message!
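The patching step in the first bullet can also be pictured outside IDA. The sketch below overwrites a range of bytes in an in-memory copy of a binary with x86 ''nop'' instructions (''0x90''). It is only an illustration under stated assumptions: ''patch_bytes'' is a hypothetical helper, and the buffer and offset are made up rather than taken from ''crypto_crackme''. In a real file, the offset is a file offset, which generally differs from the virtual address IDA displays.

```python
def patch_bytes(image: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Return a copy of image with new_bytes written at the given file offset."""
    if offset < 0 or offset + len(new_bytes) > len(image):
        raise ValueError("patch falls outside the file")
    patched = bytearray(image)
    patched[offset:offset + len(new_bytes)] = new_bytes
    return bytes(patched)


X86_NOP = b"\x90"  # single-byte x86 no-op instruction

# Made-up stand-in for a code fragment: push ebp; call rel32; pop ebp; ret
original = bytes([0x55, 0xE8, 0x10, 0x00, 0x00, 0x00, 0x5D, 0xC3])

# Replace the 5-byte call at offset 1 with five nops so it is never executed
patched = patch_bytes(original, 1, X86_NOP * 5)
print(patched.hex())
```

Keeping the patch the same length as the bytes it replaces leaves every other instruction at its original address, which is why a call is replaced with nops instead of being deleted.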
  
 <note tip> <note tip>
-You can use IDA to reverse engineer this binary.+Unfortunately, the virtual machine doesn't support the libssl1.0.0 version of the SSL library. 
 + 
 +Use the library files in the task archive and run the executable using: 
 +<code> 
 +LD_LIBRARY_PATH=. ./crypto_crackme 
 +</code>
 </note> </note>
  
 <note tip> <note tip>
-In order to exploit the vulnerability in Ubuntu, you should use netcat-traditional. You can switch from netcat-openbsd to netcat-traditional using the steps described [[https://stackoverflow.com/questions/10065993/how-to-switch-to-netcat-traditional-in-ubuntu|here]].+You can break password hashes (including SHA1) on [[https://crackstation.net/|CrackStation]].
 </note> </note>
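An online service such as CrackStation essentially looks a digest up among precomputed candidates. A minimal local sketch of the same idea, assuming an unsalted SHA-1 digest stored as a hex string, is shown below. The function name ''crack_sha1'', the word list, and the target digest are all hypothetical and unrelated to the actual task binary.

```python
import hashlib


def crack_sha1(target_hex, candidates):
    """Return the first candidate whose SHA-1 digest equals target_hex, else None."""
    target = target_hex.lower()
    for word in candidates:
        if hashlib.sha1(word.encode("utf-8")).hexdigest() == target:
            return word
    return None


# Hypothetical word list and a stand-in digest (not taken from any task binary)
wordlist = ["letmein", "hunter2", "password"]
target = hashlib.sha1(b"hunter2").hexdigest()
print(crack_sha1(target, wordlist))
```

A real attempt would read candidates from a large dictionary file; the approach only works when the secret is a word that appears in the list.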
  
 +==== 05. broken ====
  
-=== crypto_crackme+The ''broken'' binary asks you for the correct password. Investigate the binary and provide it with the correct password. If you provide the correct password, the message ''%%That's correct! The password is '...'%%'' is printed.
  
-The ''crypto_crackme'' binary is an application that asks for a secret and uses it to decrypt a message. In order to solve this task, you have to retrieve the message. 
  
-* Open the binary using IDA and determine the program control flow. What is it doing after fetching the secret? It seems to be consuming a lot of CPU cycles. If possible, use IDA to patch the program and reduce the execution time of the application. Use ''Edit -> Patch program -> Change byte...'' +==== 06. hyp3rs3rv3r ==== 
-* Next, it looks like the program tries to verify if the secret provided is correct. Where is the secret stored? Is it stored in plain text? Find out what the validation algorithm is. + 
-* Now break it and retrieve the message!+  Investigate the hyp3rs3rv3r binary and find out where the backdoor function is. Note that since it's not directly called, IDA doesn't think of it as a procedure so it won't come up on the left pane. Figure out a way around this. When you find that code block you can press ''p'' on the first instruction to help IDA see it as a procedure.
  
 <note tip> <note tip>
-Unfortunately, the virtual machine doesn't support the libssl1.0.0 version of SSL library. You will have to use either the base system or the [[:session:infrastructure:vm#debian-32bit|Debian 32 virtual machine]].+You can use IDA to reverse engineer this binary.
 </note> </note>
  
 <note tip> <note tip>
-You can break password hashes (including SHA1) on [[https://crackstation.net/|CrackStation]].+In order to exploit the vulnerability in Ubuntu, you should use netcat-traditional. You can switch from netcat-openbsd to netcat-traditional using the steps described [[https://stackoverflow.com/questions/10065993/how-to-switch-to-netcat-traditional-in-ubuntu|here]].
 </note> </note>
  
-=== 06. broken 
- 
-The ''broken'' binary in [[http://security.cs.pub.ro/summer-school/res/arc/06-challenge-broken.zip|the archive]] is asking you for the correct password. Investigate the binary and provide it with the correct password. If you provided the correct password the message ''That's correct! The password is '...'''. 
- 
-=== 07. matryoshka 
- 
-The ''matryoshka'' binary in [[http://security.cs.pub.ro/summer-school/res/arc/07-challenge-matryoshka.zip|the archive]] is hiding something. Extract what you need and get the flag. 
session/04.1531730355.txt.gz · Last modified: 2018/07/16 11:39 by Andrei-Robert BARONESCU (66895)