session:04

Differences

This shows you the differences between two versions of the page.

session:04 [2020/06/25 22:24]
Rareş-Mihail VISALOM (67101) [07. hyp3rs3rv3r]
session:04 [2020/07/19 12:49] (current)
Line 1: Line 1:
-= 0x03. Static Analysis
+====== 0x03. Static Analysis ======
  
-== Resources
+===== Resources =====
  
 [[https://security.cs.pub.ro/summer-school/res/slides/04-static-analysis.pdf|Session 3 slides]]
  
-[[https://security.cs.pub.ro/summer-school/res/arc/04-static-analysis-skel.zip|Session's tutorials and challenges archive]]
+/*[[https://security.cs.pub.ro/summer-school/res/arc/04-static-analysis-skel.zip|Session's tutorials and challenges archive]]*/
  
 /*[[https://security.cs.pub.ro/summer-school/res/arc/04-static-analysis-full.zip|Session's solutions]]*/
  
-== Setup
+Get the tasks by cloning [[https://github.com/hexcellents/sss-exploit|Public GitHub Repository]].
+
+===== Setup =====
  
 For the tutorial and tasks below we will use GDB with [[https://github.com/longld/peda|PEDA]] (//Python Exploit Development Assistance//) and [[https://www.hex-rays.com/products/ida/index.shtml|IDA]].
Line 17: Line 19:
 Download [[https://www.hex-rays.com/products/ida/support/download_freeware.shtml|IDA Freeware Version]].
  
-== Initial Info
+===== Initial Info =====
  
 After obtaining a malicious binary, it is not a good idea to run it before inspecting what harm it can do. Therefore, you have to extract as much information as possible about the program without actually executing it. This technique is called //binary static analysis// and can be used to gather a lot of useful details about what the program does and to detect potential vulnerabilities. This session describes how a binary executable can be statically reverse engineered by disassembling it, then analyzing and annotating the assembly code in order to partially restore its source code.
Line 36: Line 38:
 This is very useful information because it helps the ''objdump'' tool structure its output based on functions. These names are taken by the compiler directly from the source code and are stored inside the binary as symbols (the ELF symbol table and, when present, DWARF debugging information).
  
-=== 01. Disassemble methods
+==== 01. Disassemble methods ====
  
 Although machine code and assembly map one-to-one, binary disassembly is not always an easy task. There are several tools that try to generate accurate assembly code starting from a binary. Based on the method they use, we can distinguish two approaches: //linear sweep// and //recursive traversal//.
Line 43: Line 45:
 A detailed description of disassembly methods is presented in [[http://resources.infosecinstitute.com/linear-sweep-vs-recursive-disassembling-algorithm/|this article]].
 </note> </note>
-==== Linear Sweep
+=== Linear Sweep ===
  
 Let's strip the debugging symbols and see how ''objdump'' behaves:
Line 117: Line 119:
 It is now clear that //linear sweep// does not always satisfy our requirements, so let's take a look at a different approach.
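
The way linear sweep loses track of instruction boundaries can be reproduced in a few lines. The sketch below is only an illustration: it uses a made-up toy instruction set (the opcodes in ''OPCODES'' are assumptions, not x86), and the ''CODE'' bytes are a hypothetical program that jumps over one junk byte.

```python
# Toy instruction set (hypothetical, NOT x86): opcode -> (mnemonic, length in bytes)
OPCODES = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),  # jmp <absolute offset>
    0x03: ("ret", 1),
    0x04: ("mov", 2),  # mov <immediate>
}


def decode(code, offset):
    """Decode one instruction at offset and return (text, length)."""
    opcode = code[offset]
    if opcode not in OPCODES:
        return f"db {opcode:#04x}", 1  # unknown byte, emitted as raw data
    mnemonic, length = OPCODES[opcode]
    operands = code[offset + 1:offset + length]
    return " ".join([mnemonic] + [f"{b:#04x}" for b in operands]), length


def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow entirely."""
    listing = []
    offset = 0
    while offset < len(code):
        text, length = decode(code, offset)
        listing.append((offset, text))
        offset += length
    return listing


# jmp 0x03 ; one junk byte (0x04) ; nop ; ret
CODE = bytes([0x02, 0x03, 0x04, 0x01, 0x03])

for offset, text in linear_sweep(CODE):
    print(f"{offset:02x}: {text}")
# 00: jmp 0x03
# 02: mov 0x01
# 04: ret
```

The junk byte at offset 2 happens to be a valid two-byte opcode, so the sweep decodes it as ''mov'' and swallows the real ''nop'' at offset 3, which is the instruction the jump actually reaches.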
  
-==== Recursive Traversal
+=== Recursive Traversal ===
  
 The recursive disassembling algorithm is a technique that combines //linear sweep// with control flow analysis. It starts from the entry point of the program and disassembles instructions linearly until it finds an instruction that changes the flow of the program (e.g. a branch or a function call). At that point it stops disassembling the following instructions and follows the address targeted by the branch instruction, starting a new linear sweep from there. When it reaches a return instruction, it goes back and resumes from the point where it previously stopped. Since for each ''jmp'' instruction the disassembly continues from the address it points to, we can extract the exact instructions that are executed and decode them properly.
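
A minimal sketch of the algorithm described above, under stated assumptions: the instruction set is a made-up toy one (the ''OPCODES'' table is hypothetical, not x86), branch operands are absolute offsets, and the list ''pending'' plays the role of "the point it had previously stopped". Real disassemblers such as IDA are considerably more involved.

```python
# Toy instruction set (hypothetical, NOT x86): opcode -> (mnemonic, length in bytes)
OPCODES = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),   # unconditional jump to an absolute offset
    0x03: ("ret", 1),
    0x04: ("mov", 2),   # mov <immediate>
    0x05: ("call", 2),  # call an absolute offset
    0x06: ("jz", 2),    # conditional jump to an absolute offset
}


def recursive_traversal(code, entry=0):
    """Disassemble only the bytes that are reachable from the entry point."""
    listing = {}
    pending = [entry]  # offsets where disassembly still has to resume
    while pending:
        offset = pending.pop()
        while offset < len(code) and offset not in listing:
            opcode = code[offset]
            if opcode not in OPCODES:
                break  # not an instruction: give up on this path
            mnemonic, length = OPCODES[opcode]
            operands = code[offset + 1:offset + length]
            if len(operands) != length - 1:
                break  # instruction truncated by the end of the code
            listing[offset] = " ".join(
                [mnemonic] + [f"{b:#04x}" for b in operands])
            if mnemonic == "ret":
                break  # go back to a previously saved offset
            if mnemonic == "jmp":
                offset = operands[0]  # nothing falls through a jmp
            elif mnemonic in ("jz", "call"):
                pending.append(offset + length)  # resume here later
                offset = operands[0]  # follow the target first
            else:
                offset += length
    return sorted(listing.items())


# jmp 0x03 ; one junk byte (0x04) ; nop ; ret
CODE = bytes([0x02, 0x03, 0x04, 0x01, 0x03])

for offset, text in recursive_traversal(CODE):
    print(f"{offset:02x}: {text}")
# 00: jmp 0x03
# 03: nop
# 04: ret
```

Because the junk byte at offset 2 is never reached by any control flow path, it is never decoded, and the ''nop'' at offset 3 shows up correctly.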
Line 123: Line 125:
 Now let's see how [[https://www.hex-rays.com/products/ida/|IDA]], a disassembler based on recursive traversal, generates output for our ''wrong'' executable.
  
-* Open ''idaq'' -> New -> Open ''wrong'' executable -> OK (load it as elf 386 executable)
-* Go to address (Jump -> Jump to address / press G) ''0x80483fb''
+  * Open ''idaq'' -> New -> Open ''wrong'' executable -> OK (load it as elf 386 executable)
+  * Go to address (Jump -> Jump to address / press G) ''0x80483fb''
  
 You should see something like this:
Line 135: Line 137:
 Although recursive traversal is less susceptible to overlapping instructions, it still has some flaws. One of the biggest problems is indirect branching: jumps to addresses that cannot be computed statically and can only be determined at runtime. Neither recursive traversal nor linear sweep can predict those addresses, so the code they lead to can only be decoded using dynamic analysis.
  
-=== 02. Stop, IDA time!
+==== 02. Stop, IDA time! ====
  
 <note important> <note important>
Line 235: Line 237:
   * If you want to write comments next to an instruction or a function, press '':''
  
-=== 03. C++ executables
+==== 03. C++ executables ====
  
 Many binaries now come from C++ source code. It is important to understand the concepts and the paradigm shift that come with C++ binaries.
Line 302: Line 304:
 /*
 
-=== warm-up: stripped
+==== warm-up: stripped ====
 
 Someone has given us a stripped binary called ''stripped''. Let's run it and take a brief look at it:
Line 331: Line 333:
 */
  
-=== 04. crypto_crackme
+==== 04. crypto_crackme ====
  
 The ''crypto_crackme'' binary is an application that asks for a secret and uses it to decrypt a message. In order to solve this task, you have to retrieve the message.
  
-* Open the binary using IDA and determine the program control flow. What is it doing after fetching the secret? It seems to be consuming a lot of CPU cycles. If possible, use IDA to patch the program and reduce the execution time of the application. Use ''Edit -> Patch program -> Change byte...''
-* Next, it looks like the program tries to verify if the secret provided is correct. Where is the secret stored? Is it stored in plain text? Find out what the validation algorithm is.
-* Now break it and retrieve the message!
+  * Open the binary using IDA and determine the program control flow. What is it doing after fetching the secret? It seems to be consuming a lot of CPU cycles. If possible, use IDA to patch the program and reduce the execution time of the application. Use ''Edit -> Patch program -> Change byte...''
+  * Next, it looks like the program tries to verify if the secret provided is correct. Where is the secret stored? Is it stored in plain text? Find out what the validation algorithm is.
+  * Now break it and retrieve the message!
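
Byte patching of the kind mentioned in the first step can also be done outside IDA. The sketch below is a hedged illustration, not the solution: the helper ''patch_bytes'' is hypothetical, and the bytes in ''blob'' are synthetic. The real offset and the real instruction bytes have to be found in IDA first. The example overwrites a five-byte x86 ''call rel32'' (opcode ''0xE8'') with five ''nop'' instructions (''0x90''), standing in for whatever slow code you identify.

```python
def patch_bytes(image, offset, expected, replacement):
    """Return a copy of image with replacement written at offset.

    expected is the byte sequence we believe is currently there; checking
    it first guards against patching the wrong place.
    """
    if len(expected) != len(replacement):
        raise ValueError("patch must keep the size of the code unchanged")
    current = image[offset:offset + len(expected)]
    if current != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {current.hex()}")
    return image[:offset] + replacement + image[offset + len(replacement):]


# Synthetic example bytes (NOT taken from crypto_crackme):
# push ebp ; mov ebp, esp ; call <rel32> ; leave ; ret
blob = bytes.fromhex("5589e5" "e810000000" "c9c3")

# Replace the five-byte call with five one-byte nops.
patched = patch_bytes(blob, 3, bytes.fromhex("e810000000"), b"\x90" * 5)
print(patched.hex())  # 5589e59090909090c9c3
```

To patch a file on disk, read it with ''open(path, "rb")'', apply ''patch_bytes'' and write the result to a copy, so that the original binary stays available for comparison.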
  
 <note tip> <note tip>
Line 352: Line 354:
 </note> </note>
  
-=== 05. broken
+==== 05. broken ====
  
 The ''broken'' binary asks you for the correct password. Investigate the binary and provide it with the correct password. If you provide the correct password, the message ''%%That's correct! The password is '...'%%'' is displayed.
  
  
-=== 06. hyp3rs3rv3r
+==== 06. hyp3rs3rv3r ====
  
   * Investigate the ''hyp3rs3rv3r'' binary and find out where the backdoor function is. Note that since it is not called directly, IDA does not treat it as a procedure, so it will not show up in the left pane. Figure out a way around this. When you find that code block, you can press ''p'' on its first instruction to help IDA see it as a procedure.
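
One possible way around this, sketched under assumptions: scan the code bytes for the classic 32-bit function prologue ''push ebp; mov ebp, esp'' (bytes ''55 89 e5'') and compare the hits with the functions IDA already lists. The names ''find_prologues'' and ''unlisted'', the base address and the sample bytes are all hypothetical. The pattern is only a heuristic: code built without a frame pointer has no such prologue, and the same bytes can appear inside data or in the middle of other instructions.

```python
PROLOGUE = bytes([0x55, 0x89, 0xE5])  # push ebp ; mov ebp, esp


def find_prologues(text, base=0):
    """Return the address of every occurrence of PROLOGUE in text."""
    hits = []
    pos = text.find(PROLOGUE)
    while pos != -1:
        hits.append(base + pos)
        pos = text.find(PROLOGUE, pos + 1)
    return hits


def unlisted(candidates, known):
    """Keep the candidate addresses that are not known function starts."""
    known = set(known)
    return [address for address in candidates if address not in known]


# Synthetic code bytes (NOT taken from hyp3rs3rv3r): two tiny functions.
#   push ebp ; mov ebp, esp ; leave ; ret
#   push ebp ; mov ebp, esp ; nop ; leave ; ret
text = PROLOGUE + b"\xc9\xc3" + PROLOGUE + b"\x90\xc9\xc3"
base = 0x1000           # hypothetical load address of the code
known = [0x1000]        # hypothetical list of functions already recognized

candidates = find_prologues(text, base)
print([hex(a) for a in candidates])                   # ['0x1000', '0x1005']
print([hex(a) for a in unlisted(candidates, known)])  # ['0x1005']
```

Each remaining address is only a candidate: jump to it in IDA, check that the bytes decode to sensible code, and then press ''p'' there.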
session/04.1593113076.txt.gz · Last modified: 2020/06/25 22:24 by Rareş-Mihail VISALOM (67101)