====== Refresher. Assembly Language ======

===== Resources =====

[[https://security.cs.pub.ro/summer-school/res/slides/02-assembly-language.pdf|Session slides]]

[[https://security.cs.pub.ro/summer-school/res/arc/02-assembly-language-skel.zip|Session's tutorials and challenges archive]]

[[https://security.cs.pub.ro/summer-school/res/arc/02-assembly-language-full.zip|Session's solutions]]

===== Tutorials =====

This session will serve as a quick **refresher** of basic computer architecture and assembly language. For the sake of brevity, we are going to focus on x86, which is also the architecture most people are already familiar with.

First, we'll go through some **very** general computer architecture topics. In order to get started with the tools and learn how to easily get from assembly to a running binary, we'll continue with dissecting a short hello-world program. The core of this session is a general reference of assembly language. At the end we will also dive a little bit into some operating system internals and check out some tricks that might be useful later.

Let's get our hands dirty!
==== Computer Architecture: A Blistering Approach ====

A microprocessor executes, one by one, **logical**, **arithmetic**, **control**, and **input/output (I/O)** operations specified by the instructions of a computer program that was previously loaded in the system's memory. An instruction is just a sequence of bytes that specifies the operation, or opcode (e.g., addition, multiplication, memory read/write), and the operands (e.g., numbers, memory locations). The list of supported operations is specified by an **Instruction Set Architecture (ISA)**. ISAs can be classified into types such as [[http://en.wikipedia.org/wiki/Complex_instruction_set_computing|CISC]], [[http://en.wikipedia.org/wiki/Reduced_instruction_set_computer|RISC]], [[http://en.wikipedia.org/wiki/Very_long_instruction_word|VLIW]] and others. Particular processors implement this specification in different ways; each such implementation is called a microarchitecture. Because the ISA stays the same, the same program is compatible with processors produced by different vendors. For example, both the Intel 80386 and the AMD K7 Athlon implement the same x86 ISA. Moreover, newer ISAs tend to be backward-compatible with older ones (e.g., 32-bit x86 code is still supported by newer 64-bit processors).

An x86 program operates with and on data stored in memory along with the program itself. Besides the memory, the processor also contains a set of registers that can hold a very limited number of values for fast access. Both the memory and the registers can be referenced in an instruction as operands.

An x86 instruction in machine code might look like this:

<code text>
NASM syntax: add dword [0xdeadbeef], 42
        hex:  8    3     0    5     e    f    b    e    a    d    d    e     2    a
     binary: [1000 0011][0000 0101][1110 1111 1011 1110 1010 1101 1101 1110][0010 1010]
             |          |          |                                        \- immediate: 42
             |          |          \- memory address: 0xdeadbeef (note the endianness)
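             |          \- ModR/M byte (operand description):
             |               2 bits = addressing mode (00: memory operand)
             |               3 bits = register/opcode extension (000 selects add)
             |               3 bits = r/m field (101 with mode 00: 32-bit displacement only)
             \- opcode 0x83: arithmetic group, sign-extended 8-bit immediate applied to a 32-bit register or memory operand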
</code>
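The breakdown above can be reproduced with a few bit operations. The following is a minimal sketch in C, not a real decoder: it only handles this one instruction, and the helper names (''modrm_fields'', ''decode_modrm'', ''read_le32'') are made up for this illustration.

<code c>
#include <stdint.h>
#include <stdio.h>

/* Fields of a ModR/M byte (hypothetical helper type, for illustration only). */
struct modrm_fields {
    unsigned mod; /* 2 bits: addressing mode */
    unsigned reg; /* 3 bits: register or opcode extension */
    unsigned rm;  /* 3 bits: register/memory operand */
};

static struct modrm_fields decode_modrm(uint8_t byte) {
    struct modrm_fields f;
    f.mod = (byte >> 6) & 0x3;
    f.reg = (byte >> 3) & 0x7;
    f.rm  = byte & 0x7;
    return f;
}

/* Reassemble a 32-bit value stored least significant byte first. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

int main(void) {
    /* Machine code of: add dword [0xdeadbeef], 42 */
    const uint8_t insn[] = { 0x83, 0x05, 0xef, 0xbe, 0xad, 0xde, 0x2a };
    struct modrm_fields f = decode_modrm(insn[1]);

    printf("opcode=0x%02x mod=%u reg=%u rm=%u\n",
           (unsigned)insn[0], f.mod, f.reg, f.rm);
    printf("address=0x%08x immediate=%u\n",
           (unsigned)read_le32(&insn[2]), (unsigned)insn[6]);
    return 0;
}
</code>

For the byte ''0x05'' the fields are ''mod=0'', ''reg=0'' and ''rm=5'', and the four address bytes read back as ''0xdeadbeef'' once the little-endian order is undone.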

<note important>
**Useful references:**
  * [[http://ref.x86asm.net/index.html|table]] of all x86 opcodes
  * original [[http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html|Intel manuals]]
  * 1-page [[https://net.cs.uni-bonn.de/fileadmin/user_upload/plohmann/x86_opcode_structure_and_instruction_overview.pdf|x86 opcode & instruction structure]]

</note>

The complete memory hierarchy of a modern computer is depicted in the following picture. The ISA only deals with the registers and the main memory (RAM). Processor-level caching is invisible to programs, while the lower levels (below RAM) are managed by the operating system and accessed via system calls.

{{ :session:memory-hierarchy.png?nolink&700 |}}

That being said, even RAM is not directly accessible from a normal program (i.e., one running in protected mode; see the Basics section). The operating system, with support from the processor, gives every program the same view of a virtual address space, but maps each program's addresses to different physical regions of RAM. Using the same mechanism, memory contents can also be spilled to disk and brought back on demand (see [[http://en.wikipedia.org/wiki/Paging|swapping/paging]]).

<note>
[[http://ocw.cs.pub.ro/courses/so/cursuri/curs-06|This Operating Systems lecture (Romanian)]] covers deep details regarding virtual memory.
</note>

==== Hello (Assembly) World ====

We can get right down to business and see what happens when we compile a very simple program written in C.

<code c>
#include <stdio.h>

int main() {
  puts("Hello world!");
  return 0;
}
</code>

You can compile this with ''%%gcc -m32 -O0 hello.c -o hello%%''. Let's take a sneak peek at the assembly generated by GCC for this basic program: ''%%objdump -M intel -d hello%%''. The output looks rather complicated, and we hope you'll be able to understand every piece of it by the end of this course. For now, let's see which bits are actually needed and write our own minimal version directly in assembly. Later sessions will cover topics such as disassembling, executable sections, linking, reverse engineering and static analysis.

For our minimal version we need an executable that will contain 2 kinds of information:
  * some data (the ''%%"Hello world!"%%'' string)
  * actual executable code

We also need to call ''%%puts()%%'' which is a library function. This function is already assembled in an object file and sits in the libc library. It can be used by any program running on the system in 2 ways:
  * by statically linking it at compile time, which is the job of the linker
  * by dynamically linking it at runtime, which is the job of the operating system's loader

We are going to use the NASM assembler to convert the following mnemonics into an actual object file containing machine code.

<note important>
You can find documentation (syntax, command line options, etc.) on NASM [[http://www.nasm.us/xdoc/2.11.05/html/nasmdoc0.html|here]].
</note>

<code asm>
extern puts
section .data
  helloStr: db 'Hello, world!',0
section .text
  global main
main:
  push helloStr
  call puts
</code>

To assemble this, run ''%%nasm -f elf32 hello.asm%%''. This will produce an object file that we can inspect with ''objdump''.

<code objdump>
$ objdump -M intel -d hello.o

hello.o:     file format elf32-i386


Disassembly of section .text:

00000000 <main>:
   0:	68 00 00 00 00       	push   0x0
   5:	e8 fc ff ff ff       	call   6 <main+0x6>
</code>

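As we can see, the disassembly contains no reference to the ''%%puts()%%'' function: both the pushed address and the call target are placeholders. The symbol is present, however, in the relocation records that the linker will use to patch these placeholders.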

<code objdump>
$ objdump -M intel -r hello.o

hello.o:     file format elf32-i386

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000001 R_386_32          .data
00000006 R_386_PC32        puts
</code>

To dynamically link our object file with ''%%libc%%'' we can use ''%%ld%%''.

<code text>
$ ld -s -lc -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -e main hello.o -o hello_min
</code>

You can spend a few minutes and figure out what all those options do. The most important are: **''%%-lc%%''** and **''%%-e main%%''**.

<note>
The ''%%-dynamic-linker%%'' option should not be necessary but, at least on the system used to write this session, ''ld'' could not find the correct dynamic linker on its own.
</note>

The disassembly of the final binary also contains some code that will find the ''%%puts()%%'' function at runtime. We will learn more about the ''%%.plt%%'' section in the following sessions.

<code objdump>
$ objdump -M intel -d hello_min

hello_min:     file format elf32-i386


Disassembly of section .plt:

08048170 <puts@plt-0x10>:
 8048170:	ff 35 40 92 04 08    	push   DWORD PTR ds:0x8049240
 8048176:	ff 25 44 92 04 08    	jmp    DWORD PTR ds:0x8049244
 804817c:	00 00                	add    BYTE PTR [eax],al
	...

08048180 <puts@plt>:
 8048180:	ff 25 48 92 04 08    	jmp    DWORD PTR ds:0x8049248
 8048186:	68 00 00 00 00       	push   0x0
 804818b:	e9 e0 ff ff ff       	jmp    8048170 <puts@plt-0x10>

Disassembly of section .text:

08048190 <.text>:
 8048190:	68 4c 92 04 08       	push   0x804924c
 8048195:	e8 e6 ff ff ff       	call   8048180 <puts@plt>
</code>

<note>
This binary is missing the initialization and clean-up phases. Because nobody calls ''%%exit()%%'' at the end, it will crash with a segmentation fault after running our code. These phases are normally handled by ''[[http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/baselib---libc-start-main-.html|__libc_start_main]]'' which is part of the [[http://en.wikipedia.org/wiki/Linux_Standard_Base|Linux Standard Base]] Core Specification.
</note>

<note>
This walk-through was inspired by these 3 articles on minimizing an ELF binary:
  * [[http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html|A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux]]
  * [[http://timelessname.com/elfbin/|Smallest x86 ELF Hello World]]
  * [[http://codegolf.stackexchange.com/questions/5696/shortest-elf-for-hello-world-n|Code Golf: Shortest ELF for “Hello world\n”?]]

</note>

==== Basics ====

As new versions of the x86 processors appeared, new features were introduced and, in order to maintain backward compatibility, the processors had to provide different operation **modes**. For example, the original 8086 allowed access to 1MB of memory, with no protection and no support for virtual memory. Newer versions (80286, 80386) therefore introduced **protected mode**, which overcame the limitations of the older **real mode** and has to be explicitly switched to. The 80386 also introduced the **virtual 8086 mode**, and 64-bit processors later added the **long mode**. All x86 processors start in real mode and most operating systems (e.g. Linux) will switch to 80386 protected mode at boot time.

While in protected mode, an x86 processor has access to 8 32-bit general registers (depicted below), 6 segment registers (''cs'', ''ds'', ''ss'', ''es'', ''fs'', ''gs''), 1 status register (''eflags''), an instruction pointer (''eip''), as well as other control, debug, and test registers. The segment registers were initially used for the [[http://en.wikipedia.org/wiki/X86_memory_segmentation|segmentation]] mechanism; modern operating systems rely on paging instead and usually make most of them point to the same address. Other registers were also added by different extensions to the processor (e.g. SSE, MMX). The 32-bit registers have names that start with **"e"** but their 16-bit and 8-bit versions are still accessible via special names, as described in the picture.

{{ :session:x86-registers.png?nolink&500 |}}
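The relationship between a 32-bit register and its smaller views can be modelled with masks and shifts. The sketch below is only a C model of ''eax'', ''ax'', ''ah'' and ''al'' (the function names are invented for this example); it does not touch the real registers.

<code c>
#include <stdint.h>
#include <stdio.h>

/* Views of a 32-bit register value, mimicking ax, al and ah. */
static uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xffff); }     /* ax */
static uint8_t  low8(uint32_t eax)  { return (uint8_t)(eax & 0xff); }        /* al */
static uint8_t  high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xff); } /* ah */

/* Writing al replaces only the lowest byte; the other 24 bits are preserved. */
static uint32_t write_low8(uint32_t eax, uint8_t al) {
    return (eax & 0xffffff00u) | al;
}

int main(void) {
    uint32_t eax = 0x11223344;

    printf("eax=0x%08x ax=0x%04x ah=0x%02x al=0x%02x\n",
           (unsigned)eax, (unsigned)low16(eax),
           (unsigned)high8(eax), (unsigned)low8(eax));

    eax = write_low8(eax, 0xff); /* like: mov al, 0xff */
    printf("after mov al, 0xff: eax=0x%08x\n", (unsigned)eax);
    return 0;
}
</code>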

While they can be used to store any value, the 8 general registers are commonly used as follows:
  * ''eax'': accumulator, used in arithmetic operations
  * ''ebx'': base pointer in memory operations (e.g., arrays)
  * ''ecx'': loop counters
  * ''edx'': also used in arithmetic operations
  * ''esi'': source addresses in memory operations
  * ''edi'': destination addresses in memory operations
  * ''ebp'': frame base pointer
  * ''esp'': stack pointer

For convenience while working with complex data structures (e.g., structs in an array), the x86 ISA offers multiple addressing modes. With the simplest one, direct addressing, you only need to specify an absolute address, while the other modes compute the address from the values of some registers. All addressing modes supported in 32-bit protected mode are summarized by this formula:

{{ :session:addressing-modes.png?nolink&550 |}}

In Intel syntax, the previous formula translates to:

<code asm>
mov eax, [0xcafebab3]         ; direct (displacement)
mov eax, [esi]                ; register indirect (base)
mov eax, [ebp-8]              ; based (base + displacement)
mov eax, [ebx*4 + 0xdeadbeef] ; indexed (index*scale + displacement)
mov eax, [edx + ebx + 12]     ; based-indexed w/o scale (base + index + displacement)
mov eax, [edx + ebx*4 + 42]   ; based-indexed w/ scale (base + index*scale + displacement)
</code>
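All the forms above are special cases of the same computation. As a rough model in C (the function name ''effective_address'' and the register values are chosen for this example only), unused components are simply zero and a scale of 1 means "no scaling":

<code c>
#include <stdint.h>
#include <stdio.h>

/* base + index*scale + displacement, wrapping modulo 2^32 like the hardware. */
static uint32_t effective_address(uint32_t base, uint32_t index,
                                  uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}

int main(void) {
    /* mov eax, [ebp-8] with ebp = 0x2000: a negative displacement wraps around */
    uint32_t a = effective_address(0x2000, 0, 1, (uint32_t)-8);

    /* mov eax, [edx + ebx*4 + 42] with edx = 0x1000 and ebx = 3 */
    uint32_t b = effective_address(0x1000, 3, 4, 42);

    printf("[ebp-8]            -> 0x%08x\n", (unsigned)a);
    printf("[edx + ebx*4 + 42] -> 0x%08x\n", (unsigned)b);
    return 0;
}
</code>

The scale can only be 1, 2, 4 or 8 in real instructions; the sketch does not enforce that.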

<note important>
An **effective address** is an operand that references a memory location. In NASM (Intel) syntax this consists of an expression enclosed in square brackets.
</note>

<note important>
Sometimes the assembler cannot infer the size of the operands (e.g., when storing an immediate value to memory). The programmer can remove this ambiguity by using size specifiers, resulting in instructions similar to ''mov word [eax], 0x42'' (in NASM syntax). This syntax is usually assembler-specific. Some discussions regarding this can be found [[http://www.c-jump.com/CIS77/ASM/Instructions/I77_0250_ptr_pointer.htm|here]], and [[http://stackoverflow.com/questions/13790146/x86-difference-between-byte-and-byte-ptr|on StackOverflow]].
</note>

==== Data Transfer ====

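Data transfer instructions move bytes from memory to a register, from one register to another, and from a register to memory. With the exception of string instructions such as ''movsb'', memory-to-memory transfers are not possible in a single instruction. The most common such instructions are: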
  * ''mov <dest>, <src>'': move
  * ''xchg <dest>, <src>'': exchange (swap)
  * ''movzx <dest>, <src>'': move with zero extend
  * ''movsx <dest>, <src>'': move with sign extend
  * ''movsb'': move byte from the location pointed to by ''esi'' to the location pointed to by ''edi''
  * ''movsw'': similar, move word (2 bytes)
  * ''lea <dest>, <src>'': load effective address (calculate address of <src> and load it to <dest>)

<note important>
The **''lea''** instruction writes <src> in square brackets, but it only computes the address and DOES NOT read the contents at that address, as ''mov'' does. This also makes it handy for plain arithmetic. Example: ''lea ebx, [ebx*8+ebx]'' multiplies ''ebx'' by 9.

The following two instructions are equivalent:\\ 
''lea eax, [ebx]''\\ 
''mov eax, ebx''\\ 
</note>

<note important>
''xchg eax, eax'' is equivalent to **''nop''**, which is an instruction that does nothing.
</note>
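The difference between ''movzx'' and ''movsx'' from the list above only shows when the most significant bit of the source is set. A small C model of the 8-bit to 32-bit case (the names ''movzx8'' and ''movsx8'' are made up here) could look like this:

<code c>
#include <stdint.h>
#include <stdio.h>

/* movzx: the upper 24 bits are filled with zeros. */
static uint32_t movzx8(uint8_t src) {
    return (uint32_t)src;
}

/* movsx: the upper 24 bits are filled with copies of the sign bit (bit 7). */
static uint32_t movsx8(uint8_t src) {
    return (src & 0x80) ? (0xffffff00u | src) : (uint32_t)src;
}

int main(void) {
    printf("movzx 0x7f -> 0x%08x\n", (unsigned)movzx8(0x7f));
    printf("movsx 0x7f -> 0x%08x\n", (unsigned)movsx8(0x7f));
    printf("movzx 0x80 -> 0x%08x\n", (unsigned)movzx8(0x80));
    printf("movsx 0x80 -> 0x%08x\n", (unsigned)movsx8(0x80));
    return 0;
}
</code>

Use ''movzx'' for unsigned values and ''movsx'' for signed ones; mixing them up turns, for example, the signed byte -128 (''0x80'') into 128.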

==== Control Flow ====

As a program executes, the address of the next instruction is stored in the ''eip'' register. Changing the value of this register allows control of the execution flow. Instructions directly influencing ''eip'' are:
  * ''jmp <addr>'': loads <addr> into ''eip''
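  * ''call <addr>'': pushes the return address (the address of the instruction following the ''call'') on the stack, and loads <addr> into ''eip''
  * ''ret <val>'': pops the return address from the top of the stack into ''eip'', then removes <val> more bytes from the stack (<val> is optional)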
  * ''loop <addr>'': decrements ''ecx'', and jumps to <addr> if ''ecx != 0''

<note>
Interrupts also change the execution flow of a program. This [[http://ocw.cs.pub.ro/courses/so2/cursuri/curs04|Operating Systems lecture (Romanian)]] covers the subject.
</note>

Conditional jumps act in the same way as ''jmp'', but they are taken only if a certain combination of flags is set in the ''eflags'' register. Flags are set by arithmetic (e.g., ''add'', ''sub''), logical (e.g., ''xor'', ''or''), or comparison (''cmp'', ''test'') instructions. The most common flags are:
  * ZF (zero flag): the previous result was zero
  * SF (sign flag): the most significant bit of the previous result
  * CF (carry flag): the previous operation produced a [[http://en.wikipedia.org/wiki/Carry_(arithmetic)|carry]] or borrow out of the most significant bit (unsigned overflow)
  * OF (overflow flag): the previous result does not fit the destination when interpreted as a signed value (signed overflow)

<note important>
[[http://www.unixwiz.net/techtips/x86-jumps.html|Conditional jumps]] reference.
</note>
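To make the flag definitions concrete, here is a sketch in C that computes the four flags a 32-bit ''add'' would produce. It is a model written for this page (''struct flags'' and ''add_flags'' are hypothetical names), not something read from the real ''eflags'' register.

<code c>
#include <stdint.h>
#include <stdio.h>

struct flags { int zf, sf, cf, of; };

/* Flags resulting from the 32-bit addition a + b. */
static struct flags add_flags(uint32_t a, uint32_t b) {
    uint32_t r = a + b; /* wraps modulo 2^32, like the register */
    struct flags f;
    f.zf = (r == 0);
    f.sf = (int)((r >> 31) & 1);
    f.cf = (r < a); /* unsigned wrap-around happened */
    /* signed overflow: both operands have the same sign, the result does not */
    f.of = (int)(((~(a ^ b) & (a ^ r)) >> 31) & 1);
    return f;
}

static void show(uint32_t a, uint32_t b) {
    struct flags f = add_flags(a, b);
    printf("0x%08x + 0x%08x: ZF=%d SF=%d CF=%d OF=%d\n",
           (unsigned)a, (unsigned)b, f.zf, f.sf, f.cf, f.of);
}

int main(void) {
    show(0x7fffffffu, 1); /* largest signed value + 1: signed overflow only */
    show(0xffffffffu, 1); /* largest unsigned value + 1: carry and zero */
    return 0;
}
</code>

The two calls show why both CF and OF exist: the first addition is fine for unsigned numbers but overflows for signed ones, while the second is the other way around.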

==== Arithmetic/Logic ====

Arithmetic instructions (NASM/Intel syntax):
  * ''add <dest>, <src>'': addition
  * ''sub <dest>, <src>'': subtraction
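  * ''mul <arg>'': unsigned multiplication of <arg> with the accumulator of the same size (''al'', ''ax'' or ''eax''); e.g., ''mul dh'' computes ''ax = al * dh'' and ''mul ebx'' computes ''edx:eax = eax * ebx''
  * ''imul <arg>'': signed multiplication, with the same operands as ''mul''
  * ''imul <dest>, <src>'': signed multiplication (dest = dest * src)
  * ''imul <dest>, <src>, <aux>'': signed multiplication (dest = src * aux)
  * ''div <arg>'': unsigned division of the double-width accumulator by <arg>; for a 32-bit <arg>, ''edx:eax'' is divided, the quotient goes to ''eax'' and the remainder to ''edx''
  * ''idiv <arg>'': signed division, with the same operands as ''div''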
  * ''neg <arg>'': 2's complement negation

Shifts and rotations: ''shr'', ''shl'' (logical shift right/left), ''sar'', ''sal'' (arithmetic shift right/left), ''shld'', ''shrd'' (double-shift), ''ror'', ''rol'' (rotate), ''rcr'', ''rcl'' (rotate with carry).

Logical instructions: ''and'', ''or'', ''xor'', ''not''.
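A few of these instructions behave differently from their closest C operators, so a small model may help. The sketch below assumes 32-bit operands; the function names are invented for the example, and shift counts are expected to be in the range 0..31.

<code c>
#include <stdint.h>
#include <stdio.h>

/* mul <32-bit arg>: the 64-bit product of eax and the operand ends up in edx:eax. */
static void mul32(uint32_t eax, uint32_t src, uint32_t *edx_out, uint32_t *eax_out) {
    uint64_t product = (uint64_t)eax * src;
    *eax_out = (uint32_t)product;
    *edx_out = (uint32_t)(product >> 32);
}

/* shr: logical shift right, zeros enter from the left. */
static uint32_t shr32(uint32_t v, unsigned n) {
    return v >> n;
}

/* sar: arithmetic shift right, copies of the sign bit enter from the left. */
static uint32_t sar32(uint32_t v, unsigned n) {
    uint32_t shifted = v >> n;
    if ((v & 0x80000000u) && n > 0)
        shifted |= ~((uint32_t)0xffffffffu >> n);
    return shifted;
}

/* rol: bits shifted out on the left re-enter on the right. */
static uint32_t rol32(uint32_t v, unsigned n) {
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

int main(void) {
    uint32_t hi, lo;
    mul32(0xffffffffu, 2, &hi, &lo);
    printf("mul: edx=0x%08x eax=0x%08x\n", (unsigned)hi, (unsigned)lo);
    printf("shr: 0x%08x\n", (unsigned)shr32(0x80000000u, 4));
    printf("sar: 0x%08x\n", (unsigned)sar32(0x80000000u, 4));
    printf("rol: 0x%08x\n", (unsigned)rol32(0x80000001u, 1));
    return 0;
}
</code>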

==== Function Calls ====

Function (subroutine) calls are nothing more than a convention on how parameters are passed, how the return value is passed back to the caller, and which registers can be modified by the callee. The addresses to which a function needs to return after execution are stored in a stack data structure. Other values, such as the frame base pointer and the function's local variables, are also placed on the stack. Each function will thus have a corresponding **stack frame** that it allocates immediately after it is called (function prologue), and deallocates just before returning (function epilogue). The size of this allocation (done by changing the ''esp'' register) is established at compile time, and is based on the size of the function's local variables.

<note important>
The stack frames are usually aligned to 16-byte (or 8-byte) boundaries. This is required by standards in order to accommodate some vectorized SSE instructions, which will fail if unaligned addresses are used.
</note>

While ''esp'' points to the top of the stack, ''ebp'' points to the beginning of the current frame, and its previous values are also saved on the stack. This is used to conveniently navigate the call stack in debuggers, and to address local variables relative to a fixed address (''esp'' might change during the function's execution, by executing ''push'' and ''pop'' for example).
{{ :session:stack-convention.png?nolink |}}

There are multiple calling conventions, mainly classified by who (caller or callee) is responsible for removing the parameters from the stack after the function returns. The most common are:
  * callee clean-up: **stdcall**, **fastcall** (some parameters passed in registers)
  * caller clean-up: **cdecl**, **thiscall** (C++)

<note>
Following the same calling convention is important in situations such as calling a function from a dynamic library that was precompiled, and is already present on the system.
</note>

The following subsections show these conventions in practice, for C code. Take a few minutes to dissect and understand the snippets. You can also try to reproduce the examples on your machine.

<note>
The default convention used by GCC is ''cdecl''. Using the ''stdcall'' or ''fastcall'' [[https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html|function attributes]] will force GCC to use the specified convention.
</note>
=== cdecl ===

<code c>
struct x {
    int x1, x2;
    char x3;
};

int func(struct x a, float b, void* c, int d) {
    return 42;
}

int main() {
    struct x a;
    a.x3 = '$';
    func(a, 3.14, (void*)0xdeadbeef, 1);
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 -no-pie cdecl.c -o cdecl
$ objdump -M intel -d ./cdecl
</code>

<code objdump>
080483ef <main>:
...
 8048403:       6a 01                   push   0x1
 8048405:       68 ef be ad de          push   0xdeadbeef
 804840a:       d9 80 c0 e4 ff ff       fld    DWORD PTR [eax-0x1b40]
 8048410:       8d 64 24 fc             lea    esp,[esp-0x4]
 8048414:       d9 1c 24                fstp   DWORD PTR [esp]
 8048417:       ff 75 fc                push   DWORD PTR [ebp-0x4]
 804841a:       ff 75 f8                push   DWORD PTR [ebp-0x8]
 804841d:       ff 75 f4                push   DWORD PTR [ebp-0xc]
 8048420:       e8 b6 ff ff ff          call   80483db <func>
 8048425:       83 c4 18                add    esp,0x18
...
</code>

<code objdump>
080483db <func>:
 80483db:       55                      push   ebp
 80483dc:       89 e5                   mov    ebp,esp
 80483de:       e8 4c 00 00 00          call   804842f <__x86.get_pc_thunk.ax>
 80483e3:       05 1d 1c 00 00          add    eax,0x1c1d
 80483e8:       b8 2a 00 00 00          mov    eax,0x2a
 80483ed:       5d                      pop    ebp
 80483ee:       c3                      ret
</code>


<note important>
As you can see, the arguments are placed on the stack by the caller (''main''), using multiple ''push'' instructions, and are also removed by the caller, using a single ''add esp,0x18'' instruction.
</note>
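The value ''0x18'' is simply the total size of the arguments pushed before the ''call''. The sketch below recomputes it in C, under the assumptions that every stack argument is rounded up to a multiple of 4 bytes and that pointers are 4 bytes wide, as they are with ''-m32''. The helper names are made up, and the pointer size is hard-coded so that the result is the same when the sketch itself is built as a 64-bit program.

<code c>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct x {
    int32_t x1, x2;
    char x3;
};

/* Round a size up to the next multiple of 4 (one 32-bit stack slot). */
static size_t stack_slot(size_t size) {
    return (size + 3) & ~(size_t)3;
}

/* Bytes pushed for: func(struct x a, float b, void* c, int d) */
static size_t cdecl_arg_bytes(void) {
    return stack_slot(sizeof(struct x))  /* a: 9 bytes of data plus padding */
         + stack_slot(sizeof(float))     /* b */
         + stack_slot(4)                 /* c: 32-bit pointer (assumption) */
         + stack_slot(sizeof(int32_t));  /* d */
}

int main(void) {
    size_t total = cdecl_arg_bytes();
    printf("argument bytes: %zu (0x%zx)\n", total, total);
    return 0;
}
</code>

The three ''push DWORD PTR [ebp-...]'' instructions in the listing account for the 12 bytes of the structure; the float, the pointer and the integer take 4 bytes each.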

=== stdcall ===

<code c>
struct x {
    int x1, x2;
    char x3;
};

__attribute__((stdcall))
int func(struct x a, float b, void* c, int d) {
    return 42;
}

int main() {
    struct x a;
    a.x3 = '$';
    func(a, 3.14, (void*)0xdeadbeef, 1);
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 -no-pie stdcall.c -o stdcall
$ objdump -M intel -d ./stdcall
</code>

<code objdump>
080483f1 <main>:
...
 8048405:       6a 01                   push   0x1
 8048407:       68 ef be ad de          push   0xdeadbeef
 804840c:       d9 80 c0 e4 ff ff       fld    DWORD PTR [eax-0x1b40]
 8048412:       8d 64 24 fc             lea    esp,[esp-0x4]
 8048416:       d9 1c 24                fstp   DWORD PTR [esp]
 8048419:       ff 75 fc                push   DWORD PTR [ebp-0x4]
 804841c:       ff 75 f8                push   DWORD PTR [ebp-0x8]
 804841f:       ff 75 f4                push   DWORD PTR [ebp-0xc]
 8048422:       e8 b4 ff ff ff          call   80483db <func>
...
</code>

<code objdump>
080483db <func>:
 80483db:       55                      push   ebp
 80483dc:       89 e5                   mov    ebp,esp
 80483de:       e8 4b 00 00 00          call   804842e <__x86.get_pc_thunk.ax>
 80483e3:       05 1d 1c 00 00          add    eax,0x1c1d
 80483e8:       b8 2a 00 00 00          mov    eax,0x2a
 80483ed:       5d                      pop    ebp
 80483ee:       c2 18 00                ret    0x18
</code>

<note important>
As you can see in the output, the arguments are placed on the stack by the caller (''main'') and are removed by the callee (''func''), using the ''ret 0x18'' instruction.
</note>

=== fastcall ===

<code c>
__attribute__((fastcall))
int func(int a, int b, int c, int d) {
    return 42;
}

int main() {
    func(1, 2, 3, 4);
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 -no-pie fastcall.c -o fastcall
$ objdump -M intel -d ./fastcall
</code>

<code objdump>
080483fa <main>:
 80483fa:       55                      push   ebp
 80483fb:       89 e5                   mov    ebp,esp
 80483fd:       e8 1f 00 00 00          call   8048421 <__x86.get_pc_thunk.ax>
 8048402:       05 fe 1b 00 00          add    eax,0x1bfe
 8048407:       6a 04                   push   0x4
 8048409:       6a 03                   push   0x3
 804840b:       ba 02 00 00 00          mov    edx,0x2
 8048410:       b9 01 00 00 00          mov    ecx,0x1
 8048415:       e8 c1 ff ff ff          call   80483db <func>
...
</code>

<code objdump>
080483db <func>:
 80483db:       55                      push   ebp
 80483dc:       89 e5                   mov    ebp,esp
 80483de:       83 ec 08                sub    esp,0x8
 80483e1:       e8 3b 00 00 00          call   8048421 <__x86.get_pc_thunk.ax>
 80483e6:       05 1a 1c 00 00          add    eax,0x1c1a
 80483eb:       89 4d fc                mov    DWORD PTR [ebp-0x4],ecx
 80483ee:       89 55 f8                mov    DWORD PTR [ebp-0x8],edx
 80483f1:       b8 2a 00 00 00          mov    eax,0x2a
 80483f6:       c9                      leave  
 80483f7:       c2 08 00                ret    0x8
</code>

<note important>
The first two arguments are passed in registers (''ecx'' and ''edx'') and the rest are pushed on the stack by the caller (''main''). The arguments on the stack are removed by the callee, using the ''ret 0x8'' instruction. Other compilers might pass more arguments in registers.
</note>

==== System calls ====

Syscalls are the interface that allows user applications to request services from the OS kernel, such as reading the disk, starting new processes, or managing existing ones. Just like function calls, syscalls are just a set of conventions on how to pass arguments to a kernel function. The mechanism is invoked by triggering an interrupt (**''int 0x80''**) which will call the kernel's syscall dispatcher, which, in turn, will call the syscall based on the ''eax'' register. The conventions for invoking a syscall on Linux are:
  * ''eax'' contains the syscall ID
  * parameters are passed in ''ebx'', ''ecx'', ''edx'', ''esi'', ''edi'', ''ebp'' (in this order)
  * the kernel is responsible for saving and restoring all registers, except ''eax'', which holds the return value

<note important>
Syscalls are not usually invoked directly, but through wrappers in ''libc''. You can read about how this is implemented in [[http://lwn.net/Articles/534682/|this LWN article]].
</note>
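From C, the closest thing to issuing the interrupt by hand is libc's generic ''syscall()'' function, which loads the syscall number and the arguments into the registers expected on the current architecture and then traps into the kernel. The sketch below is Linux-specific; ''raw_write'' is a name invented for the example, and on a 64-bit build the registers and the trap instruction differ from the 32-bit convention listed above.

<code c>
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke the write syscall by number instead of calling the write() wrapper. */
static long raw_write(int fd, const char *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}

int main(void) {
    const char msg[] = "Hello from a raw syscall\n";
    long written = raw_write(1, msg, strlen(msg));

    /* On success the kernel returns the number of bytes written. */
    return written == (long)strlen(msg) ? 0 : 1;
}
</code>

Running the program under ''strace'' shows the resulting ''write'' call and its return value.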

<note important>
**Other useful references:**
  * [[https://syscalls.kernelgrok.com/|Linux syscall table]] with ID, source code, and parameters.
  * [[http://ocw.cs.pub.ro/courses/so2/cursuri/curs02|Operating Systems lecture (Romanian)]] discussing syscall theory and the Linux implementation.
  * [[http://www.linuxjournal.com/article/3326|Implementing Linux syscalls]]

</note>
==== Compiler Patterns ====

Finally, let's take a look at some common C language constructs, and how they are compiled into machine code by GCC. You are encouraged to try other constructs too.


=== Compiler Explorer ===

You can try out Compiler Explorer at http://gcc.godbolt.org/ to see how each line is translated into instructions.
Check this example out: http://goo.gl/gVeH5p
/*
<note important> Due to a bug, you need to click ''Colorise'' two times to enable coloring adnotations
</note>
*/
=== function prologue ===

<code objdump>
 80483ed:	55                   	push   ebp
 80483ee:	89 e5                	mov    ebp,esp
 80483f0:	83 ec 08             	sub    esp,0x8
</code>

=== function epilogue ===

<code objdump>
 80483fe:	c9                   	leave  
 80483ff:	c2 08 00             	ret    0x8
</code>

=== for loop ===

<code c>
int main() {
    int x = 1000;
    for (int i = 1; i < 10; i++) {
        x++;
    }
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 for.c -o for
$ objdump -M intel -d ./for
</code>

<code objdump>
080483ed <main>:
...
 80483f3:	c7 45 f8 e8 03 00 00 	mov    DWORD PTR [ebp-0x8],0x3e8
 80483fa:	c7 45 fc 01 00 00 00 	mov    DWORD PTR [ebp-0x4],0x1
 8048401:	eb 08                	jmp    804840b <main+0x1e>
 8048403:	83 45 f8 01          	add    DWORD PTR [ebp-0x8],0x1
 8048407:	83 45 fc 01          	add    DWORD PTR [ebp-0x4],0x1
 804840b:	83 7d fc 09          	cmp    DWORD PTR [ebp-0x4],0x9
 804840f:	7e f2                	jle    8048403 <main+0x16>
...
</code>
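The structure of the listing becomes easier to see when the loop is rewritten with ''goto'', one C statement per instruction. This is a hand-made transcription for illustration (''loop_as_compiled'' and the labels are not names produced by GCC); note that the condition ''i < 10'' was compiled as "''i <= 9''" (''cmp ...,0x9'' followed by ''jle'').

<code c>
#include <stdio.h>

/* The for loop, as written in the C source. */
static int loop_as_written(void) {
    int x = 1000;
    for (int i = 1; i < 10; i++) {
        x++;
    }
    return x;
}

/* The same loop, following the layout of the disassembly. */
static int loop_as_compiled(void) {
    int x = 1000;   /* mov DWORD PTR [ebp-0x8],0x3e8 */
    int i = 1;      /* mov DWORD PTR [ebp-0x4],0x1   */
    goto check;     /* jmp 804840b                   */
body:
    x++;            /* add DWORD PTR [ebp-0x8],0x1   */
    i++;            /* add DWORD PTR [ebp-0x4],0x1   */
check:
    if (i <= 9)     /* cmp DWORD PTR [ebp-0x4],0x9   */
        goto body;  /* jle 8048403                   */
    return x;
}

int main(void) {
    printf("as written:  %d\n", loop_as_written());
    printf("as compiled: %d\n", loop_as_compiled());
    return 0;
}
</code>

Both functions return the same value; the compiler only moved the condition to the bottom of the loop and jumps to it first.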

=== while loop ===

<code c>
int main() {
    int x = 1000, i = 42;
    while (--i > 0) {
        x--;
    }
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 while.c -o while
$ objdump -M intel -d ./while
</code>

<code objdump>
080483ed <main>:
...
 80483f3:	c7 45 f8 e8 03 00 00 	mov    DWORD PTR [ebp-0x8],0x3e8
 80483fa:	c7 45 fc 2a 00 00 00 	mov    DWORD PTR [ebp-0x4],0x2a
 8048401:	eb 04                	jmp    8048407 <main+0x1a>
 8048403:	83 6d f8 01          	sub    DWORD PTR [ebp-0x8],0x1
 8048407:	83 6d fc 01          	sub    DWORD PTR [ebp-0x4],0x1
 804840b:	83 7d fc 00          	cmp    DWORD PTR [ebp-0x4],0x0
 804840f:	7f f2                	jg     8048403 <main+0x16>
...
</code>

=== nested fors with break and continue ===

<code c>
int main() {
    int x = 1000, i, j;
    for (i = 1; i < 10; i++) {
        for (j = 1; j < 4; j++) {
            if (x == 42)
                break;
        }
        if (i == 3)
            continue;
    }
    return 0;
}
</code>

<code text>
$ gcc -O0 -m32 nested.c -o nested
$ objdump -M intel -d ./nested
</code>

<code objdump>
080483ed <main>:
...
 80483f3:	c7 45 fc e8 03 00 00 	mov    DWORD PTR [ebp-0x4],0x3e8
 80483fa:	c7 45 f4 01 00 00 00 	mov    DWORD PTR [ebp-0xc],0x1
 8048401:	eb 26                	jmp    8048429 <main+0x3c>
 8048403:	c7 45 f8 01 00 00 00 	mov    DWORD PTR [ebp-0x8],0x1
 804840a:	eb 0c                	jmp    8048418 <main+0x2b>
 804840c:	83 7d fc 2a          	cmp    DWORD PTR [ebp-0x4],0x2a
 8048410:	75 02                	jne    8048414 <main+0x27>
 8048412:	eb 0a                	jmp    804841e <main+0x31>
 8048414:	83 45 f8 01          	add    DWORD PTR [ebp-0x8],0x1
 8048418:	83 7d f8 03          	cmp    DWORD PTR [ebp-0x8],0x3
 804841c:	7e ee                	jle    804840c <main+0x1f>
 804841e:	83 7d f4 03          	cmp    DWORD PTR [ebp-0xc],0x3
 8048422:	75 01                	jne    8048425 <main+0x38>
 8048424:	90                   	nop
 8048425:	83 45 f4 01          	add    DWORD PTR [ebp-0xc],0x1
 8048429:	83 7d f4 09          	cmp    DWORD PTR [ebp-0xc],0x9
 804842d:	7e d4                	jle    8048403 <main+0x16>
...
</code>
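The same kind of ''goto'' transcription shows what became of ''break'' and ''continue''. The sketch below follows the jumps in the listing; the function name, the labels and the ''count'' variable are additions made for this example (''count'' only exists so that the behaviour can be observed).

<code c>
#include <stdio.h>

/* Returns how many times the inner loop body was executed. */
static int nested_as_compiled(int x) {
    int i, j, count = 0;

    i = 1;                  /* mov DWORD PTR [ebp-0xc],0x1 */
    goto outer_check;       /* jmp 8048429 */
outer_body:
    j = 1;                  /* mov DWORD PTR [ebp-0x8],0x1 */
    goto inner_check;       /* jmp 8048418 */
inner_body:
    count++;                /* instrumentation, not in the original */
    if (x != 42)            /* cmp DWORD PTR [ebp-0x4],0x2a */
        goto inner_next;    /* jne 8048414 */
    goto after_inner;       /* break: jmp 804841e */
inner_next:
    j++;                    /* add DWORD PTR [ebp-0x8],0x1 */
inner_check:
    if (j <= 3)             /* cmp ...,0x3 ; jle 804840c */
        goto inner_body;
after_inner:
    if (i != 3)             /* cmp ...,0x3 ; jne 8048425 */
        goto outer_next;
    ;                       /* continue: just a nop, the increment follows anyway */
outer_next:
    i++;                    /* add DWORD PTR [ebp-0xc],0x1 */
outer_check:
    if (i <= 9)             /* cmp ...,0x9 ; jle 8048403 */
        goto outer_body;
    return count;
}

int main(void) {
    printf("x = 1000: inner body ran %d times\n", nested_as_compiled(1000));
    printf("x = 42:   inner body ran %d times\n", nested_as_compiled(42));
    return 0;
}
</code>

''break'' turns into a jump just past the inner loop, while ''continue'' at the end of the outer body has nothing to skip and is reduced to a ''nop''.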


===== Challenges =====

==== 01. Execve ====

=== Simple printing ===

Use assembly to write a program that receives N command line parameters. If the 1st parameter starts with ''.'' (//dot//) (such as ''./ping 8.8.8.8'') the program prints the message ''FAILED''. If the first parameter **doesn't** start with ''.'' (//dot//) (such as ''/bin/ping 8.8.8.8'') the program prints the message ''WORKS''.

<note important>
You can find the skeleton for this task in ''01-challenge-execve/src''.
</note>

<note tip>
GCC will take care of the boilerplate that actually places the command line parameters on the stack before calling your ''main()''.
</note>

<code text>
$ ./execve ./ping 8.8.8.8 => prints FAILED message
$ ./execve /bin/ping 8.8.8.8 => prints WORKS message
</code>

=== Simple syscall ===

Update the above program so that it receives N command line parameters and dispatches them to the ''execve'' syscall. If the 1st parameter starts with ''.'' (//dot//) (such as ''./ping 8.8.8.8'') the program should NOT call ''execve'' and instead print an error message.

You can use libc's ''printf()'' or ''puts()'' for the error message. You can assume the command line parameters are already on the stack, and you can generate the boilerplate code that takes care of this by linking with ''gcc'' as opposed to ''ld''.

<note tip>
The equivalent C call would be:
<code c>
execve(argv[1], argv+1, NULL);
</code>
You have to translate that in assembly.
</note>

<note tip>
The syscall number for ''execve'' is ''11''. Check the [[http://man7.org/linux/man-pages/man2/execve.2.html|man page]] for the other arguments.
</note>
==== 02. Looping math ====

Use assembly to write a program that iterates through a statically allocated string (use the ''.data'' section), and calls a function that replaces each letter based on the following formula: ''NEW_LETTER = 33 + ((OLD_LETTER * 42 / 3 + 13) % 94)''. Print the new string at the end.

<note important>
You can find the skeleton for this task in ''02-challenge-looping-math/src''.
</note>

<note tip>
Follow the multiplication and division operations described [[https://en.wikibooks.org/wiki/X86_Assembly/Arithmetic|here]].
</note>

<note tip>
If the string you use is ''call denied!'' the result is ''tX66v$2Rj2$&''.
</note>
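To check the output of your assembly program for other strings, you can compare it against a reference written in C. The sketch below is one possible reading of the formula, using integer arithmetic throughout; the function names are made up for the example and are not part of the skeleton.

<code c>
#include <stdio.h>
#include <string.h>

/* NEW_LETTER = 33 + ((OLD_LETTER * 42 / 3 + 13) % 94) */
static char transform_letter(char old) {
    int value = (unsigned char)old;
    return (char)(33 + ((value * 42 / 3 + 13) % 94));
}

/* Replace every letter of the string in place. */
static void transform_string(char *s) {
    for (size_t i = 0; i < strlen(s); i++) {
        s[i] = transform_letter(s[i]);
    }
}

int main(void) {
    char text[] = "call denied!";
    transform_string(text);
    puts(text);
    return 0;
}
</code>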
==== 03. Call secret function ====

The binary file ''03-challenge-call-secret/src/call-secret'' needs to call a specific function. However, because of a nasty "voice", the specific function doesn't get called. Please fix it and find out the flag.

<note tip>
You may overwrite "unwanted" content with the ''NOP'' instruction. You need to find out the ''NOP'' instruction for x86.
</note>

<note tip>
To edit a binary, you can use [[https://vim.fandom.com/wiki/Hex_dump#Editing_binary_files|vim + xxd]] or [[https://www.forensicswiki.org/wiki/Bless|Bless]].

</note>
==== 04. No exit ====

The binary file ''04-challenge-no-exit/src/no-exit'' needs to call a specific function. However, because of a nasty exit, the specific function doesn't get called. Please fix it and find out the flag.


<note tip>
You need to call the ''secret()'' function instead of the ''exit()'' function. Find out the offset and issue the appropriate ''call'' instruction.

The ''secret()'' function will use the argument that has been "appropriately" provided to the ''exit()'' call.
</note>
==== 05. Funny convention ====

The binary ''05-challenge-funny-convention/src/funny'' is already dynamically linked with a missing library (''libfunny.so''), that you'll have to recreate in assembly. The library should contain a wrapper for the ''write'' syscall called ''leet_write()''. The original library was using a funny calling convention, slightly different from the standard one. Figure out the convention, write the wrapper in NASM, and compile the library. Test by running the provided binary.

The library is position independent, and exposes 2 symbols: the function, and some global variable. You can find the skeleton for this task in the directory ''05-challenge-funny-convention/src''.

You should be able to run the provided binary as long as the correct library is in ''./''.

<note important>
The library exports ''count_param'' as a global symbol, thus it will reside inside the caller's address space as opposed to the .data section of the library. Because of this, the library cannot access the data by using ''count_param'' and needs to use ''count_param wrt ..sym'' instead.
A more detailed explanation can be found [[https://www.nasm.us/doc/nasmdoc9.html#section-9.2.4|here]].
</note>
==== Extra: 06. Obfuscation ====

Write a program that does a completely different thing than what ''objdump'' will show by jumping into the middle of an instruction. After the jump, the processor will "see" another stream of valid instructions.


<note tip>
You can find the skeleton for this task in the directory ''06-challenge-obfuscation/src''.
</note>

<note important>
**HINT:** [[http://reverseengineering.stackexchange.com/questions/1531/what-is-overlapping-instructions-obfuscation|Overlapping instructions]]
</note>