Linux Reverse-Engineering

GCC 2022

chef chick

What's been covered

x86-64 assembly, compilation, basic instructions

How do we distribute/store such code?

We need a file format

Any naive attempts?

We could have a file containing nothing but code!

But then there is nowhere to describe sections, resources, metadata, etc.

Let's welcome the ELF
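
To make the idea of a file format concrete, here is a minimal sketch of how a tool might recognise an ELF file from its first few bytes. The function names `is_elf` and `elf_class` are hypothetical; the four magic bytes and the class byte at offset 4 are part of the ELF header layout.

```c
#include <stddef.h>
#include <string.h>

/* Every ELF file starts with these four magic bytes: 0x7f 'E' 'L' 'F'. */
static const unsigned char ELF_MAGIC[4] = {0x7f, 'E', 'L', 'F'};

/* Returns 1 if buf (len bytes read from the start of a file)
   begins with the ELF magic, 0 otherwise. */
int is_elf(const unsigned char *buf, size_t len) {
    return len >= sizeof ELF_MAGIC &&
           memcmp(buf, ELF_MAGIC, sizeof ELF_MAGIC) == 0;
}

/* Byte 4 of the header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.
   Returns 32, 64, or 0 if the buffer is not a recognisable ELF header. */
int elf_class(const unsigned char *buf, size_t len) {
    if (!is_elf(buf, len) || len < 5) return 0;
    if (buf[4] == 1) return 32;
    if (buf[4] == 2) return 64;
    return 0;
}
```

A raw file of machine code has no such header, so a loader has nothing to check before jumping into it.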

How can we analyse such programs?

...Disassemblers

                 Ghidra   Binary Ninja    IDA
Free?            Yes!     Free-ish        No/Yes
Decompiler?      Yes!     Kinda           Yes/No
*NSA backdoor?   Yes!     Hopefully not
*Ghidra and IDA are fine; this is just a joke.
helloworld.c

#include <stdio.h>

int main(){
    puts("Hello World!");
    return 0;
}
					

Let's take a look at this compiled ELF

Deeper dive into functions

Functions make heavy use of the program stack

push

Syntax:
push <op>
Functionality:
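Decrements rsp by 8, then stores the QWORD in <op> at the new top of the stack
Sample Code

mov     rax, 0x100
mov     rbx, 0x200
push    rax             # rsp -= 8; [rsp] = 0x100
push    rbx             # rsp -= 8; [rsp] = 0x200 (new top of stack)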
							

pop

Syntax:
pop <dst>
Functionality:
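Loads the QWORD at the top of the stack into <dst>, then increments rsp by 8
Sample Code

mov     rax, 0x100
mov     rbx, 0x200
push    rax             # stack (top first): 0x100
push    rbx             # stack (top first): 0x200, 0x100
pop     rcx             # rcx = 0x200
pop     rdx             # rdx = 0x100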
							

push/pop
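
The two instructions can be modelled in a few lines of C. This is only a toy sketch: `ToyStack`, `toy_push` and `toy_pop` are made-up names, the stack is a small array, and there are no bounds checks.

```c
#include <stdint.h>

#define STACK_BYTES 256

/* A toy model of the stack: a block of memory plus a stack pointer
   stored as a byte offset into that memory. */
typedef struct {
    uint64_t mem[STACK_BYTES / 8];
    uint64_t rsp;   /* starts at the highest offset; the stack grows downwards */
} ToyStack;

void toy_init(ToyStack *s) { s->rsp = STACK_BYTES; }

/* push: move rsp down by 8 bytes, then store the value at the new top. */
void toy_push(ToyStack *s, uint64_t value) {
    s->rsp -= 8;
    s->mem[s->rsp / 8] = value;
}

/* pop: load the value at the top, then move rsp back up by 8 bytes. */
uint64_t toy_pop(ToyStack *s) {
    uint64_t value = s->mem[s->rsp / 8];
    s->rsp += 8;
    return value;
}
```

Pushing 0x100 and then 0x200 and popping twice gives back 0x200 first and 0x100 second: last in, first out.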

How does this enable functions?

functions

nested functions
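
One way to picture nested calls: `call` pushes the address of the next instruction and jumps, and `ret` pops that address back into rip. The sketch below models only that behaviour; `ToyCpu`, `toy_call` and `toy_ret` are hypothetical names, any addresses passed in are made up, and there are no bounds checks.

```c
#include <stdint.h>

#define MAX_DEPTH 16

/* A toy CPU that tracks only the instruction pointer and
   the return addresses currently on the stack. */
typedef struct {
    uint64_t return_addrs[MAX_DEPTH];
    int depth;      /* number of return addresses currently stored */
    uint64_t rip;   /* address of the instruction being executed */
} ToyCpu;

/* call: push the address of the next instruction, then jump to the target. */
void toy_call(ToyCpu *cpu, uint64_t target, uint64_t next_instruction) {
    cpu->return_addrs[cpu->depth++] = next_instruction;
    cpu->rip = target;
}

/* ret: pop the most recently pushed return address back into rip. */
void toy_ret(ToyCpu *cpu) {
    cpu->rip = cpu->return_addrs[--cpu->depth];
}
```

Because return addresses come off in the reverse order they went on, the innermost function always returns to its own caller first.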

Deconstructing a function


int foo(int a, int b){
    puts("Inside function (foo)");
    return a + b;
}
					

ABI

Return value


int foo(...){
    ...
    return a + b;
}
					
The return value is stored in the rax register. A 128-bit value is split: the lower 64 bits go in rax and the upper 64 bits go in rdx.
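
A small sketch of the 128-bit case, assuming GCC or Clang on a 64-bit target (`unsigned __int128` is a compiler extension, and the helper names are made up). Compiling `make_wide` and reading the disassembly should show the two halves leaving in rax and rdx.

```c
#include <stdint.h>

/* unsigned __int128 is a GCC/Clang extension available on 64-bit targets. */
typedef unsigned __int128 u128;

/* Builds a 128-bit value from two 64-bit halves. */
u128 make_wide(uint64_t high, uint64_t low) {
    return ((u128)high << 64) | low;
}

/* The half that is returned in rax. */
uint64_t low_half(u128 v) { return (uint64_t)v; }

/* The half that is returned in rdx. */
uint64_t high_half(u128 v) { return (uint64_t)(v >> 64); }
```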

Return value

C source


int foo(){
    return 1000;
}
							

x86-64


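foo:
        push    rbp             # prologue: save the caller's frame pointer
        mov     rbp, rsp        # set up this function's frame
        mov     eax, 1000       # return value goes in eax (low 32 bits of rax)
        pop     rbp             # epilogue: restore the caller's frame pointer
        ret                     # pop the return address and jump to it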
							

Parameters/Arguments


int foo(int a, int b, ...){
    ...
}
					
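Under the System V AMD64 ABI used on Linux, the first six integer or pointer arguments are passed in the registers rdi, rsi, rdx, rcx, r8, r9, in that order. Any further arguments are passed on the stack, pushed in reverse (right-to-left) order.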
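
As a quick reference while reversing, the rule can be written as a lookup. This sketch covers integer and pointer arguments only (floating-point arguments use the xmm registers instead), and `arg_location` is a hypothetical helper name.

```c
#include <stddef.h>

/* Integer/pointer argument registers, in order, under the System V AMD64 ABI. */
static const char *const ARG_REGS[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};

/* Returns where the integer argument at zero-based position `index` is passed:
   a register name for the first six, "stack" for everything after that. */
const char *arg_location(size_t index) {
    size_t n = sizeof ARG_REGS / sizeof ARG_REGS[0];
    return index < n ? ARG_REGS[index] : "stack";
}
```

For `foo(1,2,3,4,5)`, positions 0 to 4 map to rdi, rsi, rdx, rcx and r8; a 32-bit `int` uses the low halves edi, esi, edx, ecx and r8d.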

Parameters/Arguments

C source


int main() {
    foo(1,2,3,4,5);
}

int foo(int a, int b, int c, int d, int e){
    return 1;
}
							

x86-64


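main:
        push    rbp
        mov     rbp, rsp
        mov     r8d, 5          # 5th argument (e)
        mov     ecx, 4          # 4th argument (d)
        mov     edx, 3          # 3rd argument (c)
        mov     esi, 2          # 2nd argument (b)
        mov     edi, 1          # 1st argument (a)
        mov     eax, 0          # al = 0: no vector-register arguments (foo has no prototype in scope here)
        call    foo
        mov     eax, 0          # main's own return value
        pop     rbp
        ret
foo:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi      # store a on the stack
        mov     DWORD PTR [rbp-8], esi      # store b
        mov     DWORD PTR [rbp-12], edx     # store c
        mov     DWORD PTR [rbp-16], ecx     # store d
        mov     DWORD PTR [rbp-20], r8d     # store e
        mov     eax, 1          # return 1
        pop     rbp
        ret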
							

Let's reverse some basic functions

Mini Challenge #1

Mini Challenge #2

Mini Challenge #3

Approaching real binaries!

How to find main()


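STATIC int LIBC_START_MAIN (int (*main) (int, char **, char **
                                         MAIN_AUXVEC_DECL),
                            int argc,
                            char **argv,
                            ...
                            );

The first argument to __libc_start_main is a pointer to main, so in _start the address loaded into rdi just before the call is main.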

Let's begin!

Using Binary Ninja!