In [None]:
%run -i ../python/common.py
UC_SKIPTERMS=True
%run -i ../python/ln_preamble.py

o=bashCmds('''[[ -d ~/simpleasm ]] && rm -rf ~/simpleasm 
mkdir ~/simpleasm
cp ../src/setup.gdb ../src/add.s ../src/int3.s ~/simpleasm
cp ../src/simpleasm.mk ~/simpleasm/Makefile
cp ../src/popcnt_bb.s ~/simpleasm/popcnt.s
cp ../src/exit.S ~/simpleasm/exitfancy.S
cp ../src/exit_bb_bb.s ~/simpleasm/exit.s''', cwd=os.getcwd())

# SLS Lecture 8 :  Writing some simple assembly programs

Spend some time writing some very simple assembly programs and learn to use the debugger so that we have enough skills to explore how things work. We will repeat various things in more detail in future lectures.

- Write `popcnt` in assembly code
  - use gdb to play with the popcnt program
- Write a simple `add` in assembly code
  - use gdb to play with the add program
    - using the cpu as a glorified calculator 
      - first pass at CPU support for "numbers" 
- What happens if we let our programs continue
  - how do we successfully "halt/end" our execution 
    - `int3` trap 
       - tells OS to return control to debugger   
    - more generally how can we make a Kernel/System Call
  - revisit `add` programs adding exits  
    - `int3`
    - `exit` syscall
- Implicitly use our shell, editor, Make and Git knowledge to do the above

## Writing a `popcnt` assembly program

- Write a one-instruction assembly program
  1. first using the `.byte` directive
  2. then using the Intel assembly instruction
- Use gdb to explore how this instruction works
  - learn to use gdb to set register values
  - and how to execute and re-execute an instruction

### Setup
Skipping `git` for time.

1. make a directory for our work:
   - `mkdir simpleasm`
   - `cd simpleasm`
2. open emacs and write `popcnt.s`:
   - `emacs popcnt.s`

In [None]:
display(Markdown(FileCodeBox(
    file="../src/popcnt_bb.s", 
    lang="gas", 
    title="<b>CODE: asm - The 'popcnt' assembly program",
    h="100%", 
    w="107em"
)))

In [None]:
display(Markdown('''
Here is a fully commented version of the same code.
'''))
display(Markdown(FileCodeBox(
    file="../src/popcnt.s", 
    lang="gas", 
    title="<b>CODE: asm - The commented 'popcnt' assembly program",
    h="100%", 
    w="107em"
)))

In [None]:
display(showET("Editor"))

We can use the `.byte` directive to set the values in memory to anything we like,
e.g. the five bytes that encode our instruction:

``` gas
     .byte 0xF3, 0x48, 0x0F, 0xB8, 0xC3  
```

But of course the real value of the assembler is that we can simply write the instruction by name and let it produce those bytes for us:

``` gas
       popcnt rax, rbx          
```
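
When stepping this instruction in gdb it helps to have an independent way to work out the answer you expect to see in `rax`. The following is a minimal Python sketch, not part of the lecture sources: the helper name `popcnt64` is made up for illustration, and it only models the result value (the number of 1 bits in the 64-bit source), not the CPU flags.

```python
MASK64 = (1 << 64) - 1  # registers such as rbx hold 64 bits

def popcnt64(value):
    """Return the number of 1 bits in the low 64 bits of value."""
    return bin(value & MASK64).count("1")

# Values you might load into rbx from gdb, e.g. with: set $rbx = 0xb
for rbx in (0x0, 0x1, 0xb, 0xff, MASK64):
    print(f"rbx = {rbx:#x} -> expected rax = {popcnt64(rbx)}")
```

Set `rbx` to one of these values in gdb, step the instruction, and compare `rax` with the value printed here.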


In [None]:
display(showBT())

In [None]:
display(Markdown(FileCodeBox(
    file="popcnt_build.sh", 
    lang="shell", 
#    title="<b>NOTES: on building popcnt", 
    h="100%", 
    w="100%")))

In [None]:
display(showDT("Debugger"))

In [None]:
display(Markdown(FileCodeBox(
    file="popcnt_gdb.txt", 
    lang="shell", 
    title="", 
    h="100%", 
    w="100%")))

## Writing an `add` assembly program

- reinforce the steps for creating and debugging an assembly program
  - begin to explore CPU support for working with "numbers"
     - we will get into how numbers "work" later
     - learn enough so that you can poke around yourself
  - get an idea of cool things that Intel instructions can do
  - try adding some variables in memory to our program

- Let's work with the `add` instruction in a similar way to what we did with `popcnt`
- explore the results of adding with binary, hex, unsigned and signed values
- explore overflow
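
The same register contents can be read as binary, hex, unsigned, or signed, and a 64-bit `add` keeps only the low 64 bits of the true sum. The sketch below models this in Python as a way to predict what gdb will show. It is an illustration under simplifying assumptions: the helper names `add64` and `as_signed` are hypothetical, and only the carry out of bit 63 is modelled, not the full set of CPU flags.

```python
MASK64 = (1 << 64) - 1

def add64(a, b):
    """Model a 64-bit add: return (result, carry_out)."""
    total = (a & MASK64) + (b & MASK64)
    return total & MASK64, total > MASK64

def as_signed(value):
    """Reinterpret a 64-bit pattern as a two's complement signed number."""
    return value - (1 << 64) if value & (1 << 63) else value

result, carry = add64(0xdeadbeefdeadbeef, 1)
print(f"hex:      {result:#x}")
print(f"binary:   {result:#066b}")
print(f"unsigned: {result}")
print(f"signed:   {as_signed(result)}")
print(f"carry:    {carry}")
```

In gdb the corresponding views of a register are `p/x $rax`, `p/t $rax`, `p/u $rax` and `p/d $rax`. Trying other operand pairs with `add64` is a quick way to guess what an overflow will look like before running it on the CPU.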

- then make the program a little more complex:
``` gas
  movabs rbx, 0xdeadbeefdeadbeef
  mov    rax, 1
  add    rax, rbx
```

- let's use some more cool features of the Intel instruction set
``` gas
  rdrand rbx                
  mov    rax, 1
  add    rax, rbx
  popcnt rbx, rax
```
- let's get a brief glimpse of how to use memory locations for the values
``` gas
        .intel_syntax noprefix
        .data
x:       .quad 142
y:       .quad 4200
sum:     .quad 0        # reserve 8 bytes for the result (a bare .quad emits no bytes)

        .text
        .global _start
_start:
        mov rax, QWORD PTR x
        add rax, QWORD PTR y
        mov QWORD PTR sum, rax
```
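
To make the memory version concrete, here is a small Python model of the `.data` section and the three instructions above. It is a sketch under stated assumptions: `x`, `y` and `sum` are taken to be consecutive 8-byte little-endian quads (with 8 bytes reserved for `sum`), and the names `load_quad`, `store_quad` and the offsets are invented for this example.

```python
import struct

# Assumed layout: x, y and sum are consecutive 8-byte quads in .data
data = bytearray(struct.pack("<qqq", 142, 4200, 0))
X, Y, SUM = 0, 8, 16  # byte offsets of each label within .data

def load_quad(mem, offset):
    """Read 8 little-endian bytes as a signed 64-bit value."""
    return struct.unpack_from("<q", mem, offset)[0]

def store_quad(mem, offset, value):
    """Write value as 8 little-endian bytes at offset."""
    struct.pack_into("<q", mem, offset, value)

rax = load_quad(data, X)      # mov rax, QWORD PTR x
rax += load_quad(data, Y)     # add rax, QWORD PTR y
store_quad(data, SUM, rax)    # mov QWORD PTR sum, rax

print(data[SUM:SUM + 8].hex(" "))  # raw bytes at sum, least significant first
print(load_quad(data, SUM))        # the same 8 bytes read back as a number
```

In gdb, something like `x/3gd &x` should show the three quads as decimal numbers and `x/8xb &sum` the individual bytes, which you can compare against the two lines printed here.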

In [None]:
display(showET("Editor"))

In [None]:
display(Markdown(FileCodeBox(
    file="../src/add.s", 
    lang="gas", 
    title="",
    h="100%", 
    w="100%"
)))

In [None]:
display(showBT())

In [None]:
display(Markdown(FileCodeBox(
    file="add_build.sh", 
    lang="shell", 
#    title="<b>NOTES: on building add", 
    h="100%", 
    w="100%")))

In [None]:
display(showDT())

In [None]:
display(Markdown(FileCodeBox(
    file="add_gdb.txt", 
    lang="shell", 
    title="", 
    h="100%", 
    w="100%")))

### Exercises 

  
- try repeating what we did with `add` with `imul`, `and`, `or`, `xor`: for each
  - create a new file
  - add targets to Makefile for it
  - use gdb to explore what the instruction does





## Ending / Exiting our Program/Process

- What happens if we run our programs outside of the debugger?
  - why does this happen?

In [None]:
display(showDT())

### How can we avoid this

1. TRAP: Use an instruction that tells the OS to 
    - stop the process and give control back to the debugger
    - if no debugger is running, just kill the process and signal the shell
    - the instruction we use for this:
        - Instruction: `int3`
        - Opcode: `0xCC` 
        - Description: `Interrupt 3 -- trap to debugger`
2. Call the OS kernel's exit process call
    - This is an example of making an OS kernel call to have the kernel do something for your process
    - We will look at this more, but for the moment here is what is necessary to call `exit`
       - pass a return value to the kernel 
       - exit/terminate the process

### Interrupt 3 `int3` -- trap to debugger

<img src="../images/int3mp.png">

In [None]:
display(Markdown(FileCodeBox(
    file="../src/int3.s", 
    lang="gas", 
    title="",
    h="100%", 
    w="100%"
)))

### Exit -- An OS service to terminate a process

Exiting your process and returning an exit value 
  - requires a call to the OS!

On Intel the instruction that makes this call is `syscall`.

<img src="../images/syscallmp.png">


### The OS System Calls

Each OS kernel provides a set of calls that a process can invoke using the `syscall` instruction on an Intel-based computer.

The Linux kernel supports a very large number of system calls. Each is identified by a unique number that must be placed in `rax` prior to executing the `syscall` instruction. Additional arguments are passed by setting other registers.

The table of calls can change from one kernel version to the next. Here is one site that provides a list:



In [None]:
display(IFrame("https://filippo.io/linux-syscall-table/", height=600, width="100%"))

- From the above we can see that the `exit` system call number is `60`
- reading the man pages `man syscall` and `man syscalls` we find that
  - we must place `60` in `rax`
  - and the value we want to return in `rdi`
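
The value placed in `rdi` is what the parent process (for example your shell, via `echo $?`) sees as the exit status. The Python sketch below illustrates that parent-side view only. It makes some assumptions: a Python child process stands in for our assembly program, the helper name `run_and_get_status` is made up, and `os._exit` is used because it terminates immediately with the given status (it is a library wrapper, so it is not guaranteed to issue system call number `60` itself).

```python
import subprocess
import sys

def run_and_get_status(status):
    """Start a child process that exits immediately with the given status."""
    child_code = f"import os; os._exit({status})"
    completed = subprocess.run([sys.executable, "-c", child_code])
    return completed.returncode

print(run_and_get_status(0))   # the conventional UNIX success value
print(run_and_get_status(42))  # a non-zero value conventionally signals failure
```

Once the assembly `exit` program is built, running it from the shell and then typing `echo $?` is the equivalent check.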

In [None]:
display(Markdown(FileCodeBox(
    file="../src/exit_bb_bb.s", 
    lang="gas", 
    title="",
    h="100%", 
    w="100%"
)))

We will revisit OS system calls in more detail later
- this is good enough for the moment


#### Avoiding Hard coding system call numbers

Operating system code usually provides files that you can include in your code so that you don't have to hardcode magic numbers like `60` for exit. On Linux you can add the line `#include <asm/unistd_64.h>` to get all the system call numbers. You can then use `__NR_exit` to mean the number of the exit system call.

e.g. `exitfancy.S`:
``` gas
#include <asm/unistd_64.h>
    .intel_syntax noprefix
    .text
    .global _start
_start:
    mov rax,__NR_exit # exit system call number
    mov rdi,0         # UNIX success value is 0
    syscall           # call OS. This will not return
```
    
But the assembler itself does not process `#include` lines.
We must first run the file through another tool called a preprocessor, e.g.
```
cc -E exitfancy.S > exitfancy.s
as -g exitfancy.s -o exitfancy.o
ld -g exitfancy.o -o exitfancy
```
In general we will skip this step and just use hardcoded numbers.
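
Conceptually, the preprocessing step is textual substitution: every name the header defines is replaced by its value before the assembler ever sees the file. The Python sketch below is a deliberately simplified model of that idea, not how `cc -E` is implemented. The `HEADER` string is a hypothetical two-line excerpt standing in for `<asm/unistd_64.h>`, and the function name `preprocess` is invented for this example.

```python
import re

# Hypothetical excerpt standing in for <asm/unistd_64.h>; the real header
# has one such line per system call.
HEADER = """
#define __NR_write 1
#define __NR_exit 60
"""

SOURCE = "    mov rax,__NR_exit # exit system call number"

def preprocess(header, source):
    """Replace each macro name defined in header with its value in source."""
    macros = dict(re.findall(r"^#define\s+(\w+)\s+(\S+)$", header, re.M))
    # Substitute whole identifiers only, leaving unknown names untouched
    return re.sub(r"\b\w+\b", lambda m: macros.get(m.group(0), m.group(0)), source)

print(preprocess(HEADER, SOURCE))
```

Comparing `exitfancy.S` with the generated `exitfancy.s` shows the real preprocessor making the same kind of replacement.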

## Exercises  and extra materials

- rewrite all the examples to use `int3` at the end 
- rewrite all the examples to call the OS `exit` system call
- combine some of the examples
- see what happens when you add `1` to `0xffffffffffffffff` using the `add` instruction.
   - any idea what is going on?

### Makefile for all the lecture examples

```make
popcnt: popcnt.o
        ld -g popcnt.o -o popcnt

popcnt.o: popcnt.s
        as -g popcnt.s -o popcnt.o

add: add.o
        ld -g add.o -o add

add.o: add.s
        as -g add.s -o add.o

exit: exit.o
        ld -g exit.o -o exit

exit.o: exit.s
        as -g exit.s -o exit.o

int3: int3.o
        ld -g int3.o -o int3

int3.o: int3.s
        as -g int3.s -o int3.o

exitfancy: exitfancy.o
        ld -g exitfancy.o -o exitfancy

exitfancy.o: exitfancy.s
        as -g exitfancy.s -o exitfancy.o

exitfancy.s: exitfancy.S
        cc -E exitfancy.S > exitfancy.s

clean:
        -rm -f $(wildcard *.o  popcnt  add int3 exit exitfancy exitfancy.s)
```


### Here is a fully documented fancy version of exit

- We use the preprocessor to include the OS system call numbers
- and we use the `.equ` directive of the assembler to make our code more readable


In [None]:
display(Markdown(
'''
A commented version that avoids "magic" numbers. 
'''    +
    
    FileCodeBox(
    file="../src/exit.S", 
    lang="gas", 
    title="",
    h="100%", 
    w="200%"
)))