This is a simple hello world application, written in assembly. When run, it will output to the screen the words “Hello World.” Below, I will go through the phases of how I constructed it, based on the course material linked at the top of this post.
Phase I: Sections
When starting a simple assembly language file, we first identify its sections. Each section determines what kind of content belongs in it. For example, one section will hold the string “Hello World” and another will hold our code.
This file will have two sections: .text and .data.
Sections are defined using the section keyword, followed by the section name, which starts with a dot. For example:
section .text
The above defines the text section. In assembly, the .text section holds the program's instructions; basically, it’s where the code is kept.
The other section used here is the .data section, which is where we declare our initialized values.
At this point the file looks like this:
section .text
section .data
Phase II: Data & Refactor
Declaring our hello world will work like so:
hello_world: db 'Hello World'
The above code declares the symbol hello_world and uses db (define byte) to set its value to the string Hello World. Note that the assembler expects plain straight quotes around the string.
At this point we can also consider our code. The text section is still empty. We should create a label that marks where execution starts (later we will put all our code here) and declare it global so that it is used as the entry point when the binary is executed.
Inside the text section add:
_start:
    ; this is where the code will go
_start: is a label that marks the beginning of our procedure. It will contain all the assembly instructions needed to print the words “Hello World” to the screen. The ; begins a comment.
The application needs to be told where to start (that is, what code to run when it’s launched). We do this with a directive at the top of the file that makes the above label global, so that the linker can find it:
global _start
At this point our file looks like:
global _start
section .text
_start:
    ; put code here.
section .data
hello_world: db 'Hello World'
Phase III: System Calls
A system call is a request for the operating system kernel to perform a task on our behalf. We can use these so that we don’t have to write all the code required to print text to the screen ourselves.
On my Linux distribution there’s a file called unistd_64.h which lists all the available system calls. For me this is found in:
Above is an example of the unistd_64.h file. You can see each system call and its numeric value. Write has a value of 1, open a value of 2, etc.
In assembly we invoke a system call using the syscall instruction.
For our little hello world app, we have to write to the screen and exit the program afterwards. To do this we can leverage system calls. Write has a syscall value of 1 and exit has a syscall value of 60 (both found within the unistd_64.h file). There is one issue with using write: we need to know the parameters it takes.
Invocation of SysCalls
- SYSCALL number must be moved into the RAX register
- 1st argument goes to the RDI register
- 2nd argument (if applicable) goes to the RSI register
- 3rd argument (if applicable) goes to the RDX register
- 4th argument (if applicable) goes to the R10 register
- 5th argument (if applicable) goes to the R8 register
- 6th argument (if applicable) goes to the R9 register
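As a minimal sketch of this convention, the smallest useful program needs only RAX and RDI: it makes the exit system call (number 60, as noted above) with a single argument. The exit status of 42 is an arbitrary value chosen for illustration and does not come from the course material.

```nasm
; Sketch: the syscall convention with one argument.
; The status value 42 is an arbitrary, illustrative choice.
global _start

section .text
_start:
    mov rax, 60     ; system call number (60 = exit)
    mov rdi, 42     ; 1st argument: the exit status
    syscall         ; hand control to the kernel
```

Once this is assembled and linked (see Phase IV), running the binary and then typing echo $? in the shell should show 42. For system calls that return, the kernel places the result back in RAX.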
At this point we need to know what the write function takes. Then we need to supply those values via Assembly.
According to the man page on write(2), the write function takes three parameters:
- File descriptor (where we write to, screen, disk, etc.)
- Buffer (what we’re going to send)
- size_t (length of what we’re sending)
For the file descriptor, we want to use STDOUT, which has a value of 1.
For the buffer, we are going to pass the label we created in the data section (hello_world).
Finally, for the size, we want the length of the value held at the hello_world label.
The length is a trick learned from Vivek Ramachandran’s course. We need to pass in the length of our string, and Vivek shows that we can define a label as follows:
length: equ $-hello_world
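The code above says: length is equal to the current position ($) minus the location of hello_world. Because this line comes directly after the string, the difference is the string's length in bytes.
We put that in the data section, directly after hello_world.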
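A short sketch of the resulting data section may help here. It uses the same two labels as above; the comments and the note about ordering are my own additions rather than part of the course material.

```nasm
section .data
    hello_world: db 'Hello World'     ; 11 bytes
    length: equ $-hello_world         ; current position minus start of string = 11

    ; Caution: if other data were declared between hello_world and
    ; length, those bytes would be counted as well, so the equ line
    ; should stay immediately after the string it measures.
```

Since equ defines an assemble-time constant, length takes up no space in the binary; the assembler substitutes the number wherever the name is used.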
This block of commands would look like:
mov rax, 1            ; syscall number for write
mov rdi, 1            ; file descriptor: stdout
mov rsi, hello_world  ; address of the buffer
mov rdx, length       ; number of bytes to write
The syscall instruction follows the block. In the data section we have the length defined.
Great, now we need to make the exit system call. We found the syscall number for exit to be 60. It takes one parameter: the exit status the program returns, which is typically 0.
mov rax, 60  ; syscall number for exit
mov rdi, 0   ; exit status
syscall
Now the full script looks like this:
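Putting the fragments from the phases above together, the complete program can be sketched as follows. The file name hello.asm in the first comment is an assumption; the instructions and labels are the ones introduced earlier.

```nasm
; hello.asm (the file name is an assumption)
global _start                   ; export the entry point for the linker

section .text
_start:
    ; write(1, hello_world, length)
    mov rax, 1                  ; syscall number for write
    mov rdi, 1                  ; file descriptor: stdout
    mov rsi, hello_world        ; address of the buffer
    mov rdx, length             ; number of bytes to write
    syscall

    ; exit(0)
    mov rax, 60                 ; syscall number for exit
    mov rdi, 0                  ; exit status
    syscall

section .data
    hello_world: db 'Hello World'
    length: equ $-hello_world   ; length of the string in bytes
```

Since the string contains no newline character, the shell prompt will appear directly after the printed text.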
Phase IV: Assembling & Linking
To assemble our file we run:
nasm -f elf64 [filename] -o [outputfile.o]
To link the resulting object file into an executable we run:
ld [outputfile.o] -o [final file]
We can now run the final output file.