Assembly – Writing Hello World

Course notes from Vivek Ramachandran’s online class “x86/64 Assembly and Shellcoding on Linux”

This is a simple hello world application, written in assembly.  When run, it will output to the screen the words “Hello World.”  Below, I will go through the phases of how I constructed it, based on the course material linked at the top of this post.

Phase I: Sections

When starting with a simple Assembly language file, we first identify its sections.  Each section holds one kind of content.  For example, one section will hold the string “hello world” and another section will hold our code.

This file will have two sections:

  • text
  • data

Sections are defined using the section keyword, followed by the section name, which begins with a dot. For example:

section .text

The above defines the text section.  In Assembly the .text section refers to the actual flow of the logic – basically it’s where the code is kept.

The other section that will be used here is the .data section.  This is where we declare our initialized values.

At this point the file looks like this:

section .text

section .data

Phase II: Data & Refactor

Declaring our hello world will work like so:

section .data
          hello_world: db 'Hello World'

The above code declares the symbol hello_world and uses db (define byte) to set its value to the bytes of the string Hello World.

At this time we can also consider our code.  We have a text section with nothing in it.  We should create a label that marks where the code starts (later we will put all our code here), and we should make that label globally visible so it can serve as the entry point when the binary is executed.

Inside the text section add:

  _start:
           ; this is where the code will go

_start: is a label. The Assembly language instructions that print the words “Hello World” to the screen will follow it.  The ; starts a comment.

The application needs to know where to begin executing when it is run.  We declare this with a first directive that exports the above label as a global symbol, so the linker can use it as the entry point:

global _start

At this point our file looks like:

global _start

section .text
            _start:
                    ; put code here.

section .data
           hello_world: db 'Hello World'

Phase III: System Calls

A system call is a request to the operating system kernel to perform a task on our behalf.  We can use these calls so that we don’t have to write all the code required to print some text to the screen.

On my Linux distribution there’s a file called unistd_64.h which lists all the available system calls.  For me this is found in:

/usr/include/x86_64-linux-gnu/asm/unistd_64.h

[Screenshot: excerpt of unistd_64.h listing system calls and their numbers]

Above is an example of the unistd_64.h file.  You can see each system call and its numeric value.  Write has a value of 1, Open a value of 2, etc.

In Assembly we invoke a system call using the syscall instruction.

For our little hello world app, we have to Write to screen and Exit the program afterwards.  To do this we can leverage system calls.  Write has a syscall value of 1 and exit has a syscall value of 60 (both found within the unistd_64.h file).  There is one issue with using Write.  We need to know the parameters it takes.

Invocation of SysCalls

  1. SYSCALL # must be moved into the RAX register
  2. First argument for the SYSCALL is in the RDI register
  3. 2nd argument (if applicable) goes to the RSI register
  4. 3rd argument (if applicable) goes to the RDX register
  5. 4th argument (if applicable) goes to the R10 register
  6. 5th argument (if applicable) goes to the R8 register
  7. 6th argument (if applicable) goes to the R9 register

At this point we need to know what the write function takes.  Then we need to supply those values via Assembly.

According to the man page on write(2), the write function takes three parameters:

  • File descriptor (where we write to, screen, disk, etc.)
  • Buffer (what we’re going to send)
  • size_t (length of what we’re sending)

File Descriptor 

For the FD, we want to make use of STDOUT.  On Linux, stdout is file descriptor 1 (stdin is 0 and stderr is 2).

Buffer

For the value we are sending in to the buffer, we are going to use the label we created in the data section (hello_world).

Size

Finally for the size, we want to capture the length of the value held in the hello_world label.

The length is a trick learned from Vivek Ramachandran’s course. We need to pass in the length of our string and Vivek shows that we can define a symbol as follows:

length: equ $-hello_world

The code above says: length is equal to the current assembly position ($) minus the address of hello_world.

We put that in the data section, directly after the hello_world line, so that $ points just past the end of the string.
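As a minimal sketch of how the two definitions sit together (the comments and the byte count are my own annotations, not part of the course material), the data section would look roughly like this:

```nasm
section .data
        hello_world: db 'Hello World'     ; 11 bytes: 'Hello' + space + 'World'
        length: equ $-hello_world         ; $ is the position right after the string,
                                          ; so length evaluates to 11 at assembly time
```

The placement matters: if another db line were added between the two, $ would have moved further along and length would count those extra bytes as well.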

This block of commands would look like:

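mov rax, 1            ; syscall number for write
mov rdi, 1            ; first argument: file descriptor 1 (stdout)
mov rsi, hello_world  ; second argument: address of the buffer
mov rdx, length       ; third argument: number of bytes to write
syscall

The syscall instruction follows the register setup and triggers the call.  The length symbol is the one we defined in the data section.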

Great, now we need to invoke the exit system call.  We found the syscall number for exit to be 60.  It takes one parameter (the exit status it will return, which is typically 0).

mov rax, 60
mov rdi, 0
syscall

Now the full script looks like this:

[Screenshot: the complete program]
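Putting the fragments from the earlier phases together, the complete program can be sketched as follows. This is a reconstruction from the pieces in this post rather than a copy of the original listing. The file name hello.asm and the output names in the build comments are assumptions chosen for illustration.

```nasm
; hello.asm (hypothetical file name)
; Assemble, link and run (names are placeholders):
;   nasm -f elf64 hello.asm -o hello.o
;   ld hello.o -o hello
;   ./hello

global _start

section .text
_start:
        ; write(1, hello_world, length)
        mov rax, 1              ; syscall number for write
        mov rdi, 1              ; file descriptor 1 (stdout)
        mov rsi, hello_world    ; address of the buffer
        mov rdx, length         ; number of bytes to write
        syscall

        ; exit(0)
        mov rax, 60             ; syscall number for exit
        mov rdi, 0              ; exit status
        syscall

section .data
        hello_world: db 'Hello World'
        length: equ $-hello_world
```

Because the string has no trailing newline, the shell prompt will appear on the same line as the output. Appending a newline byte to the db line (for example `, 0x0a`) is an optional tweak that is not part of the version built in this post.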

Phase IV: Assembling & Linking

To assemble our file into an object file and then link that object file into an executable, we run:

nasm -f elf64 [filename] -o [outputfile.o]

ld [outputfile.o] -o [final file]

We can now run the final output file.
