How to Write a Hello World Program in Assembly Language

Published
4 min read

Now, let's look at the traditional "hello, world" program. I'm not a big fan of it, but it's basic and gives us our first output on the terminal, so it's fine here. Below is a simple C++ version. Some people don't like pulling the whole std namespace into the global scope with using namespace std, but it's okay here because it keeps a single-file example short.

#include <iostream>
using namespace std;

int main() {
    cout << "Hello, World!";
    return 0;
}

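Now, let's look at the equivalent "hello, world" in assembly language, written for x86-64 Linux in GNU assembler syntax. It's not exactly the same: the string here has no exclamation mark and ends with a newline, and the program exits with status 0 through a raw syscall instead of returning from main. Let's go through a line-by-line explanation of this code.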

.global _start

.section .data
msg:
  .ascii "Hello, World\n"
len = . - msg

.section .text
_start:
  mov $1, %rax
  mov $1, %rdi
  mov $msg, %rsi
  mov $len, %rdx
  syscall

  mov $60, %rax
  xor %rdi, %rdi
  syscall

I will skip the explanation for the exit code and the global start as well.

What we see are these two different parts in our program.

.section .data
.section .text

What is .section here and what is .data here?

They are called assembler directives. For now, just think of them as instructions to the assembler itself: it reads them and acts on them while building the program, but they are not turned into instructions for the CPU.

So, when we write this...

.section .data

we put all the data-related declarations below it. Likewise, the code that does the heavy lifting goes below this directive:

.section .text

So now, let's go over the .data section and understand it thoroughly.

.section .data
msg:
    .ascii "Hello, World\n"
len = . - msg

The line where we write msg: works a bit like declaring a variable. It's like an identifier, but in assembly terms we call it a label: a name for the address at that point in the program. You can name it anything you like.

msg:
    .ascii "Hello, World\n"

So, what are we declaring inside this msg label here?

We are declaring a string of ASCII bytes. Unlike a C string literal, .ascii does not add a terminating null byte. Null-terminated strings are available in the GNU assembler too, but you have to declare them with .asciz so the assembler knows to append the zero byte.

This means we now have some bytes of data stored in memory, which we can use again.

len = . - msg

What does this line signify here?

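This line defines a symbol called len that holds the length of the message stored at msg. We need this length for the write syscall. Here, the "." is the assembler's location counter, meaning the current address, which at this point sits right after the last byte of the string. Subtracting the address of msg from it gives the number of bytes in between, which is 13 for this string. Nothing is allocated here: len is a constant the assembler computes while building the program, which is why we can later use it as an immediate value ($len).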

mov $1, %rax
mov $1, %rdi
mov $msg, %rsi
mov $len, %rdx
syscall

This is a very basic write syscall. As I mentioned in the first blog post, it follows this format:

syscall (syscall number, fileDescriptor, buffer, length)

The text above provides a mental model for the write syscall, and here's how we can perform this syscall:

First, the syscall number for write is 1 on x86-64 Linux, so we move the value 1 into rax.

mov $1, %rax

Next comes the file descriptor. The first argument of a syscall goes in rdi, and file descriptor 1 is stdout, so we move 1 into rdi. This is what makes the text show up in the terminal.

mov $1, %rdi

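Now that this is done, rsi needs to hold the address of the buffer. The $ prefix makes $msg an immediate value, so this instruction moves the address of msg into rsi, not the bytes of the string themselves.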

mov $msg, %rsi

And the last step is to move the length value to rdx, then perform the syscall. This will successfully print "Hello, World."

mov $len, %rdx
syscall

This concludes the hello, world program.

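Note: If you don't include the exit syscall after the write syscall, the CPU does not stop on its own. It keeps executing whatever bytes happen to come after the last instruction, and the program usually crashes with a segmentation fault. It's important to tell the system that we have finished and are ready to exit the program.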