Zero-extension refers to the process of expanding a binary number by adding zeros to the higher-order bits. In this post, we will explore how zero-extension of 32-bit results works in the x86-64 architecture.

How does zero-extension in x86-64 work?

In general, byte and word operands are stored in the low 8 or 16 bits of GPRs without modifying their high 56 or 48 bits, respectively. Doubleword operands, however, are normally stored in the low 32 bits of GPRs and zero-extended to 64 bits.

From AMD64 Architecture Programmer’s Manual Volume 1: Application Programming
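The rule quoted above can be modelled in portable C. This is only a minimal sketch: the helper names (`gpr_write8`, `gpr_write16`, `gpr_write32`) are made up for illustration, and the 8-bit case covers only the low byte (such as AL), not the legacy high-byte registers.

```c
#include <stdint.h>

/* Hypothetical helpers that mimic how a 64-bit GPR changes when a
   narrower result is written to it in 64-bit mode. */

/* 8-bit write (e.g. AL): the upper 56 bits are preserved. */
uint64_t gpr_write8(uint64_t reg, uint8_t value) {
    return (reg & ~UINT64_C(0xFF)) | value;
}

/* 16-bit write (e.g. AX): the upper 48 bits are preserved. */
uint64_t gpr_write16(uint64_t reg, uint16_t value) {
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* 32-bit write (e.g. EAX): the result is zero-extended to 64 bits,
   so the old upper 32 bits are discarded. */
uint64_t gpr_write32(uint64_t reg, uint32_t value) {
    (void)reg; /* the previous register contents do not matter */
    return (uint64_t)value;
}
```

Starting from 0x1122334455667788, an 8-bit write of 0xAA gives 0x11223344556677AA, while a 32-bit write of 0xCCCCCCCC gives 0x00000000CCCCCCCC.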

For example, if the value in register RAX is 0x0001000100010001, what will be the value of RAX after the following instruction is executed?

add eax, eax

The answer is 0x0000000000020002. The addition operates on EAX (0x00010001) and produces 0x00020002; because the destination is a 32-bit register, the upper 32 bits of RAX are automatically zero-extended (cleared).
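To double-check the arithmetic, the effect of this instruction on the full register can be sketched in C. The function name `add_eax_eax` is hypothetical; it only models the register value and ignores the flags the real instruction would set.

```c
#include <stdint.h>

/* Hypothetical model of `add eax, eax` applied to a full RAX value. */
uint64_t add_eax_eax(uint64_t rax) {
    uint32_t eax = (uint32_t)rax; /* only the low 32 bits take part */
    uint32_t sum = eax + eax;     /* 32-bit addition, wraps modulo 2^32 */
    return (uint64_t)sum;         /* the result is zero-extended into RAX */
}
```

Calling `add_eax_eax(0x0001000100010001)` returns 0x0000000000020002, matching the result given for the example.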

Tips and Tricks

The zero idiom in IA-32 uses the XOR instruction instead of MOV to set a register to 0, because the XOR form has a shorter encoding: xor eax, eax takes 2 bytes, while mov eax, 0 takes 5.

For example, the following instruction is preferred when clearing register EAX:

xor eax, eax

How about clearing register RAX in the 64-bit mode of x86-64? Let's experiment with the following C code hello.c:

#include <stdio.h>

int main() {
    printf("hello, world\n");
    return 0;
}

Use gcc to generate the O2-optimized assembly code:

gcc hello.c -O2 -S -o hello.S

The contents of hello.S are as follows:

        .file   "hello.c"
        .text
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "hello, world"
        .section        .text.startup,"ax",@progbits
        .p2align 4
        .globl  main
        .type   main, @function
main:
.LFB11:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        leaq    .LC0(%rip), %rdi
        call    puts@PLT
        xorl    %eax, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE11:
        .size   main, .-main
        .ident  "GCC: (GNU) 15.2.1 20251112"
        .section        .note.GNU-stack,"",@progbits

According to the x86-64 calling convention, integer return values are stored in RAX. Therefore, the instruction that sets the return value to 0 is:

xorl %eax, %eax

As mentioned earlier, a 32-bit result written to a GPR is automatically zero-extended to 64 bits. Therefore, clearing EAX implicitly sets the upper 32 bits of RAX to 0 as well.

But you may wonder: why does the compiler generate xor eax, eax (written xorl %eax, %eax in the AT&T syntax of the listing) rather than xor rax, rax? The reason lies in the difference in encoding length:

Instruction	Encoding (hex bytes)
xor eax, eax	31 c0
xor rax, rax	48 31 c0

As you can see, the former instruction is one byte shorter.
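The size difference can be made concrete by writing the encodings out as byte arrays. This is just an illustrative sketch: the two XOR encodings are the ones listed in the table, the `mov eax, 0` bytes are the standard B8-plus-imm32 form, and the array and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Machine-code bytes for three ways of clearing the return register. */
const uint8_t XOR_EAX_EAX[] = {0x31, 0xC0};                   /* xor eax, eax */
const uint8_t XOR_RAX_RAX[] = {0x48, 0x31, 0xC0};             /* xor rax, rax */
const uint8_t MOV_EAX_0[]   = {0xB8, 0x00, 0x00, 0x00, 0x00}; /* mov eax, 0   */

/* Bytes saved by the 32-bit XOR compared with the 64-bit XOR. */
size_t bytes_saved_vs_xor64(void) {
    return sizeof XOR_RAX_RAX - sizeof XOR_EAX_EAX;
}

/* Bytes saved by the 32-bit XOR compared with the MOV form. */
size_t bytes_saved_vs_mov32(void) {
    return sizeof MOV_EAX_0 - sizeof XOR_EAX_EAX;
}
```

All three sequences leave RAX equal to 0, so the compiler is free to pick the shortest one.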

The leading 0x48 byte is a REX prefix, specifically REX.W, which selects the 64-bit operand size. According to the AMD64 manual:

For most instructions, the default operand size in 64-bit mode is 32 bits. To access 16-bit operand sizes, an instruction must contain an operand-size prefix (66h), as described in Section 3.2.3, “Operand Sizes and Overrides,” on page 41. To access the full 64-bit operand size, most instructions must contain a REX prefix.

From AMD64 Architecture Programmer’s Manual Volume 1: Application Programming
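To see why the prefix byte is 0x48, it helps to decode it. A REX prefix has the bit layout 0100WRXB, where W selects the 64-bit operand size and R, X and B extend the register fields. The sketch below assumes 64-bit mode, where bytes 0x40 through 0x4F act as REX prefixes; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four REX flag bits. */
typedef struct {
    bool w; /* 1 = 64-bit operand size */
    bool r; /* extends the ModRM reg field */
    bool x; /* extends the SIB index field */
    bool b; /* extends the ModRM r/m, SIB base, or opcode reg field */
} rex_fields;

/* In 64-bit mode, a byte whose high nibble is 0100 is a REX prefix. */
bool is_rex_prefix(uint8_t byte) {
    return (byte & 0xF0) == 0x40;
}

/* Split the low nibble of a REX byte into its W, R, X and B bits. */
rex_fields decode_rex(uint8_t byte) {
    rex_fields f;
    f.w = (byte & 0x08) != 0;
    f.r = (byte & 0x04) != 0;
    f.x = (byte & 0x02) != 0;
    f.b = (byte & 0x01) != 0;
    return f;
}
```

For 0x48 (binary 0100 1000), only W is set, which is why xor rax, rax differs from xor eax, eax by exactly this one byte.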
