In assembly language, strings are sequences of characters stored in memory. Unlike high-level languages, assembly doesn't have built-in string data types. Instead, strings are represented as arrays of bytes or words.
Strings in assembly are typically declared using directives like DB (Define Byte) or DW (Define Word). The string is usually terminated with a null byte (0) to mark its end.
message DB 'Hello, World!', 0
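When the string is a constant, its length can also be computed at assembly time instead of by scanning for the terminator. The line below is a minimal sketch assuming MASM/TASM-style syntax; the name message_len is hypothetical and does not appear in the original example.

```asm
; Must directly follow the declaration of message, so that $ (the
; current location counter) points just past the null terminator.
message_len EQU $ - message - 1 ; 13: length without the terminator
```

Because EQU defines an assembly-time constant, message_len occupies no memory and can be used as an immediate operand, for example mov cx, message_len.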
Assembly doesn't provide high-level string manipulation functions. Programmers must implement their own routines or use system calls for string operations. Common tasks include calculating a string's length:
; Calculate string length
mov si, offset message ; Load string address
xor cx, cx ; Initialize counter to 0
cld ; Clear direction flag so LODSB advances SI
count_loop:
lodsb ; Load byte at DS:SI into AL, then increment SI
test al, al ; Check if it's null terminator
jz done ; If zero, we're done
inc cx ; Increment counter
jmp count_loop ; Continue loop
done:
; CX now contains the string length
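Copying a string follows the same load, test, and loop pattern as the length calculation above. The following is a sketch rather than a definitive routine: it assumes 16-bit real mode, that DS and ES both address the data segment, and that a destination named buffer (a hypothetical label) has been declared with enough room, for example buffer DB 32 DUP (?).

```asm
; Copy the null-terminated string at message into buffer.
        cld                      ; String instructions move forward
        mov si, offset message   ; DS:SI = source
        mov di, offset buffer    ; ES:DI = destination (hypothetical label)
copy_loop:
        lodsb                    ; AL = byte at DS:SI, SI = SI + 1
        stosb                    ; Byte at ES:DI = AL, DI = DI + 1
        test al, al              ; Was that the null terminator?
        jnz copy_loop            ; No: copy the next byte
; buffer now holds a copy of the string, terminator included
```

The terminator is tested after it is stored, so the copy is itself null-terminated. The routine does no bounds checking; the caller is responsible for making the destination large enough.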
For string I/O, assembly programmers often rely on system calls or BIOS interrupts. These vary depending on the operating system and architecture. For example, in DOS, INT 21h provides functions for string input and output.
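; AH=09h prints up to a '$' terminator, not a null byte, so declare e.g.:
;   dos_message DB 'Hello, World!', '$'
mov ah, 09h ; DOS function: print '$'-terminated string
mov dx, offset dos_message ; DS:DX = string address
int 21h ; Call DOS interrupt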
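For input, DOS offers a buffered keyboard read through function 0Ah of INT 21h. The sketch below assumes MASM/TASM-style syntax and a data segment already loaded into DS; the label input_buf and the buffer size of 20 are illustrative choices, not part of the original text.

```asm
; Data: DOS buffered-input structure (input_buf is a hypothetical name)
input_buf DB 20            ; Byte 0: buffer size, including the final CR
          DB ?             ; Byte 1: DOS stores the count of characters read
          DB 20 DUP (?)    ; Bytes 2 onward: the characters, then CR (0Dh)

; Code
        mov ah, 0Ah              ; DOS function: buffered keyboard input
        mov dx, offset input_buf ; DS:DX = address of the buffer structure
        int 21h                  ; Returns after the user presses Enter
        xor bh, bh
        mov bl, [input_buf + 1]  ; BX = number of characters typed (CR excluded)
        mov byte ptr [input_buf + bx + 2], 0 ; Replace the CR with a null terminator
```

Overwriting the trailing carriage return with 0 turns the input into a null-terminated string starting at input_buf + 2, so it can be passed to routines such as the length loop shown earlier.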
Working with strings in assembly requires a deep understanding of memory management and low-level operations. While challenging, it offers fine-grained control over text processing. For more complex operations, consider leveraging system calls or creating reusable subroutines.
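As one way to make such code reusable, the length calculation can be wrapped in a procedure with a documented register interface. This is a sketch under assumptions: the name strlen, the choice of SI for input and CX for output, and the MASM/TASM PROC syntax are illustrative conventions, not requirements.

```asm
; strlen (hypothetical name): length of a null-terminated string
; In:  DS:SI = address of the string
; Out: CX    = number of bytes before the terminator
; Clobbers AL and the flags; SI is preserved.
strlen PROC
        push si          ; Save the caller's pointer
        cld              ; Make LODSB advance SI
        xor cx, cx       ; Length = 0
strlen_next:
        lodsb            ; AL = next byte, SI = SI + 1
        test al, al      ; Terminator?
        jz strlen_done   ; Yes: stop counting
        inc cx           ; No: count the byte
        jmp strlen_next
strlen_done:
        pop si           ; Restore the caller's pointer
        ret
strlen ENDP

; Usage
        mov si, offset message
        call strlen      ; CX = 13 for 'Hello, World!'
```

Documenting which registers are inputs, outputs, and clobbered is the main design point: the caller can then use the routine without reading its body.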