Trouble opening and reading a file in ARM64 assembly on an Apple M1 Mac

.section __DATA,__data
    .p2align 2
    buffer:
        .zero 4096

.section __TEXT,__text
.global _main
.build_version macos, 13, 0


.p2align 2
_main:
    // x9: buf ptr
    // x10: file descriptor storage
    // x11: file size in bytes

    //init ptr to buf
    adrp    x9, buffer@PAGE              
    add     x9, x9, buffer@PAGEOFF

    // open file
    adr     x0, file_path                      
    mov     x1, #0
    mov     x2, #444
    mov     x16, #5
    svc     0

    // copy file descriptor to x10
    mov     x10, x0                    

.p2align 2
stream_buffer:
    // make syscall read, file descriptor is in x10                                   
    mov     x0, x10
    mov     x1, x9                      
    mov     x2, #4096
    mov     x16, #3
    svc     0

    // if x0 == 0, exit, no bytes were read
    cmp     x0, #0
    beq     exit
    blt     error

    // store number of bytes read
    mov     x11, x0

    // write to stdout from buffer
    mov     x0, #1
    mov     x1, x9
    mov     x2, x11        
    mov     x16, #4
    svc     0

    b       stream_buffer

.p2align 2
exit:
    // exit with status code 0
    mov     x0, #0                      
    mov     x16, #1
    svc     0

.p2align 2
error:
    mov     x0, #1
    adr     x1, file_not_found_error_string
    mov     x2, #20
    mov     x16, #4
    svc     0
    b       exit

.p2align 2
file_path:
    .asciz "/test.txt"

.p2align 2
file_not_found_error_string:
    .asciz "file was not found.\n"


I am trying to learn assembly by writing a simple program that models the Linux ‘cat’ command. I am on a 2020 MacBook Air with an M1 chip. My program assembles and links fine, but when I run the binary, it waits for input and then echoes back whatever I type. I believe that I am misusing my file descriptors. Any help appreciated.

  • Tracing system calls using dtrace should be helpful to see what system calls your program actually makes, decoding their args and (error) return values.


Oh this is hilarious.

The fact that your code ends up reading from stdin is the culmination of bugs in your code, paired with some unexpected OS behaviour.

Let’s look at this from a high-level perspective first:

  1. You open /test.txt for reading.
  2. You read up to 4096 bytes from it.
  3. You write those bytes to stdout.

But you’re on arm64 macOS, which means that unless you’ve gone to great lengths to mess with the OS, the system volume is read-only and /test.txt does not exist.

So your open syscall is failing, but you don’t detect that because you don’t do error checking there. Bad!
Now, you might assume x0 to be -1 in that case because that’s what open() returns if called from C, but that’s not the syscall ABI. If you open /usr/lib/system/libsystem_kernel.dylib in a disassembler and seek to ___open, you’ll see this:

;-- ___open:
;-- func.00002308:
0x00002308      b00080d2       mov x16, 5
0x0000230c      011000d4       svc 0x80
0x00002310      03010054       b.lo 0x2330
0x00002314      7f2303d5       pacibsp
0x00002318      fd7bbfa9       stp x29, x30, [sp, -0x10]!
0x0000231c      fd030091       mov x29, sp
0x00002320      8a030094       bl sym._cerror
0x00002324      bf030091       mov sp, x29
0x00002328      fd7bc1a8       ldp x29, x30, [sp], 0x10
0x0000232c      ff0f5fd6       retab
0x00002330      c0035fd6       ret

The key part here is the b.lo. Syscalls use the carry flag (the “C” in NZCV) to signal whether there was an error or not. This means that:

  • b.lo (carry clear) -> the call succeeded and x0 holds a file descriptor
  • b.hs (carry set) -> the call failed and x0 holds an errno value

So your syscall fails and returns an error value in x0, specifically ENOENT, since it can’t find the file you asked for. ENOENT happens to be 2, so when you pass that error value to your next syscall, you end up reading from file descriptor 2, which is stderr. And because you invoked your binary from a terminal, file descriptors 0, 1 and 2 all refer to the same terminal device, opened for both reading and writing, so reading from stderr in this case behaves like reading from stdin.

So how do you fix this? Put a b.hs error after the first svc.
And then pick a file path that actually exists.
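
As a rough sketch, the open sequence with that check added could look like the fragment below. It reuses the labels and registers from the question (file_path, error, x10). Setting the mode argument to 0 is an assumption on my part: it should be ignored when opening read-only without O_CREAT.

```
    // open file read-only
    adr     x0, file_path
    mov     x1, #0              // flags: O_RDONLY
    mov     x2, #0              // mode: assumed to be ignored without O_CREAT
    mov     x16, #5             // syscall number 5 = open
    svc     0
    b.hs    error               // carry set: x0 holds an errno value, not a descriptor

    // carry clear: x0 holds a valid file descriptor
    mov     x10, x0
```

The `cmp x0, #0` / `blt error` pair after the read most likely never fires for the same reason, since a failing read would also report a positive errno with the carry flag set. A `b.hs error` placed directly after that `svc`, before the `cmp` overwrites the flags, would probably be needed there as well.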
