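I'm working on a class project that requires me to read a file line by line. Our end goal is to write a MIPS assembler in MIPS assembly, but the problem relevant to this post is that I can't read a file one line at a time. So far, this is the mechanism I've written for reading from a file: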
.data
file_loc: .asciiz "test.asm" #note: when launching from commandline, test.asm should be within the same folder as Mars.jar
buffer: .space 1024 #buffer of 1024
new_line: .asciiz "\n" #where would I actually use this?
#error strings
readErrorMsg: .asciiz "\nError in reading file\n"
openErrorMsg: .asciiz "\nError in opening file\n"
.text
main:
jal openFile
j endProgram
openFile:
#Open file for reading
li $v0, 13 #syscall 13 - open file
la $a0, file_loc #passing in file name
li $a1, 0 #set to read mode
li $a2, 0 #mode is ignored
syscall
bltz $v0, openError #if $v0 is less than 0, there is an error found
move $s0, $v0 #else save the file descriptor
#Read input from file
li $v0, 14 #syscall 14 - read file
move $a0, $s0 #sets $a0 to file descriptor
la $a1, buffer #stores read info into buffer
li $a2, 1024 #hardcoded size of buffer
syscall
bltz $v0, readError #if error it will go to read error
li $v0, 4
la $a0, buffer
syscall
#Close the file
li $v0, 16 # system call for close file
move $a0, $s0 # file descriptor to close
syscall # close file
jr $ra
openError:
la $a0, openErrorMsg
li $v0, 4
syscall
j endProgram
readError:
la $a0, readErrorMsg
li $v0, 4
syscall
j endProgram
endProgram:
li $v0, 10
syscall
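The problem is that the read pulls in as many bytes as the buffer can hold (1024), not a single line.
For example, reading a file named test.asm that contains the following data: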
test abc abc abc
test2 1231 123 123
produces the following output:
test abc abc abc
test2 1231 123 123
I would like to read one line at a time:
test abc abc abc
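I know that shrinking the buffer would help limit how much is read, but it could cause problems with long input lines. I'm hoping someone knows how to read a certain amount from the buffer and then split it at the newline indicator (I'm pretty sure it's "\n").
Any help or tips would be greatly appreciated. Thanks!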
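You can certainly read in large chunks as you're doing. That's efficient in terms of minimizing system calls, but the code is a bit fiddly to write. You need a loop that walks each chunk to locate any newlines, and a buffer to hold incomplete lines that take multiple reads to build.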
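As a rough illustration of that chunked approach, here is a minimal sketch, not a definitive implementation. It assumes the MARS/SPIM file syscalls (13, 14, 16) and no delayed branches. The labels (`fname`, `chunk`, `line`, `print_line`), the file name, and the 1024-byte limits are illustrative choices, and bytes beyond 1024 on a single line are simply dropped.

```mips
# Sketch: read in 1024-byte chunks, carry partial lines across reads.
        .data
fname:  .asciiz "test.asm"   # assumed input file name
chunk:  .space 1024          # raw bytes from each read
line:   .space 1025          # current line plus null terminator
        .text
        .globl main
main:
        li    $v0, 13        # syscall 13 - open file
        la    $a0, fname
        li    $a1, 0         # read-only
        li    $a2, 0         # mode is ignored
        syscall
        bltz  $v0, exit      # open failed
        move  $s0, $v0       # $s0 = file descriptor
        li    $s3, 0         # $s3 = length of the partial line
next_chunk:
        li    $v0, 14        # syscall 14 - read up to 1024 bytes
        move  $a0, $s0
        la    $a1, chunk
        li    $a2, 1024
        syscall
        blez  $v0, flush     # EOF or error: emit any pending line
        la    $s1, chunk     # $s1 = cursor into the chunk
        addu  $s2, $s1, $v0  # $s2 = one past the last byte read
scan:
        beq   $s1, $s2, next_chunk  # chunk exhausted, keep partial line
        lbu   $t0, 0($s1)    # current byte
        addiu $s1, $s1, 1
        li    $t1, 10        # ASCII newline
        beq   $t0, $t1, emit # newline completes the line
        slti  $t2, $s3, 1024
        beqz  $t2, scan      # line buffer full: drop the extra byte
        la    $t3, line
        addu  $t3, $t3, $s3
        sb    $t0, 0($t3)    # append byte to the line
        addiu $s3, $s3, 1
        j     scan
emit:
        jal   print_line
        j     scan
flush:
        beqz  $s3, close     # nothing pending
        jal   print_line     # last line had no trailing newline
close:
        li    $v0, 16        # syscall 16 - close file
        move  $a0, $s0
        syscall
exit:
        li    $v0, 10        # syscall 10 - exit
        syscall

# print_line: null-terminate the line, print it plus a newline, reset length
print_line:
        la    $t3, line
        addu  $t3, $t3, $s3
        sb    $zero, 0($t3)
        la    $a0, line
        li    $v0, 4         # syscall 4 - print string
        syscall
        li    $a0, 10
        li    $v0, 11        # syscall 11 - print character
        syscall
        li    $s3, 0
        jr    $ra
```

The key point is that `$s3` survives across reads, so a line split between two chunks is stitched together before it is printed. In a real assembler, `print_line` would be replaced by whatever consumes the line.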
A lazier approach that's easier to code is to read the file byte by byte, which makes newline detection trivial:
# if current byte is a newline, consume line
lb $s4 ($s1) # load the current byte from the buffer
li $t0 10 # ASCII newline
beq $s4 $t0 consume_line
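You can use sbrk (syscall 9) to allocate more memory for the buffer to handle arbitrarily long lines, but for this example I'll assume lines are never longer than 1024 bytes. To keep the example minimal, I've left better error handling, the line-consumption code, and modularization as exercises.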
You can save the program below as read_file_lines.s and run it with spim -f read_file_lines.s; it will print its own source code.
.data
fin: .asciiz "read_file_lines.s"
buffer: .space 1
line: .space 1024
.globl main
.text
main:
la $s1 buffer
la $s2 line
li $s3 0 # current line length
# open file
li $v0 13 # syscall for open file
la $a0 fin # input file name
li $a1 0 # read flag
li $a2 0 # ignore mode
syscall # open file
move $s0 $v0 # save the file descriptor
read_loop:
# read byte from file
li $v0 14 # syscall for read file
move $a0 $s0 # file descriptor
move $a1 $s1 # address of dest buffer
li $a2 1 # buffer length
syscall # read byte from file
# keep reading until bytes read <= 0
blez $v0 read_done
# naively handle exceeding line size by exiting
slti $t0 $s3 1024
beqz $t0 read_done
# if current byte is a newline, consume line
lb $s4 ($s1)
li $t0 10
beq $s4 $t0 consume_line
# otherwise, append byte to line
add $s5 $s3 $s2
sb $s4 ($s5)
# increment line length
addi $s3 $s3 1
b read_loop
consume_line:
# null terminate line
add $s5 $s3 $s2
sb $zero ($s5)
# reset bytes read
li $s3 0
# print line (or consume it some other way)
move $a0 $s2
li $v0 4
syscall
# print newline
li $a0 10
li $v0 11
syscall
b read_loop
read_done:
# note: a final line with no trailing newline is still in line here and is never printed
# close file
li $v0 16 # syscall for close file
move $a0 $s0 # file descriptor to close
syscall # close file
# exit the program
li $v0 10
syscall
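The sbrk idea mentioned earlier could be sketched roughly as follows. This assumes the MARS/SPIM behavior of syscall 9 (`$a0` = bytes requested, `$v0` = address of the new block). The `grow_line` routine, the use of `$s6` as a capacity register, and the tiny initial size are hypothetical choices for illustration. `$s2` and `$s3` play the same roles as in the program above (line address and line length).

```mips
# Sketch: heap-allocated line buffer that doubles when it fills up.
        .text
        .globl main
main:
        li    $s6, 8         # initial capacity (deliberately small)
        move  $a0, $s6
        li    $v0, 9         # syscall 9 - sbrk
        syscall
        move  $s2, $v0       # $s2 = address of the line buffer
        li    $t0, 104       # 'h', stand-in for bytes read from a file
        sb    $t0, 0($s2)
        li    $t0, 105       # 'i'
        sb    $t0, 1($s2)
        li    $s3, 2         # $s3 = bytes used
        jal   grow_line      # a real loop would call this when $s3 == $s6
        addu  $t0, $s2, $s3
        sb    $zero, 0($t0)  # null-terminate in the new buffer
        move  $a0, $s2
        li    $v0, 4         # syscall 4 - print string
        syscall
        li    $v0, 10        # syscall 10 - exit
        syscall

# grow_line: allocate a block twice as large and copy the used bytes over.
# In:  $s2 = buffer address, $s3 = bytes used, $s6 = capacity
# Out: $s2 = new buffer address, $s6 = new capacity
grow_line:
        sll   $a0, $s6, 1    # request twice the current capacity
        li    $v0, 9
        syscall              # $v0 = address of the new block
        move  $t0, $s2       # source cursor
        move  $t1, $v0       # destination cursor
        addu  $t2, $s2, $s3  # end of the used bytes
copy:
        beq   $t0, $t2, copied
        lbu   $t3, 0($t0)
        sb    $t3, 0($t1)
        addiu $t0, $t0, 1
        addiu $t1, $t1, 1
        j     copy
copied:
        move  $s2, $v0
        sll   $s6, $s6, 1
        jr    $ra
```

Neither simulator offers a syscall to free memory, so the old block is simply abandoned. That is acceptable for a short-lived program but worth keeping in mind for very large inputs.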