Thread: MIPS Assembly, an Introduction

  1. #1
    Elite Hacker
    Join Date
    Mar 2003
    Posts
    1,407

    MIPS Assembly, an Introduction
    by: h3r3tic

    The following is a short introduction to programming in assembly for the MIPS architecture. I know what you're thinking,
    "I don't even know what MIPS architecture is, why would I want to write assembly for it?" Well, I'll tell you why.
    There is a simulator available for download here: http://www.cs.wisc.edu/~larus/spim.html
    which will let you run your MIPS assembly programs without having a computer with the MIPS architecture.
    I've dabbled a little with nasm, and I think MIPS is a lot more straightforward: everything seems to
    have a simple explanation of why it's there or what it's doing. So I have decided to share this perhaps
    little-known, or perhaps well-known, "language" with you. (I only learned about it after starting a class where
    we use it.)

    I'll start by showing you how to write the traditional "Hello World!" program. Now, there is no set way
    to do this, and I'm sure it can be done many ways, but I will show you two ways to do it.
    So, the uncommented code is as follows:
    Code:
    .data
    hellostring:	.ascii "Hello "
    		.asciiz "World!\n"
    
    .text
    
    main:
    	la	$a0, hellostring
    	li	$v0, 4
    	syscall
    
    	li	$v0, 10
    	syscall
    and here is the commented code:
    Code:
    .data						# comments are preceded by the # character
    						# .data is where you define "variables", or as I understand them,
    						# labels for the places in memory where the data defined below is stored
    
    hellostring:	.ascii "Hello "			# .ascii is how you define a non-null terminated
    						# string.
    
    		.asciiz "World!\n"		# this .asciiz is a continuation of hellostring,
    						# putting the rest of the string in memory, with
    						# a null terminator, which is what the z in .asciiz is for.
    						# I could have made it one line with:
    						# .asciiz "Hello World!\n"
    
    .text						# this defines where code to execute starts
    
    main:						# this is a label, meaning a name for a spot in memory.
    						# I would say it is most accurately described
    						# as a memory reference, but you could also say
    						# it defines the beginning of a procedure or function.
    
    						# la = load address
    	la	$a0, hellostring			# this loads the address which hellostring refers to into
    						# the register $a0, which is one of four ($a0-$a3) used for
    						# arguments to "functions"
    
    						# li = load immediate
    						# which is used to load characters and integers
    						# onto registers
    	li	$v0, 4				# we are loading the integer 4 onto the register
    						# $v0, which tells the system we want to output
    						# a string.  I will go over more system calls later
    
    	syscall					# this tells the system to execute the "function"
    						# selected by the 4 we loaded onto $v0.  $v0 is our register
    						# for choosing system functions.  Your own functions are
    						# called with jal instead of syscall, and by convention
    						# they use $v0 for their return value
    
    	li	$v0, 10				# 10 is the call to end the program, and when this value
    						# is on $v0 and a syscall is made as here, the program
    						# will terminate.
    	syscall
    The preceding code will output:
    Hello World!

    Pretty simple for being assembly, eh? To run your code you can save it with any extension (or none), and on Linux
    just type the command:
    spim -file yourfilename

    and it will run your program (assuming spim is in your path and you saved your file in the current directory you
    are in as "yourfilename" without quotes :P).

    On Windows with PCSpim, I think you can do the same thing on the command line.
    If not, check out the site you downloaded it from (http://www.cs.wisc.edu/~larus/spim.html); it has a bunch of
    info, including how to run programs in the PCSpim GUI. You can probably figure it out on your own though. Moving on.

    Let's take a look at a more interesting approach to the hello world program:
    full uncommented code:
    Code:
    .data
    stringstore:	.space 20
    
    .text
    
    main:
    	la	$s0, stringstore
    
    	li	$t0, 'H'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'e'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'l'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'o'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, ' '
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'W'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'o'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'r'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'l'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 'd'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, '!'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	li	$t0, 10				# 10 is the ASCII code for a newline
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    
    	sb	$zero, ($s0)
    
    	la	$a0, stringstore
    	li	$v0, 4
    	syscall
    
    	jr	$ra
    leaving out repetitive parts and commented:
    Code:
    .data
    stringstore:	.space 20			# allocate 20 bytes starting at address referenced by
    						# stringstore
    
    .text
    
    main:
    	la	$s0, stringstore			# load address of stringstore into $s0.  $s0 is one of
    						# eight saved registers ($s0-$s7).  By convention, functions you
    						# call must preserve these, so you can store stuff in them
    						# with confidence it will be there later.
    						# Each register is 32 bits (4 bytes).
    
    	li	$t0, 'H'				# $t0 is one of ten temporary registers ($t0-$t9).  Unlike
    						# $s0 it is meant for temporary storage, which is why it's called
    						# a temporary register.  You use it to load things
    						# you will need access to soon, but not throughout the program.
    						# I am loading the character H onto $t0, which really means
    						# its ASCII value (72) ends up in the register
    
    						# sb = store byte
    	sb	$t0, ($s0)			# ok, this part is important, and took me a while to figure out.
    						# notice the parentheses around $s0.  Those are
    						# indicating that I am storing the contents of $t0 
    						# at the address on $s0.  I didn't know this was possible
    						# at first, but it is very useful, for me at least.
    						# I really needed to store stuff at an address I had
    						# on a register in most of my programs.
    
    	addi	$s0, $s0, 1			# $s0 = $s0 + 1;  The far left register is the destination.
    						# the i on the end stands for immediate, meaning you can
    						# specify integers rather than having to load them onto a
    						# register to use the normal add.  We are adding 1 here
    						# to increment to the next address to store the next letter
    						# in our buffer of 20 bytes called stringstore.  Remember
    						# we loaded its address onto $s0 at the beginning of main
    						# now we are at a location one byte away from its address
    						# which is convenient since characters are 1 byte, so we can
    						# now store our second character.
    
    	li	$t0, 'e'
    	sb	$t0,  ($s0)
    	addi	$s0, $s0, 1
    	li	$t0, 'l'
    	sb	$t0, ($s0)
    	addi	$s0, $s0, 1
    	sb	$t0, ($s0)			# notice I didn't load a new character onto $t0, that's because
    						# I already loaded 'l' onto $t0 and it's still there, no need to load
    						# it again.
    	addi	$s0, $s0, 1			# we still have to move on to the next byte, though
    
    	li	$t0, 'o'
    	sb	$t0, ($s0)
    						# I think you can see where this is going
    						# to load a newline character you can give its ASCII code, like this
    
    	li	$t0, 10
    						# 10 is the ASCII code for a newline.  When I wrote this I typed a
    						# literal line break between two single quotes, because I think I tried
    						# '\n' and it didn't work.  The numeric code is the less fragile choice.
    
    						# ok, so now you've loaded and stored all the proper characters into stringstore
    						# assuming you filled in code I left out because it repeats
    						# we probably don't need to null terminate the buffer, but let's
    						# do it anyways just to be safe.  I am assuming here that you incremented
    						# $s0 after storing the last character into stringstore
    
    	sb	$zero, ($s0)			# $zero is a reserved register in mips, for zero.  Which can also be used
    						# to terminate strings, and as false for booleans etc..  $0 is the same as
    						# $zero.  So now we have all our characters in our "buffer" and they are
    						# "null" terminated.  Let's load our "string" for output
    
    	la	$a0, stringstore			# all that stuff we were doing earlier was directly affecting the memory
    						# allocated for stringstore, so essentially, stringstore is a pointer to
    						# the address of the first character in our string.  Let's do the same syscall
    						# as in the previous hello world program, which will output what's in our buffer
    						# until we hit our terminator (where we stored $zero)
    
    	li	$v0, 4				# call for string output
    	syscall					# do string output
    
    	jr	$ra				# another way to end the program other than loading 10 onto $v0.
    						# this returns to whatever called main, which in SPIM is the
    						# startup code that then exits for you.
    						# this is assuming $ra hasn't been modified, which it hasn't.
    						# $ra gets modified when you do a jal (jump and link) which
    						# stores the spot after your jump in your program into $ra
    						# so you can return to it by jumping to it.
    For me this version of hello world relates heavily to C. $s0 in this program is like a character pointer,
    and you are simply moving through an array of bytes storing each character to make a string.

    Also, in the example above, I used sb to store one byte of data at the address which was on
    a register. You can also work with an address on a register with load byte (lb).

    lb $t0, ($s0)

    This will load the byte stored at the address on $s0, which will probably be a character. If an
    integer (4 bytes) is stored there, you will only get one byte of it. So I recommend you only
    use this when working with characters as above, or anything else that is only one byte.
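
    Since $s0 behaves like a C character pointer, lb and sb also work nicely in a loop. Here is a minimal sketch (not part
    of the programs above) that copies a string one byte at a time and then prints the copy. The labels source, dest and
    copyloop are names made up for this example, and bne (branch if not equal) is an instruction this tutorial doesn't
    otherwise cover.
    Code:
    .data
    source:	.asciiz "Hello World!\n"
    dest:	.space 20			# room for the 13 characters plus the null terminator
    
    .text
    
    main:
    	la	$s0, source		# $s0 points at the next byte to read
    	la	$s1, dest		# $s1 points at the next byte to write
    
    copyloop:
    	lb	$t0, ($s0)		# load one byte from the source
    	sb	$t0, ($s1)		# store it in the destination (the terminator gets copied too)
    	addi	$s0, $s0, 1		# advance both "pointers" by one byte
    	addi	$s1, $s1, 1
    	bne	$t0, $zero, copyloop	# repeat unless the byte we just copied was the null terminator
    
    	la	$a0, dest		# print the copy
    	li	$v0, 4
    	syscall
    
    	li	$v0, 10			# end the program
    	syscall
    If it runs as intended, it should print Hello World! just like the earlier versions. The loop only stops after the null
    terminator has been copied, so dest ends up properly terminated.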

    Here are some of the basic system calls

    1: When you load immediate (li) 1 onto $v0, you are telling the system that you want to output an integer.
    I will show you an example using integers later. You will have to load the integer you want to output onto
    $a0, before you do your syscall. $a0, is the argument for this system call, in this case an integer.

    4: When you load immediate 4 onto $v0, you are telling the system that you want to output a string.
    We have discussed this a bit already. Basically it will output what's at the address on $a0 until it hits
    a null terminator. An address must be loaded onto $a0 as an argument to this function prior to your syscall.

    5: When you load immediate 5 onto $v0, you are telling the system that you want to read in an integer.
    I will also show you an example using this later. This call does not use $a0, instead it stores the integer
    you input onto $v0, which is the standard for return values. You can then move that value onto another register
    or store it in a "variable".

    8: When you load immediate 8 onto $v0, you are telling the system you want to read a string. This function
    takes two arguments which you load onto $a0 and $a1. On $a0 you put the address where you want the string
    you enter to be stored, and on $a1 you put the length of the string you want to enter. You can use li to load
    your length onto $a1, which lets you load an integer directly rather than having to load it onto another register and
    then move it. One thing I noticed about this function is that it only reads up to one less than the length you
    specified, because the last byte is kept for the null terminator. So just be aware of that, and add one to the actual
    length you want when using this (and make sure your buffer is at least that many bytes). I will also give an example of
    this later.
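
    To see the one-less-than-the-length behavior of call 8 on its own, here is a small sketch. The labels buf, prompt
    and newline are names invented for this example, and the buffer is deliberately tiny.
    Code:
    .data
    buf:	.space 5			# 5 bytes: 4 characters plus the null terminator
    prompt:	.asciiz "Type at least 6 characters: "
    newline:	.asciiz "\n"
    
    .text
    
    main:
    	la	$a0, prompt
    	li	$v0, 4
    	syscall
    
    	la	$a0, buf		# where the input should be stored
    	li	$a1, 5			# length 5, so at most 4 characters get stored
    	li	$v0, 8
    	syscall
    
    	la	$a0, buf		# print back what was actually stored
    	li	$v0, 4
    	syscall
    
    	la	$a0, newline
    	li	$v0, 4
    	syscall
    
    	li	$v0, 10
    	syscall
    If you type abcdefgh at the prompt, this should print only abcd, since the fifth byte of buf is used for the null
    terminator.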

    Yay, later is here, I can give you all those examples I promised. Or cram them all into one little I/O example.
    First the uncommented version for easier pasting:
    Code:
    .data
    num1:	.word 0
    num2:	.word 0
    str1:	.space 51
    prompt1:	.asciiz "Enter a string: "
    prompt2:	.asciiz "Enter a number: "
    prompt3:	.asciiz "Enter another number: "
    msg:		.asciiz "You entered the string: "
    msg2:		.asciiz "You entered the numbers "
    msg3:		.asciiz " and "
    newline:	.asciiz "\n"
    
    .text
    
    main:
    	la	$a0, prompt1
    	li	$v0, 4
    	syscall
    
    	la	$a0, str1
    	li	$a1, 51
    	li	$v0, 8
    	syscall
    
    	la	$a0, msg
    	li	$v0, 4
    	syscall
    
    	la	$a0, str1
    	li	$v0, 4
    	syscall
    
    	la	$a0, prompt2
    	li	$v0, 4
    	syscall
    
    	li	$v0, 5
    	syscall
    	sw	$v0, num1
    
    	la	$a0, prompt3
    	li	$v0, 4
    	syscall
    
    	li	$v0, 5
    	syscall
    	sw	$v0, num2
    
    	la	$a0, msg2
    	li	$v0, 4
    	syscall
    
    	lw	$a0, num1
    	li	$v0, 1
    	syscall
    
    	la	$a0, msg3
    	li	$v0, 4
    	syscall
    
    	lw	$a0, num2
    	li	$v0, 1
    	syscall
    
    	la	$a0, newline
    	li	$v0, 4
    	syscall
    
    	jr	$ra
    now the commented version:
    Code:
    .data
    num1:	.word 0						# .word is for storing integers, or just 4 bytes of data.
    							# We initialize it to 0 just because we're going to 
    							# change it anyway
    num2:	.word 0
    str1:	.space 51					# allocate 51 bytes for string input: 50 characters plus the null terminator
    
    prompt1:	.asciiz "Enter a string: "			# prompts to output to user
    prompt2:	.asciiz "Enter a number: "
    prompt3:	.asciiz "Enter another number: "
    
    msg:		.asciiz "You entered the string: "		# strings to prepend to the user input for labeling
    							# our output
    msg2:		.asciiz "You entered the numbers "
    msg3:		.asciiz " and "
    newline:	.asciiz "\n"				# "declaring" the newline character so we can output a newline
    							# when it's not already there
    
    .text
    
    main:
    	la	$a0, prompt1				# load the address of prompt for string output
    	li	$v0, 4					# and output it
    	syscall
    
    	la	$a0, str1				# load the address of str1 (our buffer for input) onto
    							# $a0 as the first argument for string input.
    							# the input will be stored at this address.
    	li	$a1, 51					# then load the length we want to read onto $a1
    							# as our second argument for string input.
    	li	$v0, 8					# li 8 onto $v0 for the call for string input
    	syscall						# make the call
    
    	la	$a0, msg				# you should know what this does by now
    	li	$v0, 4
    	syscall
    
    	la	$a0, str1				# loading the address of our buffer which should
    							# now have the user's string in it.
    	li	$v0, 4
    	syscall						# output our buffer
    
    	la	$a0, prompt2				# standard string output routine, same as before
    	li	$v0, 4
    	syscall
    
    	li	$v0, 5					# this is our integer input call, notice we don't have
    							# any arguments here, i.e. nothing needs to be
    							# done with the $a* registers.
    	syscall
    							# sw = store word
    	sw	$v0, num1				# the user's input is stored in $v0 after the syscall, this
    							# stores it in num1.  This is similar to sb.  Any register containing
    							# a word (integer) can be used here instead of $v0, it's just that in this
    							# case we want what's on $v0.
    
    	la	$a0, prompt3				# seen it
    	li	$v0, 4
    	syscall
    
    	li	$v0, 5					# getting our second integer
    	syscall
    	sw	$v0, num2				# storing it.
    
    	la	$a0, msg2				# seen it
    	li	$v0, 4
    	syscall
    
    							# lw = load word
    	lw	$a0, num1				# load integer at address num1 (not the address which would
    							# be loaded with la) onto register $a0 as an argument for integer
    							# output.  You can lw onto most of your registers, including
    							# all your saved and temporary registers
    
    	li	$v0, 1					# load immediate 1 onto $v0, which tells system we want to do
    							# integer output
    	syscall						# tell the system/do it
    
    	la	$a0, msg3				# seen it
    	li	$v0, 4
    	syscall
    
    	lw	$a0, num2				# same as with num1, except this time we're loading what's at num2
    	li	$v0, 1
    	syscall
    
    	la	$a0, newline				# output our newline so we don't get the command prompt
    							# right after our output
    	li	$v0, 4
    	syscall
    
    	jr	$ra					# end program
    Just as a side note, you can use lw to load a word from an address stored on a register, just as you can
    use sw to store a word at an address stored on a register. Using sw in this way works just as we have seen
    before with sb:

    sw $t0, ($s0)

    assuming you have a word/integer on $t0 and an address on $s0. Using lw to load from an address on a register
    is accomplished in this way:

    lw $t0, ($s0)

    This is loading the value stored at the address on $s0 into $t0.
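
    Here is a minimal sketch putting those two together. The label value and the number 42 are just picked for this
    example.
    Code:
    .data
    value:	.word 0
    
    .text
    
    main:
    	la	$s0, value		# $s0 now holds the address of value
    	li	$t0, 42
    	sw	$t0, ($s0)		# store the word on $t0 at the address on $s0
    	lw	$a0, ($s0)		# load it back from that address, this time onto $a0
    	li	$v0, 1
    	syscall				# integer output, prints 42
    
    	li	$v0, 10
    	syscall
    Unlike lb and sb, lw and sw move all 4 bytes at once, so the address on $s0 has to be the start of a word, which it
    is here because value was declared with .word.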

    I would like to mention two more things before I end this tutorial. First, the move operation. I didn't use it in any
    of the examples above, but it is important. Let's say you had an address on $s0, and you wanted to get that address
    onto $a0 for string output. It would be cumbersome to store it in one of your variables just so you could load it into
    $a0. This is where move comes in. In the situation just described you can simply do this:

    move $a0, $s0

    In this operation, the register on the left, in this case $a0, is your destination, and the one on the right is your source.
    So whatever is in the $s0 register will be "moved" into the $a0 register. Another good thing is that the value is also
    still on the $s0 register, even though the name suggests it would be gone from there once it is moved to $a0.
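
    As a minimal sketch of that situation (the label greeting is invented for this example):
    Code:
    .data
    greeting:	.asciiz "Hello again!\n"
    
    .text
    
    main:
    	la	$s0, greeting		# keep the address around on a saved register
    	move	$a0, $s0		# copy it onto $a0; $s0 still holds it afterwards
    	li	$v0, 4
    	syscall				# string output
    
    	li	$v0, 10
    	syscall
    This should print Hello again! and $s0 is left untouched, so you could print the same string again later with
    another move.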

    The last thing I wanted to mention is offsets. If I do another tutorial I will go into this in more detail, most likely doing
    some array examples. Let's say you had a .space buffer with 80 bytes allocated. On this buffer you have a 79 character
    string plus its null terminator. Let's say you only wanted to print from the 5th character to the end. What you could do is
    load the address onto $a0 with an offset like this:

    la $a0, 4($s0)

    I used 4 here because 0($s0) would be the first character of the buffer assuming the address of the buffer was stored on
    $s0. So if 0 is the first character that makes offset 4 the 5th character. When working with "arrays" of words, you will have
    to multiply whichever "element" you want by 4, because each "element" is 4 bytes. So assuming we want the 5th
    number in an array, we would do:

    lw $t0, 16($s0)

    This is the same as before, except we're loading a word stored at the position we want, and each position occupies 4 bytes instead of 1.
    That is a short summary of offsets, but I think that's pretty much all you need to know about them to use them.
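
    Here is a small sketch showing both kinds of offset in one program. The labels text and numbers and their contents
    are made up for this example. To stay on the safe side it uses addi to add the byte offset, which computes the same
    address as the la with an offset shown above.
    Code:
    .data
    numbers:	.word 10, 20, 30, 40, 50	# an "array" of five words
    text:	.asciiz "0123456789\n"
    
    .text
    
    main:
    	la	$s0, text
    	addi	$a0, $s0, 4		# address of the 5th character (offset 4)
    	li	$v0, 4
    	syscall				# prints 456789 and the newline
    
    	la	$s1, numbers
    	lw	$a0, 16($s1)		# 5th word: element 4 times 4 bytes = offset 16
    	li	$v0, 1
    	syscall				# prints 50
    
    	li	$v0, 10
    	syscall
    The byte offset is always the element number (counting from 0) times the element size, which is 1 for characters and
    4 for words.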

    I have hardly scratched the surface of programming in mips assembly, but I have almost taught you everything I know.
    I hope that you have learned a lot and will take the time to mess around with this. It's very interesting and if you're a
    geek like me, it is also fun. If I get enough positive feedback on this, and it seems like you all want to know more, I
    might write another tutorial which will probably bring everyone here up to where I am. Enjoy!

  2. #2
    Senior Member
    Join Date
    Apr 2004
    Posts
    1,130
    Nice tut.
    I'm glad to know that there are platforms other than the mainframe that use a "readable" assembly language. The construction and ops are very similar to System/370 assembly.
    My site

    FORMAT C: Yes ...Yes??? ...Nooooo!!! ^C ^C ^C ^C ^C
    If I die before I sleep, I pray the Lord my soul to encrypt.
    If I die before I wake, I pray the Lord my soul to brake.

  3. #3
    Elite Hacker
    Join Date
    Mar 2003
    Posts
    1,407
    Yeah, that's what I like about it. It's so easy to read and understand. Quick question. In a tut like this,
    do you prefer it how I did it with comments in the code explaining what it does, or paragraphs after the code to explain?

  4. #4
    () \/V |\| 3 |) |3\/ |\|3G47|\/3
    Join Date
    Sep 2002
    Posts
    744
    Originally posted here by h3r3tic
    Quick question. In a tut like this, do you prefer it how I did it with
    comments in the code explaining what it does, or paragraphs after the code to explain?
    I really liked how you wrote the code without the comments...and then wrote it again with the comments on the side. It was actually more straight forward than my asm books!

    Go Finland!
    Deviant Gallery

  5. #5
    Elite Hacker
    Join Date
    Mar 2003
    Posts
    1,407
    Yeah, I have a hard time reading commented code, so I figured if I gave a commented and uncommented version,
    those like me would be happy, and those that liked commented code would also be happy. I guess I'm weird.

  6. #6
    Senior Member
    Join Date
    Apr 2004
    Posts
    1,130
    Originally posted here by h3r3tic
    Yeah, that's what I like about it. It's so easy to read and understand. Quick question. In a tut like this,
    do you prefer it how I did it with comments in the code explaining what it does, or paragraphs after the code to explain?
    I think the way you wrote it is just fine. Easy to understand. That is the way I comment my assembly programs too (in the code).
