Nios II is a RISC instruction set architecture for 32-bit embedded processors (specifically for Altera FPGAs), with 32 registers. The DE1-SoC loads Nios II as a soft core processor.
It’s little-endian,[1] and has a memory space that supports words, half-words, and bytes. Each separate byte has its own address in memory.
Instructions
Instructions have two basic modes. One is immediate mode, where we use a 16-bit number in our instructions. This is always suffixed by `i`. The other is register mode, where the values are held in the registers already.
Basic register operations:
- Moving data
  - `mov rX, rY` moves the data of `rY` to `rX`.
  - `movi rX, Imm16` to “move immediate” a 16-bit number into `rX`. There will be a sign extension.
  - `movhi rX, Imm16` to move to the upper 16 bits of the register.
  - `movia rX, Imm32`, for a 32-bit number. This is a macro, i.e., it is essentially two instructions at once: a `movhi` of the upper 16 bits then an `addi` of the lower 16 bits.
- Loading data. Note that this requires the register to already be holding a memory address (via a move instruction).
  - `ldw rX, n(rN)`, load into `rX` the value at the address in `rN`, offset by `n` bytes forward.
    - This is basically pointer arithmetic and dereferencing.
    - The preceding instruction should be `movia rN, addr32`.
      - We’re basically loading a 32-bit address into register `rN`, then when we call `ldw`, we load from the address stored in `rN`. This is due to the restrictions of the architecture, a load-store style.
    - We can alternatively specify the address for a given word with a directive: `list: .word 10`. Wherever the assembler chooses to place the memory word with value 10, the “variable” `list` will refer to that address.
      - Variable positions in memory will change! So this makes the most sense to do.
    - The `w` after the `ld` refers to loading a word, i.e., 4 bytes.
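  - `ldwio rX, n(rN)`, where `rN` stores the address of an I/O device.
  - `ldh`, `ldhu`, `ldb`, `ldbu` to specify the size and signedness of the loaded value.
    - `u` means unsigned. Without it, the loaded value is treated as signed and sign-extended; with it, the upper bits are filled with 0 (a zero extension).
    - `h` means half-word. A sign extension fills the upper 16 bits. The address must be half-word aligned, i.e., its last bit must be 0.
    - `b` means byte. The upper 24 bits are filled with copies of the sign bit.
    - Loads have no immediate (`i`) form. The `io` suffix, as in `ldwio`, `ldhio`, and `ldbio`, is for I/O devices.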
- Storing data, same address restriction as above.
  - `stw rX, n(rN)`, store the value in `rX` at the address in `rN`, offset by `n` bytes.
  - `stwio rX, n(rN)`, the same for an I/O device.
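As a minimal sketch of the move, load, and store instructions above, the program below reads two words from memory, stores their sum back, and shows the difference between `ldb` and `ldbu`. The labels `list`, `flag`, and `stop` are made up for illustration, and the layout assumes `flag` lands directly after the three words of `list`.

```asm
.text
.global _start
_start:
    movia r8, list        # r8 = address of list (a macro, two instructions)
    ldw   r9, 0(r8)       # r9  = first word (10)
    ldw   r10, 4(r8)      # r10 = second word (20), 4 bytes further on
    add   r11, r9, r10    # r11 = 30
    stw   r11, 8(r8)      # store the sum into the third word of list

    movia r12, flag       # r12 = address of the single byte
    ldb   r13, 0(r12)     # sign-extended: r13 = 0xFFFFFFFF
    ldbu  r14, 0(r12)     # zero-extended: r14 = 0x000000FF
stop:
    br    stop            # spin forever so we never run off the end

.data
.align 2                  # word-align the data (2^2 = 4 bytes)
list:
    .word 10, 20, 0       # two inputs and one slot for the result
flag:
    .byte 0xFF            # hypothetical test byte with the top bit set
```

Every memory access goes through a register holding the address, which is the load-store restriction in action.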
Arithmetic, logic, and bitwise operations:
- Arithmetic operations
  - `add`, `addi`
  - `sub`, `subi`
  - `mul`, `muli`
  - `div`, `divu`, no immediate variant
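- Logic operations
  - `and`, `andi`
  - `or`, `ori`
  - `xor`, `xori`
  - Note that these operate bit-by-bit. The 16-bit immediate is zero-extended, so `ori` and `xori` leave the upper 16 bits alone, while `andi` clears them.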
- Bit shifting, where the bits shifted out are lost.
  - Logical shift, this is an unsigned shift where the vacated bits are set to 0.
    - `srl`, `srli rX, rY, Imm5` (the shift amount is 0 to 31)
    - `sll`, `slli`
  - Arithmetic shift, this is a signed shift where the sign bit is duplicated.
    - `sra`, `srai`
    - There is no separate arithmetic left shift; `sll` and `slli` do the same job.
- Bit rotating. Note that only the lowest 5 bits of the rotate amount really matter, since registers are only 32 bits.
  - Rotate right →, `ror`. There is no `rori`; rotating left by 32 minus the amount with `roli` gives the same result.
  - Rotate left ←, `rol`, `roli`
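To make the arithmetic, logic, shift, and rotate behaviour concrete, here is a small sketch with the expected register contents in the comments. It assumes the core was built with hardware multiply and divide (otherwise `mul` and `div` are not available), and the label `stop` is just for illustration.

```asm
.text
.global _start
_start:
    movi  r8, 12            # r8 = 12
    movi  r9, -5            # r9 = -5, i.e. 0xFFFFFFFB after sign extension

    add   r10, r8, r9       # r10 = 7
    sub   r11, r8, r9       # r11 = 17
    mul   r12, r8, r9       # r12 = -60 (low 32 bits of the product)
    div   r13, r8, r9       # r13 = -2 (signed, rounds toward zero)

    andi  r14, r9, 0xFF     # r14 = 0x000000FB, upper 16 bits cleared
    ori   r15, r8, 0x100    # r15 = 0x0000010C, upper 16 bits untouched

    slli  r16, r8, 2        # r16 = 48, same as multiplying by 4
    srli  r17, r9, 28       # r17 = 0xF, zeros shifted in from the left
    srai  r18, r9, 1        # r18 = -3, sign bit duplicated
    roli  r19, r8, 4        # r19 = 0xC0, bits wrap around instead of being lost
stop:
    br    stop
```

Comparing `srli` and `srai` on the same negative value is the quickest way to see why the signed/unsigned distinction matters.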
Flow control and functions:
- Branches
  - `br BRANCH` to unconditionally enter `BRANCH`.
  - For conditional branches, we have the general form `bXX rX, rY, BRANCH`:
    - `beq`, for `x == y`
    - `bne`, for `x != y`
    - `bge`, for a signed comparison `x >= y`
    - `bgeu`, for an unsigned comparison `x >= y`
    - `blt`, for `x < y`
    - `bgt`, for `x > y`
    - and so on.
- Subroutines:
  - `call NAME` calls a subroutine `NAME`.
  - `ret` returns from a subroutine. It moves the value of `ra` into `pc`, such that the next instruction executed is the one at the address in `ra`.
- Interrupts:
  - `eret` returns from an interrupt. It moves the value of `ea` into `pc`.
  - `rdctl` reads control registers.
  - `wrctl` writes to control registers.
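Branches and subroutines can be sketched together as a loop inside a called routine. The subroutine name `sum_to_n` and the labels `loop`, `done`, and `stop` are hypothetical; the routine is a leaf (it calls nothing else), so it does not need the stack.

```asm
.text
.global _start
_start:
    movi  r4, 5             # argument: n = 5
    call  sum_to_n          # ra = address of the instruction after the call
    # back here, r2 holds 5 + 4 + 3 + 2 + 1 = 15
stop:
    br    stop

# sum_to_n: returns n + (n - 1) + ... + 1 in r2, for n passed in r4
sum_to_n:
    mov   r2, r0            # running total = 0
loop:
    beq   r4, r0, done      # leave the loop once the counter reaches 0
    add   r2, r2, r4        # total += counter
    subi  r4, r4, 1         # counter -= 1
    br    loop              # unconditional jump back to the test
done:
    ret                     # pc = ra, so we resume after the call
```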
Directives
A full list here.
- `.data` to specify a section, for variables in memory.
- `.align n`, to specify the “memory alignment”. This is used when specifying words of data; `n` is the exponent, so the next item is placed at an offset that is a multiple of $2^n$.
  - “align with next available address divisible by $2^n$”
- `.text`, to specify instructions.
- `.global` to specify global symbols, such as `_start`.
- `.section` to specify code sections.
- `.word` / `.byte` / `.hword` specifies the format of the stored data.
  - `.word` and `.hword` will be aligned automatically.
  - `.byte` is not aligned automatically, so whatever follows it must be realigned manually with `.align`. Alternatively, we can use the `.skip` directive.
- `.skip n` will skip `n` bytes before assigning the next memory address.
- `.equ SYMBOL, ADDRESS` to essentially set an alias to a certain address.
- `.section .exceptions, "ax"` specifies code that executes when interrupts occur.
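The directives above might fit together as in this sketch. The address `0xFF200000` for `LEDS` is an assumption (it is the red LED port on the standard DE1-SoC Computer system, so check your own system), and the labels `initial`, `pattern`, `buffer`, and `stop` are made up for illustration.

```asm
.equ LEDS, 0xFF200000       # alias for an I/O address (assumed, see above)

.text                       # instructions go here
.global _start              # make the entry point visible to the linker
_start:
    movia r8, LEDS          # r8 = address of the I/O device
    movia r9, pattern       # r9 = address of the word defined below
    ldw   r10, 0(r9)        # read the word from memory
    stwio r10, 0(r8)        # write it to the I/O device
stop:
    br    stop

.data                       # variables in memory go here
initial:
    .byte 1                 # one byte, so the next address is not word aligned
.align 2                    # move to the next address divisible by 2^2 = 4
pattern:
    .word 0x155             # a 32-bit word, now safely aligned
buffer:
    .skip 16                # reserve 16 bytes without initialising them
```

Without the `.align 2`, this layout relies on the assembler placing `pattern` on a word boundary by itself, so stating the alignment explicitly is the safer habit.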
Registers
By convention, we reserve registers for subroutines:
- `r2` is the function return value. This is for a single word. If more than one value comes back from the callee, then this information must be put on the stack and popped by the caller.
- `r4` to `r7` are for function arguments from the caller to the callee. Any more parameters must be put on the stack.
- `r8` to `r15` are the caller-saved registers, i.e., the responsibility of the caller. They must be saved on the stack by the caller if it wants to preserve them, i.e., the caller saves them, calls the subroutine, then restores them once the subroutine returns.
- `r16` to `r23` are the callee-saved registers, i.e., the responsibility of the callee. If it wants to use these registers, their contents must be saved before being changed, and restored before returning.
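A sketch of both halves of the convention: the caller preserves `r8` around the call, and the callee preserves `r16` before using it as scratch. The routine name `double_plus` and the label `stop` are hypothetical, and `0x20000` is just the conventional initial stack address from these notes.

```asm
.text
.global _start
_start:
    movia sp, 0x20000       # the stack must exist before we push anything
    movi  r8, 99            # a value the caller still needs after the call

    subi  sp, sp, 4         # caller-saved: push r8 ourselves
    stw   r8, 0(sp)
    movi  r4, 3             # first argument
    movi  r5, 4             # second argument
    call  double_plus
    ldw   r8, 0(sp)         # restore r8 (still 99)
    addi  sp, sp, 4
    # r2 = 2 * 3 + 4 = 10
stop:
    br    stop

# double_plus: returns 2 * r4 + r5 in r2
double_plus:
    subi  sp, sp, 4         # callee-saved: push r16 before touching it
    stw   r16, 0(sp)
    add   r16, r4, r4       # scratch = 2 * first argument
    add   r2, r16, r5       # return value goes in r2
    ldw   r16, 0(sp)        # put r16 back the way we found it
    addi  sp, sp, 4
    ret
```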
Some other specialised registers:
- `r0` will only store `'b0`, i.e., it always reads as zero.
- `r24` / `et` is the “exception temporary” register, reserved for use while handling exceptions.
- `r27` / `sp` is the stack pointer, which holds the address of the word at the top of the stack.
  - We must initialise `sp` at the beginning of any program. By convention, this is at `0x20000`.
- `r29` / `ea` is the exception return address.
- `r31` / `ra` is the return address register. It stores the address to go back to when a subroutine returns.
  - If the subroutine calls other subroutines, we need to save `ra` on the stack.
- `pc` is the program counter, not one of the 32 registers. It stores the address of the next instruction to be executed, and is incremented by 4 after each instruction unless a branch, call, or return changes it.
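The point about saving `ra` is easiest to see with nested calls. In this sketch the routines `outer` and `inner` and the label `stop` are made-up names; `outer` has to push `ra` because its own `call` overwrites it.

```asm
.text
.global _start
_start:
    movia sp, 0x20000       # initialise the stack pointer first
    movi  r4, 7             # argument for outer
    call  outer             # ra = return address into _start
    # r2 = 7 + 1 = 8
stop:
    br    stop

outer:
    subi  sp, sp, 4         # make room for one word
    stw   ra, 0(sp)         # save our return address
    call  inner             # this overwrites ra
    ldw   ra, 0(sp)         # get the original return address back
    addi  sp, sp, 4         # release the stack space
    ret                     # returns to _start, not back into outer

inner:
    addi  r2, r4, 1         # leaf routine: no calls, so ra is safe
    ret
```

If `outer` skipped the save and restore, its `ret` would jump to the instruction after `call inner` and loop forever.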
Programming
We can download Nios II programs onto FPGA boards using the Quartus Monitor Program. This uses a JTAG interface.
If Quartus says that it `Could not query JTAG Instance IDs`, this is often because the board’s been turned off and on again. Follow these steps:
1. `Actions > Download System` to load the `.sof` file onto the board.
2. `Actions > Configure HPS`
3. `Actions > Connect to System`
   - Then things are fine
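During compilation, it may also throw multiple definition errors, even when you’ve been careful to use include guards. One cause of this is global struct variables defined in header files. The likely reason is that include guards only stop a header from being included twice within the same source file; every source file that includes the header still gets its own copy of the definition, and the linker then sees duplicates. Declaring the variable `extern` in the header and defining it in exactly one `.c` file should avoid this.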
Addendums
Prof Moshovos says the Nios memory structure is “ridiculously close to MIPS and RISC-V”. Nios II has since been replaced by Nios V, a RISC-V based architecture.
Footnotes
1. “Why? I don’t care.” - Prof Moshovos