TACKY Reference Material

Every semester, there's a different instruction set for CPE480. It's actually quite difficult to keep coming up with something different, interesting, and vaguely reasonable. This semester, it's something really tacky... yes, TACKY: the twin accumulator computer from Kentucky.

Of the many computers I've designed, this is one of the most unusual... but that doesn't mean it's bad. In fact, I think it's quite clever. It's a 16-bit machine, with 16-bit instructions, but it's actually a VLIW (Very Long Instruction Word) machine. Ok, 16 bits isn't really "very long" -- but it typically packs two instructions into each instruction word, so it's not hard to get two instructions executing per clock in compute-heavy code.

A TACKY Overview

The TACKY instruction set is very simple, but the "twin" aspect of it is a little strange. So, let's ignore that for now.

Ok, now that we've gone over the other aspects, how does that "twin" stuff work? Well, there are two accumulators. As usual, you don't specify the accumulator explicitly; it is always implied as the first operand and destination. But how does that work with two accumulators? The answer is that the accumulator to be used is specified by the slot the instruction occupies within the instruction word. Let's take a simple example:

add $5, mul $6

Because the add $5 is in "slot 0" of the instruction word, this instruction is really doing $0 = $0 + $5. Similarly, mul $6 is in "slot 1", so it is really doing $1 = $1 * $6. These paired instructions would be expected to execute simultaneously... which does suggest a potential problem. Suppose the pair is:

add $5, a2r $0

Both of those instructions want to write into the same register, $0. Because the operations are supposed to happen simultaneously, the result would be undefined... which is a polite way of saying that both instructions within an instruction word should never write into the same register. You don't need to have your assembler detect this and flag an error, but be careful you don't accidentally do this when testing one of your processor designs later in this course.
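The conflict rule above can be sketched in a few lines of Python. This is only an illustrative model, not part of any required assembler: the names ACC_WRITERS, REG_WRITERS, dest_register, and has_write_conflict are hypothetical, and the sets are read off the Functionality column of the instruction table (operations that assign to $acc versus operations that assign to $r).

```python
# Hypothetical sketch: detect two packed operations writing the same register.
# Operations whose Functionality assigns to $acc (the slot's accumulator).
ACC_WRITERS = {"add", "and", "cvt", "div", "mul", "not", "or",
               "r2a", "sh", "slt", "sub", "xor"}
# Operations whose Functionality assigns to the named register $r.
REG_WRITERS = {"a2r", "lf", "li"}

def dest_register(op, reg, slot):
    """Return the register number written by `op $reg` in `slot`, or None."""
    if op in ACC_WRITERS:
        return slot      # the accumulator is $0 for slot 0 and $1 for slot 1
    if op in REG_WRITERS:
        return reg
    return None          # e.g. st writes memory, jr writes pc

def has_write_conflict(slot0, slot1):
    """slot0 and slot1 are (op, reg) pairs packed into one instruction word."""
    d0 = dest_register(slot0[0], slot0[1], 0)
    d1 = dest_register(slot1[0], slot1[1], 1)
    return d0 is not None and d0 == d1

print(has_write_conflict(("add", 5), ("mul", 6)))  # False
print(has_write_conflict(("add", 5), ("a2r", 0)))  # True
```

A check like this is optional, since the assembler is not required to flag the problem, but it may be handy when writing test programs for a processor design.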

Ok, that all seems simple enough. However, what happens if you don't have two instructions to pack into one instruction word? Well, that's easy enough: copying each accumulator onto itself ($0 = $0 in slot 0, $1 = $1 in slot 1) changes nothing, so here's a funny-looking pair of null operations:

r2a $0, r2a $1

The TACKY Instruction Set

Because TACKY is actually a two-wide VLIW architecture, the instruction encoding is a bit strange. Each operation is nominally 8 bits, but an instruction word is 16 bits. Some single instructions take an entire 16-bit word by themselves. In other cases, two instructions can be packed side-by-side within an instruction word.

Instruction   Description                               Functionality                                Result Type        Pack
a2r $r        Copy acc to register, copy type           $r = $acc                                    typeof(acc)        Field acc
add $r        Typeof(acc) add register to acc           $acc += $r                                   typeof(acc)        Field acc
and $r        Bitwise AND register to acc               $acc = ($acc & $r)                           typeof(acc)        Field acc
cf8 $r,imm8   Load {pre, imm8} into reg                 $r = {pre, imm8}                             float              Span 0,1
ci8 $r,imm8   Load {pre, imm8} into reg                 $r = {pre, imm8}                             int                Span 0,1
cvt $r        Convert int to float or float to int      $acc = ((oppositetypeof($r)) $r)             oppositetypeof(r)  Field acc
div $r        Typeof(acc) divide acc by register        $acc /= $r                                   typeof(acc)        Field acc
jnz8 $r,imm8  Jump to {pre, imm8} if r is not 0         if ($r!=0) pc = {pre, imm8}                  -                  Span 0,1
jp8 imm8      Jump to {pre, imm8}                       pc = {pre, imm8}                             -                  Span 0,1
jr $r         Jump to register (int)                    pc = $r                                      -                  Either 0,1
jz8 $r,imm8   Jump to {pre, imm8} if r is 0             if ($r==0) pc = {pre, imm8}                  -                  Span 0,1
lf $r         Load float from memory into reg           $r = memory[$acc]                            float              Field acc
li $r         Load int from memory into reg             $r = memory[$acc]                            int                Field acc
mul $r        Typeof(acc) multiply acc by register      $acc *= $r                                   typeof(acc)        Field acc
not $r        Bitwise NOT register to acc               $acc = (~$r)                                 typeof(acc)        Field acc
or $r         Bitwise OR register to acc                $acc = ($acc | $r)                           typeof(acc)        Field acc
pre imm8      Load 8-bit prefix register                pre = imm8                                   -                  Span 0,1
r2a $r        Copy register into acc, copy type         $acc = $r                                    typeof(r)          Field acc
sh $r         Typeof(acc) shift left/right by register  $acc = shift($acc,$r) where $r holds an int  typeof(acc)        Field acc
slt $r        Typeof(acc) set acc less than register    $acc = ($acc<$r)                             int                Field acc
st $r         Store acc into memory[register]           memory[$r] = $acc                            -                  Field acc
sub $r        Typeof(acc) subtract register from acc    $acc -= $r                                   typeof(acc)        Field acc
sys imm8      System call                               system(imm8)                                 -                  Span 0,1
xor $r        Bitwise XOR register to acc               $acc = ($acc ^ $r)                           typeof(acc)        Field acc

Macro         Description                 Functionality          Result Type  Pack
cf $r,imm16   Constant float              Sequence of pre, cf8   float        Span 0,1
ci $r,imm16   Constant int                Sequence of pre, ci8   int          Span 0,1
jnz $r,addr   Jump to addr if r is not 0  Sequence of pre, jnz8  -            Span 0,1
jp addr       Jump to addr                Sequence of pre, jp8   -            Span 0,1
jz $r,addr    Jump to addr if r is 0      Sequence of pre, jz8   -            Span 0,1
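
As a rough illustration of how the macros expand, here is a minimal Python sketch of ci $r,imm16 becoming a pre followed by a ci8. It assumes that {pre, imm8} is a Verilog-style concatenation with pre as the high byte and imm8 as the low byte; the function names expand_ci and combine are hypothetical, and the tuples are just a stand-in for whatever representation an assembler would really use.

```python
# Hypothetical sketch of the ci macro, assuming {pre, imm8} means
# (pre << 8) | imm8, i.e. pre supplies the high byte of the 16-bit value.

def expand_ci(reg, imm16):
    """Expand `ci $reg,imm16` into a pre followed by a ci8."""
    imm16 &= 0xFFFF                       # keep only 16 bits (handles negatives)
    hi, lo = imm16 >> 8, imm16 & 0xFF     # split into high and low bytes
    return [("pre", hi), ("ci8", reg, lo)]

def combine(pre, imm8):
    """Model of the {pre, imm8} concatenation under the assumption above."""
    return ((pre & 0xFF) << 8) | (imm8 & 0xFF)

print(expand_ci(3, 0x1234))        # [('pre', 18), ('ci8', 3, 52)]
print(hex(combine(0x12, 0x34)))    # 0x1234
```

The other macros (cf, jnz, jp, jz) would presumably split their 16-bit operand the same way, differing only in the second instruction of the sequence.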

The TACKY Registers

There are just 8 registers... which isn't a lot, so we'll try not to waste them. They all have names as well as numbers, and either can be used interchangeably; $r3 and $(4-1) would be treated identically. Perhaps the best way to give both is the following specification (formatted as an AIK specification):

.const {r0	r1	r2	r3	r4	ra	rv	sp}

The registers that have special meanings are:

Register Number  Register Name  Use
$0               $r0            accumulator for slot 0 instructions
$1               $r1            accumulator for slot 1 instructions
$5               $ra            return address for simple functions
$6               $rv            return value
$7               $sp            stack pointer (there is no frame pointer)
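
The name-to-number mapping follows directly from the order of the names in the .const list. As a small sketch in Python (not AIK), with the hypothetical helper reg_number, and handling only plain names and plain numbers rather than expressions such as $(4-1):

```python
# Register names in the same order as the .const specification,
# so each name's index is its register number.
REG_NAMES = ["r0", "r1", "r2", "r3", "r4", "ra", "rv", "sp"]

def reg_number(token):
    """Map a token like '$r3', '$sp', or '$5' to a register number 0..7."""
    name = token.lstrip("$")
    if name.isdigit():
        n = int(name)
    else:
        n = REG_NAMES.index(name)   # raises ValueError for an unknown name
    if not 0 <= n < len(REG_NAMES):
        raise ValueError("no such register: " + token)
    return n

print(reg_number("$r3"), reg_number("$sp"), reg_number("$5"))  # 3 7 5
```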

The TACKY Floating Point

You might be surprised, or perhaps a bit scared, to learn that TACKY supports float arithmetic. Yes, there is floating-point hardware. IEEE 754-2008 floating point typically uses at least 32 bits to represent a value, whereas here we get 16 bits. Actually, 16-bit floats are not a new invention; they are sometimes called half precision. An IEEE single float normally has a sign bit, an 8-bit exponent, and a 24-bit mantissa magnitude stored in just 23 bits. So, how many bits is each of those things in a 16-bit float? Well, IEEE half precision uses a sign bit, a 5-bit exponent, and an 11-bit mantissa magnitude stored in just 10 bits. However, we sacrifice IEEE compliance to get a more useful dynamic range... we were always going to ignore denorms, infinities, NaNs, and rounding modes anyway. ;-)

The 16-bit float format used in TACKY basically looks like the first 16 bits of an IEEE 32-bit float. That means 1 sign bit, 8 exponent bits, and an 8-bit mantissa magnitude stored in just 7 bits. It's not a huge change, but sacrificing some precision buys us a much larger dynamic range and means, for example, that mul only needs to do an 8x8 bit multiply -- which credibly can be implemented within a single clock cycle without a ridiculous amount of circuitry.
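
To get a feel for the format, here is a minimal Python sketch that converts between ordinary floats and 16-bit patterns by keeping only the top 16 bits of the IEEE single-precision encoding. The helper names float_to_tacky and tacky_to_float are hypothetical, and simply truncating the low 16 bits (rather than rounding) is an assumption based on the "first 16 bits" description above.

```python
import struct

def float_to_tacky(x):
    """Return the top 16 bits of the IEEE 754 single encoding of x.

    Assumption: the low 16 mantissa bits are simply dropped (truncation).
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16

def tacky_to_float(bits16):
    """Widen a 16-bit pattern back to a float by appending 16 zero bits."""
    (x,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return x

print(hex(float_to_tacky(1.0)))     # 0x3f80
print(hex(float_to_tacky(3.14159))) # 0x4049
print(tacky_to_float(0x4049))       # 3.140625
```

The round trip of 3.14159 comes back as 3.140625, which shows how little precision an 8-bit mantissa magnitude leaves: roughly two to three decimal digits.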


EE480 Advanced Computer Architecture.