Assignment 1: SIK Encoding And Assembler

Instruction set design is hard. Prof. Dietz has designed dozens of instruction sets in the three decades he's been a professor, and it still isn't easy for him to get things right. Thus, rather than giving you complete freedom to design your own instruction set, we're going to walk through the design logic for a reasonably well-crafted one that he built specifically for Spring 2017 EE480. However, this design is not complete -- each student must devise their own encoding of the instructions and implement their own assembler.

SIK Overview

What the heck is SIK? It refers to the "Stacking In Kentucky" instruction set, but you can call it "sick." It's what our target machine will be for Spring 2017 EE480... but it is also pretty much (but not exactly) the stack assembly language we used in EE380. It's a 16-bit machine with instructions, addresses, and data all organized as 16-bit words. For now, you should assume that the instruction memory (i.e., the .text segment) holds 65,536 16-bit instructions and the data memory (i.e., the .data segment) holds 65,536 16-bit data words.

This instruction set is complete enough that I'll be giving you a compiler (including full C source code) that translates programs written in a significant subset of C into SIK code. It's not a particularly smart compiler (ok, it's really dumb), but it will show you how SIK can be used for complete programs.

The SIK Instruction Set

If you took EE380 here, SIK should look familiar. However, in EE380 we never discussed how instructions would be encoded; it was just a simple step in explaining how to code high-level constructs in assembly language, and you really didn't have to use it. So, here's the instruction set assembly language syntax, with instruction names listed in alphabetical order.

Instruction	Description	Functionality
`Add`	Addition	`d=sp-1; s=sp; --sp; reg[d]+=reg[s];`
`And`	Bitwise AND	`d=sp-1; s=sp; --sp; reg[d]&=reg[s];`
`Call immed16`	Call	`d=sp+1; ++sp; reg[d]=pc+1; pc=prefix({(pc>>12), immed12});`
`Dup`	Duplicate	`d=sp+1; s=sp; ++sp; reg[d]=reg[s];`
`Get immed12`	Get from stack	`d=sp+1; s=sp-unsigned(immed12); ++sp; reg[d]=reg[s];`
`JumpF immed16`	Jump if False	`if (!torf) pc=prefix({(pc>>12), immed12});`
`Jump immed16`	Jump	`pc=prefix({(pc>>12), immed12});`
`JumpT immed16`	Jump if True	`if (torf) pc=prefix({(pc>>12), immed12});`
`Load`	Load from memory	`d=sp; reg[d]=mem[reg[d]];`
`Lt`	Less Than	`d=sp-1; s=sp; --sp; reg[d]=(reg[d] < reg[s]);`
`Or`	Bitwise OR	`d=sp-1; s=sp; --sp; reg[d]|=reg[s];`
`Pop immed12`	Pop (sets `sp=0` if `unsigned(immed12)>sp`)	`sp-=unsigned(immed12);`
`Pre immed16`	Set PREfix to top 4 bits of `immed16`	`pre=unsigned(immed16)>>12;`
`Push immed16`	Push	`d=sp+1; ++sp; reg[d]=prefix(sign_extend(immed12));`
`Put immed12`	Put into stack	`d=sp-unsigned(immed12); s=sp; reg[d]=reg[s];`
`Ret`	Return (indirect jump)	`s=sp; --sp; pc=reg[s];`
`Store`	Store to memory	`d=sp-1; s=sp; --sp; mem[reg[d]]=reg[s]; reg[d]=reg[s];`
`Sub`	Subtract	`d=sp-1; s=sp; --sp; reg[d]-=reg[s];`
`Sys`	SYStem call	this does a bunch of things...
`Test`	Test	`s=sp; --sp; torf = (reg[s] != 0);`
`Xor`	Bitwise XOR	`d=sp-1; s=sp; --sp; reg[d]^=reg[s];`
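
To make the Functionality column easier to read, here is a minimal Python sketch that executes a few of the rows above on a toy register stack. The class name `SikStack`, the stack depth of 64, and the choice of `sp == 0` as the empty state are assumptions made for illustration only; the table does not specify them, and prefix handling for Push is left out.

```python
MASK = 0xFFFF  # SIK data words are 16 bits wide


class SikStack:
    """Toy model of the hidden register stack (illustrative, not the course's code)."""

    def __init__(self, depth=64):      # depth is an assumed value
        self.reg = [0] * depth
        self.sp = 0                    # assumed: sp == 0 means nothing pushed yet
        self.torf = False

    def push(self, value):             # Push: d=sp+1; ++sp; reg[d]=value
        d = self.sp + 1
        self.sp += 1
        self.reg[d] = value & MASK

    def dup(self):                     # Dup: d=sp+1; s=sp; ++sp; reg[d]=reg[s]
        d, s = self.sp + 1, self.sp
        self.sp += 1
        self.reg[d] = self.reg[s]

    def add(self):                     # Add: d=sp-1; s=sp; --sp; reg[d]+=reg[s]
        d, s = self.sp - 1, self.sp
        self.sp -= 1
        self.reg[d] = (self.reg[d] + self.reg[s]) & MASK

    def sub(self):                     # Sub: d=sp-1; s=sp; --sp; reg[d]-=reg[s]
        d, s = self.sp - 1, self.sp
        self.sp -= 1
        self.reg[d] = (self.reg[d] - self.reg[s]) & MASK

    def test(self):                    # Test: s=sp; --sp; torf=(reg[s]!=0)
        s = self.sp
        self.sp -= 1
        self.torf = self.reg[s] != 0
```

For example, pushing 5 and 7 and then calling `add()` leaves 12 on top of the stack; following that with `dup()`, `sub()`, and `test()` leaves the stack empty with `torf` false.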

A few details about the above:

Each instruction must be encoded in a single 16-bit word
The sp above is not user visible and refers to an internal register stack; don't confuse it with any stack that might be built in memory by the compiler
The pre and torf registers are not user visible
The immed16 instructions should be pseudo-variable length, using the Pre prefix instruction to specify how the top 4 bits are set. For example, suppose the assembler was given the instruction: JumpF 8197. The decimal value 8197 is binary 0010000000000101. If the pc happens to look like 0010xxxxxxxxxxxx, then JumpF 8197 can be encoded as a single 16-bit instruction with 5 in the 12-bit immediate field. However, if the top 4 bits of the pc don't look like what we want, JumpF 8197 would end up as two 16-bit instructions, with a Pre 8197 inserted before the JumpF. Of course, Pre 8197 doesn't need to remember the whole 16-bit immediate, but just the top 4 bits, so the effect would be pre=2.
prefix(x) means: if there was a Pre instruction with an unused pre value before this, {pre, (x & 0x0fff)}, else x
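
The JumpF 8197 example can be checked with a short Python sketch of the pre and prefix() behavior described above. The function names `pre_value` and `jump_target`, and the use of `None` to mean "no unused pre value is pending", are assumptions made for this illustration; only the bit manipulation comes from the descriptions above.

```python
def pre_value(immed16):
    """Pre keeps only the top 4 bits of its 16-bit operand."""
    return (immed16 & 0xFFFF) >> 12


def jump_target(pc, immed12, pre=None):
    """Target of a jump whose 12-bit immediate field holds immed12.

    pre is None when no unused Pre value is pending (an assumed convention).
    """
    x = ((pc >> 12) << 12) | (immed12 & 0x0FFF)    # {(pc>>12), immed12}
    if pre is None:
        return x                                   # prefix(x) is just x
    return (pre << 12) | (x & 0x0FFF)              # {pre, (x & 0x0fff)}


# pc already in the 0010... range: one word is enough.
print(jump_target(0x2100, 5))                      # 8197

# pc elsewhere: Pre 8197 sets pre=2, which supplies the top 4 bits.
print(jump_target(0x0003, 5, pre=pre_value(8197)))  # 8197
```

Without the Pre, the second case would jump to address 5 instead, because the top 4 bits would come from the pc.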

In case you were wondering, yes, the Pre instruction is downright odd. However, other types of prefix instruction encodings have been used by companies like Intel for decades. The logic is pretty simple: using this prefix form keeps all instructions fixed length, yet allows multiple different types of instructions to have variable-length immediate values with the opcode in the last word. I'm not saying this is a great thing, but it's reasonable, and you certainly will not find a Verilog implementation of anything like it on the WWW. ;-)

So, are you expected to write Pre instructions? Well, maybe for testing, but generally no. You're expected to handle this automatically in the AIK-generated assembler. For example, the AIK specification of JumpF will require two rules that look something like:

JumpF .immed ?((. & 0xf000) == (.immed & 0xf000)) := 16-bit JumpF encoding
JumpF .immed := 16-bit Pre then 16-bit JumpF encoding
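
The decision those two rules make can be sketched in Python as follows. This is only an illustration of the logic, not AIK syntax: it assumes that `.` in the rule is the address where the instruction is being assembled, the function name `assemble_jumpf` is hypothetical, and the output is symbolic (mnemonic, field value) pairs rather than bit patterns, since the encoding is yours to design.

```python
def assemble_jumpf(here, target):
    """Symbolic words for 'JumpF target' assembled at address here."""
    low12 = target & 0x0FFF
    if (here & 0xF000) == (target & 0xF000):
        # First rule: top 4 bits already match, so one word suffices.
        return [("JumpF", low12)]
    # Second rule: emit a Pre carrying the top 4 bits, then the JumpF.
    return [("Pre", (target & 0xFFFF) >> 12), ("JumpF", low12)]
```

With this sketch, `assemble_jumpf(0x2010, 8197)` yields a single JumpF with 5 in its immediate field, while `assemble_jumpf(0x0010, 8197)` yields a Pre with value 2 followed by the same JumpF.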

Disappointed that I didn't show you the bit pattern encoding? Well, of course I didn't! Determining how to encode the above instructions as bit patterns is a key part of your project. By now you also should have noticed that there are 21 different types of instructions, but a 16-bit instruction format leaves only 4 bits for an opcode field after reserving space for a 12-bit immediate value -- that's deliberate. You'll figure it out. ;-)

The SIK Registers

There are none. You heard me. Well, ok, actually there are lots... but they aren't programmer visible. The pc, sp, pre, and torf registers are all hidden from view, as is a huge array of registers that make up the stack. Anyway, they're not visible in the assembly language so there are no register names to deal with....

Your Project

Your project is simply to design the instruction set encoding and implement an assembler using AIK. Here's a simple test case:

	.text
	.origin 0
lab:	Add
	And
	Call	lab
	Dup
	Get	2
	JumpF	lab
	Jump	lab
	JumpT	lab
	Load
	Lt
	Or
	Pop	1
	Pre	8197
	Push	lab3
	Put	2
	Ret
	.data
	.origin 0x8000
lab3:	.word lab2
	.text	; pick-up where we were
	Store
	Sub
	Sys
	Test
	Xor
lab2:

Obviously, I can't show you sample output without giving away how I've encoded the instructions.

Due Dates

The recommended due date for this assignment is before class, Wednesday, February 15, 2017. This submission window will close when class begins on Monday, February 20, 2017. You may submit as many times as you wish, but only the last submission that you make before class begins on Monday, February 20, 2017 will be counted toward your course grade.

Note that you can ensure that you get at least half credit for this project by simply submitting a tar of an "implementor's notes" document explaining that your project doesn't work because you have not done it yet. Given that, perhaps you should start by immediately making and submitting your implementor's notes document? (I would!)

Submission Procedure

For each project, you will be submitting a tarball (i.e., a file with the name ending in .tar or .tgz) that contains all things relevant to your work on the project. Minimally, each project tarball includes the source code for the project and a semi-formal "implementor's notes" document as a PDF named notes.pdf. It also may include test cases, sample output, a Makefile, etc., but should not include any files that are built by your Makefile (e.g., no binary executables). For this particular project, name the AIK source file sik.aik.

Submit your tarball below. The file can be either an ordinary .tar file created using tar cvf file.tar yourprojectfiles or a compressed .tgz file created using tar zcvf file.tgz yourprojectfiles. Be careful about using * as a shorthand in listing yourprojectfiles on the command line, because if the output tar file is listed in the expansion, the result can be an infinite file (which is not ok).

Advanced Computer Architecture.