# VM

This is an outline of the VM that drives this language.

# Primitives

* Numbers may be big endian (BE) or little endian (LE) at the byte level. This guide will use LE.
* Addresses point to single bytes.
* Signed numbers use two's complement.

| Type | Size (bits) |
| - | - |
| Address | 64 |
| Word | 64 |
| Halfword | 32 |
| Byte | 8 |

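
As a minimal sketch of these conventions, the following Python snippet writes a 64-bit word in LE byte order and reads a two's complement value back. The helper names are hypothetical and only serve as illustration; they are not part of the VM.

```python
import struct

def word_to_le_bytes(value: int) -> bytes:
    """Encode an unsigned 64-bit word as 8 little-endian bytes."""
    return struct.pack("<Q", value)

def le_bytes_to_signed(data: bytes) -> int:
    """Interpret 8 little-endian bytes as a two's complement signed word."""
    return struct.unpack("<q", data)[0]

# The least significant byte (0x08) comes first in LE order.
encoded = word_to_le_bytes(0x0102030405060708)
print(encoded.hex())  # 0807060504030201

# A word with all bits set is -1 in two's complement.
print(le_bytes_to_signed(b"\xff" * 8))  # -1
```
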
# Registers

CPU registers are addressed by a value from 0 to 63 (6 bits). All registers are 64 bits wide.

* IP - Instruction pointer
* SP - Stack pointer
* FP - Frame pointer
* FLAGS - CPU flags
* (9 unused registers)
* STATUS - Generic status code
* R0-R49

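
The register list adds up to 64 entries (4 + 9 + 1 + 50), which matches the 6-bit index. The sketch below maps names to indices under the assumption that registers are numbered in the order listed; the outline does not state the numbering, so treat the concrete values as hypothetical.

```python
# Assumption: registers are numbered in the order they are listed.
SPECIAL = {"IP": 0, "SP": 1, "FP": 2, "FLAGS": 3, "STATUS": 13}
FIRST_GENERAL = 14  # index of R0; indices 4-12 are the nine unused slots

def register_index(name: str) -> int:
    """Return the (assumed) 6-bit index for a register name."""
    if name in SPECIAL:
        return SPECIAL[name]
    if name.startswith("R") and name[1:].isdigit():
        number = int(name[1:])
        if 0 <= number <= 49:
            return FIRST_GENERAL + number
    raise ValueError(f"unknown register: {name}")
```

Under this numbering, `register_index("R49")` is 63, the largest value that fits in 6 bits.
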
## CPU Flags

CPU flags are addressed by bit index, going from right to left (bit 0 is the least significant
bit).

* `00` - Halt flag
* `01` - Compare flag

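
A minimal sketch of reading and writing individual flags, assuming "right to left" means that bit 0 is the least significant bit of the FLAGS register. The function and constant names are hypothetical.

```python
HALT_BIT = 0     # `00` - Halt flag
COMPARE_BIT = 1  # `01` - Compare flag

def get_flag(flags: int, bit: int) -> int:
    """Return the flag at the given bit index as 0 or 1."""
    return (flags >> bit) & 1

def set_flag(flags: int, bit: int, value: int) -> int:
    """Return a copy of flags with the given bit set or cleared."""
    if value:
        return flags | (1 << bit)
    return flags & ~(1 << bit)
```

For example, `set_flag(0, COMPARE_BIT, 1)` yields `0b10`, and clearing the same bit again yields `0`.
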
### Flag ideas

* "Trace" flag - halts the CPU when certain conditions are met that may be causing undesired
  behavior - for debugging
    * Overwriting a register without its value being used
    * Mixing arithmetic with bit twiddling on the same target

## Register ideas

* NULL - a register that always reads as zero and is not changed by writes.
    * Other possible names: Z, NIL

# Instructions

Instructions attempt to be as small as possible while conforming to 8-bit, 16-bit, 32-bit, or 64-bit
alignment. All instructions have 16-bit opcodes.

## Arithmetic

Arithmetic instructions store their result in the first register specified. Overflow wraps around:
results are truncated to 64 bits (modulo 2^64).

* Add
    * Opcode: 0x0000
    * **Params**: REG1, REG2
    * `REG1 = REG1 + REG2`
    * Unsigned addition
* Mul
    * Opcode: 0x0001
    * **Params**: REG1, REG2
    * `REG1 = REG1 * REG2`
    * Unsigned multiplication
* Div
    * Opcode: 0x0002
    * **Params**: REG1, REG2
    * `REG1 = REG1 / REG2`
    * Unsigned division
* Mod
    * Opcode: 0x0003
    * **Params**: REG1, REG2
    * `REG1 = REG1 % REG2` (exact semantics TBD)
* INeg
    * Opcode: 0x0004
    * **Params**: REG1
    * `REG1 = REG1 * -1`
    * Signed negation
* And
    * Opcode: 0x0005
    * **Params**: REG1, REG2
    * `REG1 = REG1 & REG2`
* Or
    * Opcode: 0x0006
    * **Params**: REG1, REG2
    * `REG1 = REG1 | REG2`
* Inv
    * Opcode: 0x0007
    * **Params**: REG1
    * `REG1 = ~REG1`
* Not
    * Opcode: 0x0008
    * **Params**: REG1
    * ```
      if REG1 == 0 {
          REG1 = 1;
      } else {
          REG1 = 0;
      }
      ```
    * Boolean NOT; equivalent of C's `!` unary operator
* Xor
    * Opcode: 0x0009
    * **Params**: REG1, REG2
    * `REG1 = REG1 ^ REG2`
* Shl
    * Opcode: 0x000A
    * **Params**: REG1, REG2
    * `REG1 = REG1 << REG2`
* Shr
    * Opcode: 0x000B
    * **Params**: REG1, REG2
    * `REG1 = REG1 >> REG2`
    * Does not sign extend

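
The wrapping behavior can be sketched in Python by masking every result to 64 bits. This is an illustration under stated assumptions rather than the VM's implementation: the function names are hypothetical, and the behavior for shift amounts of 64 or more is not covered by the outline, so the sketch only claims to be meaningful for shifts below 64.

```python
MASK64 = (1 << 64) - 1  # all 64 register bits set

def add(a: int, b: int) -> int:
    """Unsigned addition that wraps around on overflow."""
    return (a + b) & MASK64

def mul(a: int, b: int) -> int:
    """Unsigned multiplication that wraps around on overflow."""
    return (a * b) & MASK64

def ineg(a: int) -> int:
    """Two's complement negation of a 64-bit value."""
    return (-a) & MASK64

def shr(a: int, amount: int) -> int:
    """Right shift that fills with zeros (no sign extension)."""
    return (a & MASK64) >> amount
```

With these definitions, `add(MASK64, 1)` wraps to `0`, `ineg(1)` gives the all-ones pattern that represents -1, and `shr(1 << 63, 63)` gives `1` because the top bit is not copied into the vacated positions.
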
### TODO

* Add signed instructions (iadd, imul, etc.)
* Sign-extending SHR
* Overflow flag?

## Control flow

* CmpEq
    * Opcode: 0x1000
    * **Params**: REG1, REG2
    * ```
      if REG1 == REG2 {
          FLAGS[1] = 1;
      } else {
          FLAGS[1] = 0;
      }
      ```
    * Sets the COMPARE flag to 1 if REG1 == REG2, and to 0 otherwise.
* CmpLt
    * Opcode: 0x1001
    * **Params**: REG1, REG2
    * ```
      if REG1 < REG2 {
          FLAGS[1] = 1;
      } else {
          FLAGS[1] = 0;
      }
      ```
    * Sets the COMPARE flag to 1 if REG1 < REG2, and to 0 otherwise.
* Jz
    * Opcode: 0x1100
    * **Params**: REG1
    * ```
      if FLAGS[1] == 0 {
          IP = REG1;
      }
      ```
    * Jumps to the address in REG1 if the COMPARE flag is 0.
* Jnz
    * Opcode: 0x1101
    * **Params**: REG1
    * ```
      if FLAGS[1] != 0 {
          IP = REG1;
      }
      ```
    * Jumps to the address in REG1 if the COMPARE flag is 1.

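
The compare-then-jump pattern can be sketched as follows. The register file is modeled as a plain dictionary, which is purely an illustrative choice, and the comparison in `cmp_lt` is assumed to be unsigned because the outline does not say otherwise.

```python
COMPARE_BIT = 1  # index of the Compare flag in FLAGS

def set_compare(regs: dict, condition: bool) -> None:
    """Set or clear the Compare flag according to the condition."""
    if condition:
        regs["FLAGS"] |= 1 << COMPARE_BIT
    else:
        regs["FLAGS"] &= ~(1 << COMPARE_BIT)

def cmp_eq(regs: dict, reg1: str, reg2: str) -> None:
    set_compare(regs, regs[reg1] == regs[reg2])

def cmp_lt(regs: dict, reg1: str, reg2: str) -> None:
    # Assumption: values are compared as unsigned integers.
    set_compare(regs, regs[reg1] < regs[reg2])

def jz(regs: dict, reg1: str) -> None:
    """Jump to the address in reg1 if the Compare flag is 0."""
    if ((regs["FLAGS"] >> COMPARE_BIT) & 1) == 0:
        regs["IP"] = regs[reg1]

def jnz(regs: dict, reg1: str) -> None:
    """Jump to the address in reg1 if the Compare flag is 1."""
    if ((regs["FLAGS"] >> COMPARE_BIT) & 1) != 0:
        regs["IP"] = regs[reg1]
```

For example, with `R0` and `R1` both holding 5 and `R2` holding `0x100`, `cmp_eq(regs, "R0", "R1")` followed by `jnz(regs, "R2")` sets `IP` to `0x100`, while `jz(regs, "R2")` would leave `IP` unchanged.
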
## Data movement

* Load
    * Opcode: 0x2000
    * **Params**: REG1, REG2
    * ```
      REG1 = MEM[REG2];
      ```
    * Sets REG1 to the value at the memory address in REG2.
    * ```
      32                 16       10       4      0
         opcode             reg1     reg2     unused
        /                  /        /        /
      +-------------------------------------------+
      | 0010000000000000 | REG1.. | REG2.. | XXXX |
      +-------------------------------------------+
      ```
* RegCopy
    * Opcode: 0x2001
    * **Params**: REG1, REG2
    * `REG1 = REG2`
    * Copies the value in REG2 into REG1.
    * ```
      32                 16       10       4      0
         opcode             reg1     reg2     unused
        /                  /        /        /
      +-------------------------------------------+
      | 0010000000000001 | REG1.. | REG2.. | XXXX |
      +-------------------------------------------+
      ```
* StoreImm64
    * Opcode: 0x2100
    * **Params**: REG1, IMM_64
    * `REG1 = IMM_64`
    * Sets REG1 to the specified 64-bit number.
* StoreImm32
    * Opcode: 0x2101
    * **Params**: REG1, IMM_32
    * `REG1 = IMM_32`
    * Sets REG1 to the specified 32-bit number.
    * ```
      64                 48       42       36     32                                 0
          opcode             reg1     reg2     unused
         /                  /        /        /        immediate 32 bit value
        /                  /        /        /        /
      +------------------------------------------------------------------------------+
      | 0010000100000001 | REG1.. | REG2.. | XXXX | IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII |
      +------------------------------------------------------------------------------+
      ```
* MemCopy
    * Opcode: 0x2200
    * **Params**: REG1, REG2
    * `MEM[REG1] = MEM[REG2]`
    * Copies the value at the memory address in REG2 to the memory address in REG1.
    * ```
      32                 16       10       4      0
         opcode             reg1     reg2     unused
        /                  /        /        /
      +-------------------------------------------+
      | 0010001000000000 | REG1.. | REG2.. | XXXX |
      +-------------------------------------------+
      ```
* Store
    * Opcode: 0x2201
    * **Params**: REG1, REG2
    * ```
      MEM[REG2] = REG1;
      ```
    * Sets the value at the memory address in REG2 to the value in REG1.
    * ```
      32                 16       10       4      0
         opcode             reg1     reg2     unused
        /                  /        /        /
      +-------------------------------------------+
      | 0010001000000001 | REG1.. | REG2.. | XXXX |
      +-------------------------------------------+
      ```

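
The bit layouts in the diagrams above can be sketched as shift-and-mask operations. The function names are hypothetical, and the sketch only covers the two layouts that are drawn: the 32-bit register/register form and the 64-bit form with a 32-bit immediate. How the resulting integer is laid out in memory (byte order of the instruction stream) is not specified here and is left out.

```python
def encode_reg_reg(opcode: int, reg1: int, reg2: int) -> int:
    """Pack opcode (16 bits), reg1 (6), reg2 (6) and 4 unused bits into 32 bits."""
    if not (0 <= opcode <= 0xFFFF and 0 <= reg1 <= 63 and 0 <= reg2 <= 63):
        raise ValueError("field out of range")
    return (opcode << 16) | (reg1 << 10) | (reg2 << 4)

def decode_reg_reg(word: int) -> tuple:
    """Split a 32-bit instruction into (opcode, reg1, reg2)."""
    return (word >> 16) & 0xFFFF, (word >> 10) & 0x3F, (word >> 4) & 0x3F

def encode_reg_imm32(opcode: int, reg1: int, imm32: int) -> int:
    """Pack the 32-bit head followed by a 32-bit immediate into 64 bits."""
    head = encode_reg_reg(opcode, reg1, 0)  # the reg2 field is left as zero
    return (head << 32) | (imm32 & 0xFFFFFFFF)

# RegCopy (0x2001) with register indices 1 and 2.
print(hex(encode_reg_reg(0x2001, 1, 2)))  # 0x20010420
```

Decoding reverses the same shifts, so `decode_reg_reg(0x20010420)` returns `(0x2001, 1, 2)`.
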
## Miscellaneous

* Halt
    * Opcode: 0xF000
    * **Params**: (none)
    * `FLAGS[0] = 1`
    * Halts the machine
    * ```
      16
         opcode
        /
      +------------------+
      | 1111000000000000 |
      +------------------+
      ```
* Nop
    * Opcode: 0xF001
    * **Params**: (none)
    * Does nothing
    * ```
      16
         opcode
        /
      +------------------+
      | 1111000000000001 |
      +------------------+
      ```

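
A minimal sketch of a fetch-execute loop that understands only Halt and Nop. It assumes that opcodes are fetched as 16-bit LE values and that IP advances by two bytes right after the fetch; the outline leaves the exact timing of the IP increment open, so both points are assumptions, as is the function name.

```python
import struct

HALT = 0xF000
NOP = 0xF001
HALT_BIT = 0  # index of the Halt flag in FLAGS

def run(memory: bytes, ip: int = 0) -> tuple:
    """Execute until the Halt flag is set; return the final (ip, flags)."""
    flags = 0
    while ((flags >> HALT_BIT) & 1) == 0:
        (opcode,) = struct.unpack_from("<H", memory, ip)
        ip += 2  # assumption: IP moves past the opcode before execution
        if opcode == HALT:
            flags |= 1 << HALT_BIT
        elif opcode == NOP:
            pass
        else:
            raise ValueError(f"unsupported opcode: {opcode:#06x}")
    return ip, flags

program = struct.pack("<HHH", NOP, NOP, HALT)
print(run(program))  # (6, 1)
```
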
## Other instructions TODO

* Call
    * Takes address and number of bytes on the stack that are for args(?)
    * Updates SP, FP, IP, storing previous values starting at the new FP
* Ret
    * Uses FP to determine previous SP, FP, and IP and restores them
* Push
* Pop
* More immediate stores?
    * Idea: Store42 (or whatever number of bits) that maximizes the usage of a 64-bit instruction

# Binary object format

The binary object format is composed of a header followed by sections that make up the content of
the object.

## Header

The header is composed of:

* 64 bits - A magic number (0xDEAD_BEA7_BA5E_BA11).
* 16 bits - Version of the file
* 16 bits - The number of sections in the file
* 32 bits - Unused
* section descriptions detailed below

Total length: 128 bits

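
The fixed 128-bit part of the header can be read with Python's `struct` module, as sketched below. The sketch assumes the fields are stored in LE byte order in the order listed; `parse_header` is a hypothetical helper name.

```python
import struct

MAGIC = 0xDEAD_BEA7_BA5E_BA11
# magic (64 bits), version (16), section count (16), unused (32)
HEADER = struct.Struct("<QHHI")

def parse_header(data: bytes) -> tuple:
    """Return (version, section_count) from the first 16 bytes of an object."""
    magic, version, section_count, _unused = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number")
    return version, section_count
```

`HEADER.size` is 16 bytes, which agrees with the stated total of 128 bits.
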
## Sections

The rest of the object is a list of sections. A section's layout is a section header, followed by
the section contents.

### Section header

* 8 bits - Section kind
    * 0x00 - Data
    * 0x10 - Code
    * 0xFF - Meta
* 24 bits - Unused
* 32 bits - Checksum of the section
* 64 bits - Length of the section

Total length: 128 bits

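
A section header can be unpacked in the same style. This sketch assumes LE byte order and skips the 24 unused bits as padding; it does not verify the checksum, because the checksum algorithm is not specified here. The names are hypothetical.

```python
import struct

# kind (8 bits), 24 unused bits, checksum (32), length (64)
SECTION_HEADER = struct.Struct("<B3xIQ")
SECTION_KINDS = {0x00: "data", 0x10: "code", 0xFF: "meta"}

def parse_section_header(data: bytes, offset: int = 0) -> tuple:
    """Return (kind_name, checksum, length) for the header at the given offset."""
    kind, checksum, length = SECTION_HEADER.unpack_from(data, offset)
    if kind not in SECTION_KINDS:
        raise ValueError(f"unknown section kind: {kind:#04x}")
    return SECTION_KINDS[kind], checksum, length
```

`SECTION_HEADER.size` is 16 bytes, matching the stated 128 bits.
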
### Data section

The data section contains static data that is initialized to some known value.

* 64 bits - load location - where in memory the contents of this section are put.

### Code section

The code section contains executable code.

* 64 bits - load location - where in memory the contents of this section are put.

The remaining length of the section is the code itself.

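
Loading a section could look like the sketch below. It assumes the load location is a 64-bit LE value at the start of the section contents and that everything after it is copied verbatim; for data sections the second point is an assumption, since only the code section states it explicitly. `load_section` is a hypothetical name.

```python
import struct

def load_section(memory: bytearray, contents: bytes) -> int:
    """Copy a data or code section body into memory; return the load address."""
    (load_addr,) = struct.unpack_from("<Q", contents, 0)
    payload = contents[8:]
    if load_addr + len(payload) > len(memory):
        raise ValueError("section does not fit in memory")
    memory[load_addr:load_addr + len(payload)] = payload
    return load_addr
```

For example, loading a section whose load location is 4 and whose payload is the bytes `01 02` into a zeroed 16-byte memory leaves those two bytes at addresses 4 and 5.
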
### Meta section

The meta section holds a table of metadata about the binary in a key-value format, mapping string
keys to values. All strings are UTF-8 encoded.

* 64 bits - the number of key-value entries

The remainder of the section is the key-value pairs.

The layout for a key-value pair is the key, followed immediately by the value. The key is always a
string, and the value may be any type of data. A key starts with the length of the string, followed
by the key string itself. A value starts with the length of the data, followed by the value data
itself.

The meta section should be used to place data that's readable by the VM, but is not used by the
executing program. Data in the meta section is not copied to the program memory.

A VM must provide support for the following meta-values:

* `entry` - a 64-bit address for where the VM should begin executing code.

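
The key-value layout can be sketched as a simple length-prefixed parser. The width of the length fields is not given above, so this sketch assumes 64-bit LE byte counts for both keys and values, and it assumes the `entry` value is stored as 8 LE bytes. Both function names are hypothetical.

```python
import struct

def parse_meta(contents: bytes) -> dict:
    """Parse meta section contents into a dict of str keys to bytes values."""
    (count,) = struct.unpack_from("<Q", contents, 0)
    offset = 8
    table = {}
    for _ in range(count):
        # Assumption: each length prefix is a 64-bit LE byte count.
        (key_len,) = struct.unpack_from("<Q", contents, offset)
        offset += 8
        key = contents[offset:offset + key_len].decode("utf-8")
        offset += key_len
        (value_len,) = struct.unpack_from("<Q", contents, offset)
        offset += 8
        table[key] = contents[offset:offset + value_len]
        offset += value_len
    return table

def entry_point(table: dict) -> int:
    """Decode the `entry` meta-value as a 64-bit LE address."""
    return struct.unpack("<Q", table["entry"])[0]
```

Keeping values as raw bytes matches the statement that a value may be any type of data; interpreting them is left to whoever knows the key.
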
# General TODO

* Interrupts
* MMIO regions
* Execution pipeline
    * Helps to define when certain side effects happen (e.g. when the IP increments)
* Paging?