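A Simple CPU Simulator

Last updated in the early hours of June 17, 2024

Over the past few days I have been reading some computer architecture papers and wanted to revisit what I learned in my undergraduate microcomputer principles course. I figured I might as well simulate the CPU's fetch, decode, and execute stages in a simple language, so I wrote a simple CPU simulator in Python, modeled on this project.

Project repository: https://github.com/Lincest/simple-cpu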

Instruction set and pseudo-operations

A few very simple, commonly used instructions are defined as follows:

instruction  usage         description
nop          nop           no operation
load         load r1 r2    load the data at the address held in r2 into r1
movi         movi r1 1234  move the immediate 1234 into r1
store        store r1 r2   store r2's data to the address held in r1
inc          inc r1        r1's data += 1
cmpi         cmpi r1 1234  compare r1's data with 1234
jnz          jnz 0x12      jump to offset 0x12 (addr: current_addr + offset 0x12) if not zero
halt         halt          pause the CPU
add          add r1, 4     r1's data += 4

Some pseudo-operations are also defined, mainly to support relocation through labels:

pseudo-op  usage             description
label      label <name>      label definition
jnzl       jnzl <name>       jump to label
movil      movil r1 <label>  save the label's address to r1
data       data <number>     space holding immediate numbers

Assembler

Code: https://github.com/Lincest/simple-cpu/blob/master/assembler/assembler.py

The main logic is outlined here:

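  • First, the assembler opens the source file xx.s and reads the assembly code line by line
  • Each line is dispatched by instruction to its handler function (resolver), which converts it to machine code
  • Once all lines are processed, relocation is performed: depending on the label's value, absolute addressing (movi) is distinguished from relative addressing (jnz), and the instructions that contain labels are filled in again
  • Finally, the machine code is written to the output file in little-endian byte order and handed to the CPU for processing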
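
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in assembler.py. The 4-byte layout (opcode, register, 16-bit immediate) is inferred from the movi diagram in this post. The opcode numbers other than movi = 0x01, the register numbering beyond r0 = 0x11, and the choice to measure the jnz offset from the jnz instruction's own address are all assumptions made for the example.

```python
import struct

# Hypothetical opcode numbers; only movi = 0x01 comes from the post's example.
OPCODES = {"movi": 0x01, "jnz": 0x02, "halt": 0x03}


def reg(name):
    # Assumed register encoding: r0 -> 0x11 (as in the post), r1 -> 0x12, ...
    return 0x11 + int(name[1:])


def encode(op, r=0, imm=0):
    # "<" = little-endian, B = one byte, H = unsigned 16-bit field
    return struct.pack("<BBH", OPCODES[op], r, imm & 0xFFFF)


def assemble(source):
    code = bytearray()
    labels = {}   # label name -> address
    fixups = []   # (instruction address, "abs" or "rel", label name)
    for raw in source.splitlines():
        line = raw.split("#")[0].replace(",", " ").split()
        if not line:
            continue
        op, args = line[0], line[1:]
        addr = len(code)
        if op == "label":
            labels[args[0]] = addr
        elif op == "data":
            for num in args:
                code += struct.pack("<i", int(num, 0))  # 32-bit signed word
        elif op == "movi":
            code += encode("movi", reg(args[0]), int(args[1], 0))
        elif op == "movil":   # becomes movi with an absolute address
            fixups.append((addr, "abs", args[1]))
            code += encode("movi", reg(args[0]))
        elif op == "jnzl":    # becomes jnz with a relative offset
            fixups.append((addr, "rel", args[0]))
            code += encode("jnz")
        elif op == "halt":
            code += encode("halt")
        else:
            raise ValueError(f"unknown instruction: {op}")
    # Relocation: every label is known now, so patch the immediates.
    for addr, kind, name in fixups:
        target = labels[name]
        value = target if kind == "abs" else target - addr
        struct.pack_into("<H", code, addr + 2, value & 0xFFFF)
    return bytes(code)


program = """
movil r2, data_addr
label loop
movi r1, 0xFFFE
jnzl loop
halt
label data_addr
data -55 0x1234
"""
image = assemble(program)
print(image[:4].hex(" "))    # 01 13 10 00  (data_addr = 16)
print(image[8:12].hex(" "))  # 02 00 fc ff  (offset -4 back to loop)
```

Deferring labelled instructions to a fix-up list lets a label be used before it is defined, which is what movil r2, data_addr needs in the example program later in this post.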

A few points to note

  • The simulated byte order is little-endian, and memory is a 16 KB array. The code sits at the front, with no explicit separation between code and data segments. Take the instruction movi r0, 0xFFFE as an example: if movi is encoded as 0x01 and r0 as 0x11, it is stored in memory like this:

    +------+------+------+------+
    | 0x01 | 0x11 | 0xFE | 0xFF |
    +------+------+------+------+

    ---------------------------->
    low memory       high memory
  • jnzl label uses a relative offset, whereas movil label uses an absolute offset
  • Registers are 32 bits wide and can hold signed numbers, so data ranges from -2147483648 to 2147483647 (0x80000000 to 0x7FFFFFFF)
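
The byte layout and the signed range described above can be checked with Python's struct module. This is only a small sketch: the 0x01 and 0x11 encodings are the ones used in the example above, and to_signed32 is a hypothetical helper introduced here, not a function from the project.

```python
import struct

# movi r0, 0xFFFE with movi -> 0x01 and r0 -> 0x11.
# "<" selects little-endian; B is one byte, H is an unsigned 16-bit field.
encoded = struct.pack("<BBH", 0x01, 0x11, 0xFFFE)
print(encoded.hex(" "))  # 01 11 fe ff


def to_signed32(raw):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    raw &= 0xFFFFFFFF
    return raw - 0x1_0000_0000 if raw & 0x8000_0000 else raw


print(to_signed32(0x80000000))  # -2147483648
print(to_signed32(0x7FFFFFFF))  # 2147483647

# A negative 32-bit value such as -55 is stored low byte first as well.
print(struct.pack("<i", -55).hex(" "))  # c9 ff ff ff
```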

CPU

The CPU's processing flow is as follows:

  • First, allocate 16 KB of memory
  • Load the program into the low-address region
  • Fetch, decode, execute
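
A minimal sketch of these three steps is shown below. It is not the project's CPU code. The opcode numbers other than movi = 0x01 are hypothetical, registers are encoded as their plain index for brevity, and the jnz offset is assumed to be relative to the jnz instruction itself. Keeping the compare result in r15 follows the execution trace later in this post.

```python
import struct

MEM_SIZE = 16 * 1024
# Hypothetical opcodes; only movi = 0x01 appears in the post.
MOVI, INC, CMPI, JNZ, HALT = 0x01, 0x02, 0x03, 0x04, 0x05


def run(program):
    memory = bytearray(MEM_SIZE)       # 1. allocate 16 KB
    memory[:len(program)] = program    # 2. load the program at low addresses
    regs = [0] * 16                    # r15 holds the compare result
    pc = 0
    while True:                        # 3. fetch, decode, execute
        # Fetch 4 bytes and decode them: opcode, register, signed 16-bit immediate.
        opcode, r, imm = struct.unpack_from("<BBh", memory, pc)
        next_pc = pc + 4
        if opcode == MOVI:
            regs[r] = imm
        elif opcode == INC:
            regs[r] += 1
        elif opcode == CMPI:
            regs[15] = 0 if regs[r] == imm else 1
        elif opcode == JNZ:
            if regs[15] != 0:
                next_pc = pc + imm     # assumed: offset from the jnz itself
        elif opcode == HALT:
            return regs
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at {pc:#x}")
        pc = next_pc


def ins(opcode, r=0, imm=0):
    """Encode one instruction in the assumed 4-byte little-endian layout."""
    return struct.pack("<BBh", opcode, r, imm)


# Count r1 up to 3: movi r1, 0 / inc r1 / cmpi r1, 3 / jnz -8 / halt
program = b"".join([
    ins(MOVI, 1, 0),
    ins(INC, 1),
    ins(CMPI, 1, 3),
    ins(JNZ, imm=-8),
    ins(HALT),
])
regs = run(program)
print(regs[1], regs[15])  # 3 0
```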

Example

A simple program:

movi r1, 0
movi r3, 4096
movil r2, data_addr

label loop
# load memory[r2] to r4
load r4, r2
# store r4's value to memory[r3]
store r3, r4
# load memory[r3] to r5
load r5, r3
# r2 += 4 to fetch next number
add r2, 4
# compare r5's value with number -0x1234
cmpi r5, -0x1234
# jump if not equal
jnzl loop
# program ends
halt

label data_addr
data -55 -30 40
data 0x1234 0x123456ff
data -0x1234

The program executes as follows:

memory size = 16384B
mov imme = 0 to r1
mov imme = 4096 to r3
mov imme = 40 to r2
load from memory[r2 = 0x28] = 0x-37 (-55) -> r4
store from r4 = 0x-37 (-55) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x-37 (-55) -> r5
r2 += 4 = 44
compare r5 = -55 with -4660, now r15 = 0x1
load from memory[r2 = 0x2c] = 0x-1e (-30) -> r4
store from r4 = 0x-1e (-30) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x-1e (-30) -> r5
r2 += 4 = 48
compare r5 = -30 with -4660, now r15 = 0x1
load from memory[r2 = 0x30] = 0x28 (40) -> r4
store from r4 = 0x28 (40) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x28 (40) -> r5
r2 += 4 = 52
compare r5 = 40 with -4660, now r15 = 0x1
load from memory[r2 = 0x34] = 0x1234 (4660) -> r4
store from r4 = 0x1234 (4660) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x1234 (4660) -> r5
r2 += 4 = 56
compare r5 = 4660 with -4660, now r15 = 0x1
load from memory[r2 = 0x38] = 0x123456ff (305420031) -> r4
store from r4 = 0x123456ff (305420031) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x123456ff (305420031) -> r5
r2 += 4 = 60
compare r5 = 305420031 with -4660, now r15 = 0x1
load from memory[r2 = 0x3c] = 0x-1234 (-4660) -> r4
store from r4 = 0x-1234 (-4660) to memory[r3 = 0x1000]
load from memory[r3 = 0x1000] = 0x-1234 (-4660) -> r5
r2 += 4 = 64
compare r5 = -4660 with -4660, now r15 = 0x0
cpu halt
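
As a cross-check of the trace above, the loop can be written out in plain Python. This is only a rendering of the program's logic, not the simulator. DATA_ADDR = 40 is taken from the trace line "mov imme = 40 to r2", and the list stands in for the words placed after label data_addr.

```python
# The six words stored after "label data_addr" in the example program.
data = [-55, -30, 40, 0x1234, 0x123456FF, -0x1234]
DATA_ADDR = 40  # value of the label as reported by the trace

r2 = DATA_ADDR
iterations = 0
while True:
    r4 = data[(r2 - DATA_ADDR) // 4]  # load r4, r2
    r5 = r4                           # store r3, r4 ; load r5, r3
    r2 += 4                           # add r2, 4
    iterations += 1
    if r5 == -0x1234:                 # cmpi r5, -0x1234 ; jnzl loop
        break

print(iterations, r2)  # 6 64
```

The six iterations and the final r2 = 64 agree with the trace, which halts right after the compare that sets r15 to 0x0.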

A Simple CPU Simulator
https://moreality.net/posts/17279/
Author: Moreality
Published: October 7, 2022