
perilus, Part Three

Bradley Gannon

2026-01-03

TL;DR: perilus can now execute the lw and sw instructions. To get to this point, I needed to add a lot of new functionality, most notably the control unit FSM and the ability to set and inspect memory and register contents in simulation. Now we can start adding support for more instructions in the RV32I set.

Control Unit FSM

My first task this week was to finish the first-pass implementation of the multicycle processor in Harris. The only major remaining element was the control unit, which I’d stubbed out but still needed to fill in. The control unit is the part of the processor that tells all the other parts what to do and when to do it, and it accomplishes this by encoding the necessary operations as a finite state machine (FSM). Chisel makes FSMs pretty easy to implement, which I suppose is intentional because they show up all over the hardware design world.

The Harris processor doesn’t support the entire RV32I instruction set, but the subset that it does support is a good starting point for adding the rest. (The RV32I set itself is also clearly designed in a way that eases hardware implementation.) For now, I’ve built the control unit to exactly match the details of the FSM given in the book. Later on, once we can show that perilus matches that design, we’ll depart from the book and add support for the rest of the set.

At the moment, the FSM begins in the fetch state, which reads the memory word at the address held in the program counter and moves it into the instruction register. It also adds four to the program counter to set up for the next instruction, assuming it doesn’t get overridden by the current one. The fetch state unconditionally transitions to the decode state, which reads the opcode field to determine the next state. For brevity I won’t describe the remaining states, but essentially they form mostly unconditional chains of operations to execute the various supported instructions. I suspect that adding support for the remaining instructions will mostly be a matter of adding more conditions to each of these existing chains.
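To make the shape of those chains concrete, here is a minimal software sketch of the state transitions for just the two load/store paths. It is a plain Python model, not the Chisel code in perilus, and the state names are my own assumption based on how the textbook’s multicycle FSM is usually drawn; only the lw and sw opcodes come from the RV32I encoding.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names; perilus may name or split these differently.
    FETCH = auto()
    DECODE = auto()
    MEM_ADR = auto()
    MEM_READ = auto()
    MEM_WB = auto()
    MEM_WRITE = auto()


OPCODE_LOAD = 0b0000011   # lw
OPCODE_STORE = 0b0100011  # sw


def next_state(state, opcode):
    """Return the state that follows `state` for the given opcode."""
    if state is State.FETCH:
        return State.DECODE  # unconditional
    if state is State.DECODE:
        if opcode in (OPCODE_LOAD, OPCODE_STORE):
            return State.MEM_ADR
        raise ValueError(f"opcode {opcode:#09b} not modelled in this sketch")
    if state is State.MEM_ADR:
        # The only conditional step inside the load/store chains.
        return State.MEM_READ if opcode == OPCODE_LOAD else State.MEM_WRITE
    if state is State.MEM_READ:
        return State.MEM_WB
    # MEM_WB and MEM_WRITE both end the instruction.
    return State.FETCH


def trace(opcode):
    """List the states visited while executing one instruction."""
    state = State.FETCH
    visited = [state]
    while True:
        state = next_state(state, opcode)
        if state is State.FETCH:
            break
        visited.append(state)
    return [s.name for s in visited]


print(trace(OPCODE_LOAD))   # ['FETCH', 'DECODE', 'MEM_ADR', 'MEM_READ', 'MEM_WB']
print(trace(OPCODE_STORE))  # ['FETCH', 'DECODE', 'MEM_ADR', 'MEM_WRITE']
```

In this model, supporting another instruction means adding a branch in the decode step and, where needed, a new chain of states after it.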

I should mention that I replaced all instances of unlabeled bitstrings in the design with ChiselEnums. This has made the code much clearer because the enum variants carry a lot more semantic information locally than just raw binary numbers, which I have to look up in a table to decode. This is a good example of how Chisel improves on SystemVerilog by applying concepts from typical programming languages.
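The same idea can be shown outside of Chisel. The sketch below is a Python analogue, not the ChiselEnum code from perilus: it names a few RV32I major opcodes so that decoding yields a labeled value instead of a raw bit pattern. The opcode values are from the RV32I encoding; the enum and function names are illustrative.

```python
from enum import IntEnum


class Opcode(IntEnum):
    """A few RV32I major opcodes (instruction bits 6..0)."""
    LOAD = 0b0000011
    OP_IMM = 0b0010011
    STORE = 0b0100011
    OP = 0b0110011
    BRANCH = 0b1100011
    JAL = 0b1101111


def opcode_of(instr):
    """Extract the opcode field; raises ValueError for values not listed above."""
    return Opcode(instr & 0x7F)


# Compare the two spellings of the same check:
instr = 0xFFC4A303  # lw x6, -4(x9)
print((instr & 0x7F) == 0b0000011)   # True, but the reader must know the table
print(opcode_of(instr) is Opcode.LOAD)  # True, and self-describing
```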

Exposing Debug IO Ports

Testing the control unit (and later the entire processor) proved somewhat challenging at first because I wasn’t sure how to examine the internal FSM state using the ChiselSim API. A solution to the problem—although probably not the only one—is to create an extra Chisel IO port on the module under test. The trick, though, is to wrap the port in an Option and only return the Some variant when a boolean parameter withDebug is true. The default value of withDebug in the module constructor is false, so normally the debug port is None and does not get translated into SystemVerilog. However, during testing we can explicitly pass withDebug = true and then run foo.io.debug.get to retrieve the debug port. We can put anything we want in the debug port as long as we write some extra lines in the module to keep the debug port updated.

This is admittedly a little cumbersome and seems to be a direct result of Chisel’s design decision to only allow module interactions through explicit IO ports. To be fair, this is how actual hardware really works, at least for integrated circuits. Brady pointed out that it may be possible to annotate an IO port to exclude it from synthesis but keep it available for simulation, similar to how serde allows fields to be skipped. I think this would basically be a first-party implementation of the same feature as the one described above, which would be really nice to have.

When the time came to write the first processor test, I had to nest the debug port definitions by setting one up in the Perilus class that “re-exports” the debug ports from the memory and register file modules. This took some minor acrobatics with the Options to make sure all ports were properly driven in all possible branches.

Notably, the debug ports don’t give unrestricted access to the module’s internals. Each one is still a Chisel IO port, so it has to play by the same rules. For this purpose, that means that accessing, say, a particular memory location involves poking a debug address register with the right value and then peeking a different debug register to get the value. The debug ports only support reading data at the moment, but I don’t think there’s any reason why they couldn’t support writing as well. There just hasn’t been any reason for it so far.
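As a rough illustration of the pattern, here is a small Python model of an optional, read-only debug port. It is only an analogy for the Chisel version: the class and field names are hypothetical, the memory is word-indexed for simplicity, and I’ve assumed the debug data is refreshed on a clock step, which may not match how perilus wires it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebugPort:
    addr: int = 0  # the test "pokes" this
    data: int = 0  # the test "peeks" this


class Memory:
    def __init__(self, words, with_debug=False):
        self._words = list(words)  # internal state, not directly exposed
        # The port only exists when explicitly requested, like Option's Some/None.
        self.debug: Optional[DebugPort] = DebugPort() if with_debug else None

    def step(self):
        """Model one clock edge: keep the debug output up to date."""
        if self.debug is not None:
            self.debug.data = self._words[self.debug.addr]


# Normal construction: no debug port at all.
assert Memory([10, 20, 30]).debug is None

# Test construction: poke the address, step, then peek the data.
mem = Memory([10, 20, 30], with_debug=True)
mem.debug.addr = 2
mem.step()
print(mem.debug.data)  # 30
```

The test never touches `_words` directly; it can only see what the port chooses to expose, which mirrors the restriction described above.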

Reading Test Data from Files

The processor won’t do much without a program, which means we need a way to set the initial state of the machine. In a real system, we can assume that the program counter is reset to a fixed value (the reset vector) and the memory contains at least a minimal program that sets up everything else (the bootloader). These dependencies lie beyond the scope of the processor itself and may be satisfied by, for example, flashing the memory using an external circuit. In the virtual environment of ChiselSim, we still must satisfy these dependencies, and perilus accomplishes this by reading memory and register file images from the filesystem. Chisel provides an experimental feature for this purpose, which it translates into corresponding SystemVerilog code.

I found this to be a little annoying for two reasons. One is that it would be nice to instead provide, for example, an ArrayBuffer of the proper type and have Chisel handle the details of getting its contents into SystemVerilog. I intend to explore creating a helper function for this in the future so we don’t have to keep an assets/ directory in the repo with files for every test. The other reason is that there isn’t any warning output when the specified path doesn’t exist. This again can be handled in user code, but it would be nice if it existed by default.
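Both workarounds are simple enough to sketch. The Python below is not part of perilus and is only meant to show the idea: one hypothetical helper writes a list of 32-bit words to a temporary image file, and another fails loudly when a path is missing. I’ve assumed a format of one eight-digit hex word per line, which is what a $readmemh-style loader accepts; the actual files in the repo may be laid out differently.

```python
import os
import tempfile


def write_hex_image(words, path):
    """Write each word as eight hex digits on its own line."""
    with open(path, "w") as f:
        for word in words:
            f.write(f"{word & 0xFFFFFFFF:08x}\n")
    return path


def require_image(path):
    """Return `path` unchanged, or raise if no such file exists."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"memory image not found: {path}")
    return path


# Usage: build the image next to the test instead of checking it into assets/.
with tempfile.TemporaryDirectory() as tmp:
    image = write_hex_image([0xFFC4A303, 0xDEADBEEF], os.path.join(tmp, "mem.hex"))
    require_image(image)  # passes silently
    with open(image) as f:
        print(f.read().split())  # ['ffc4a303', 'deadbeef']
```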

It was interesting to find that Chisel (or more probably the underlying simulator backend) initializes the memory with deterministic random values if no other initializer is provided. This is a good representation of actual hardware in that it discourages the assumption of a particular value on reset.

Executing the First Instruction

With all these prerequisites in place, the first actual instruction test was easy enough. The reset vector is 0x0, so I manually assembled a lw instruction and placed it in a new memory file at that location, with the rest being random values. I also created a register file image containing all zeros except for the register containing the base pointer for the load (that is, the register specified by the instruction’s rs1 field). The test computes the memory location to be loaded and retrieves that value. Then, it steps the clock several times to fully execute the instruction. Finally, it verifies that the expected value has been copied from memory into the specified rd register. If all the expects pass, then we can be confident that we’ve executed the load. The test technically doesn’t check all possible side effects, and maybe it should, but it at least checks the source and destination registers and the specified memory location.
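For anyone who wants to reproduce the manual assembly step, here is a short Python sketch of the I-type encoding that lw uses, plus the address calculation the test performs. The field layout, funct3, and opcode are from the RV32I specification; the function names and the particular registers and offset in the example are illustrative and are not the values from my test.

```python
OPCODE_LOAD = 0b0000011
FUNCT3_LW = 0b010


def encode_lw(rd, rs1, imm):
    """Assemble `lw rd, imm(rs1)` into a 32-bit I-type instruction word."""
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register index out of range")
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    # Layout: imm[11:0] | rs1 | funct3 | rd | opcode
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (FUNCT3_LW << 12) | (rd << 7) | OPCODE_LOAD


def effective_address(base, imm):
    """Address the load reads: base register value plus the signed offset."""
    return (base + imm) & 0xFFFFFFFF


word = encode_lw(rd=6, rs1=9, imm=-4)
print(f"{word:08x}")                       # ffc4a303
print(hex(effective_address(0x100, -4)))   # 0xfc
```

The test then only needs to look up the memory word at that effective address and compare it against the contents of rd after stepping the clock.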

Yesterday afternoon, Brady and I had a lovely time adding a test for the sw instruction as well. We mostly got this instruction for free in the Harris design. We also tried to add beq, but we ran into an issue related to management of the program counter. I probably wrote a bug somewhere, so we’ll just need to track it down before continuing.
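The sw test needs a hand-assembled instruction too, and the encoding is slightly more awkward because the S-type format splits the immediate across two fields. A minimal Python sketch follows; the layout, funct3, and opcode are from the RV32I specification, while the function name and example operands are illustrative rather than taken from our test.

```python
OPCODE_STORE = 0b0100011
FUNCT3_SW = 0b010


def encode_sw(rs2, rs1, imm):
    """Assemble `sw rs2, imm(rs1)` into a 32-bit S-type instruction word."""
    if not (0 <= rs2 < 32 and 0 <= rs1 < 32):
        raise ValueError("register index out of range")
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    imm12 = imm & 0xFFF
    # Layout: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode
    return (
        ((imm12 >> 5) << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (FUNCT3_SW << 12)
        | ((imm12 & 0x1F) << 7)
        | OPCODE_STORE
    )


print(f"{encode_sw(rs2=6, rs1=9, imm=8):08x}")  # 0064a423
```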

Tests for the remaining RV32I instructions are likely to take the same form as the ones we’ve written so far. In fact, there will probably be a lot of opportunities for factoring out common test logic into helper functions. I’d like to end up with individual tests for each instruction, which will be tedious but probably quite helpful as we make improvements over time. As we go along I’m sure we’ll also add tests for simple programs too. In the longer term I’m interested in decoupling the system from ChiselSim for more complicated tests, with the eventual goal of adding memory-mapped IO peripherals and getting closer to something like a real computer.