A bulletproof banking system for NES/Gameboy emulators
Peeking at nesdev’s wiki page about mappers can be daunting, knowing you will have to implement dozens of them to get more games working. Most people implement a few, call it a day with their NES emulator, and don’t even bother trying the hardest ones.
It is true that to get most games running, you only need to implement the first 5 Nintendo mappers (NROM, MMC1, UxROM, CNROM, MMC3). Even with only 5 of them, there is a lot of duplicated functionality. And banking can be a little tricky, as there is a bit of math involved. Whenever math is involved in code, you definitely want to abstract it away as soon as possible, so you don’t have to deal with it again later. This is what I call “nasty code”™️.
For the Gameboy, banking works exactly the same as the NES, so we can use the same mechanism for both systems.
Mappers mostly have to do the same things:
- Handle register writes in the PRG-ROM address range;
- Access banked PRG/CHR addresses;
- Nametable mirroring (it is the same concept as PRG/CHR banking!);
- Other custom (and usually rare) functionality.
We surely do not want to implement each mapper’s logic from scratch. With this in mind, we can ease mapper development with some good old abstraction.
The Mapper interface
First, all mappers should provide a generic interface to work with. They should store a banking object, which holds the game’s current banking configuration (PRG-ROM, CHR, PRG-RAM, and nametable VRAM: yes, we also hold the PPU mirroring in there!). The idea is this:
- When we build the mapper, we set up the bankings, and any other needed mapper state.
- When we WRITE to the registers (PRG-ROM address range) to change banks, we update the banking configuration.
- When we READ from the cartridge PRG-ROM, or READ/WRITE to the CHR, PRG-RAM, or nametable VRAM, we simply translate the address using the current banking configuration.
- For more complex mappers, on each CPU clock, we can handle IRQs, audio expansions, and scanline detectors (we might want to add even more functions based on how we need to wire the mapper object to the rest of our emulator).
We also need to provide the signature ‘Box
This is a sketch of what the mapper interface should look like. I provide some default implementations, as most mappers will always have the same range mappings:
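The idea above can be sketched roughly as follows. This is only an illustration under assumptions, not the interface from my emulator: apart from `new()` and `prg_write()`, every name here (`Banking`, `prg_banking`, `prg_read`, `DemoMapper`) is made up for the example, and the demo mapper does not correspond to any real board.

```rust
/// Hypothetical banking state for one memory range (a simplified stand-in).
pub struct Banking {
    pub addr_offset: usize,      // start of the system range, e.g. 0x8000
    pub slot_size: usize,        // bytes per slot
    pub bank_starts: Vec<usize>, // per slot: start offset of the mapped bank
}

impl Banking {
    /// Translate a system address into an offset into the cartridge data.
    pub fn translate(&self, addr: usize) -> usize {
        let local = addr - self.addr_offset;
        let slot = (local / self.slot_size) % self.bank_starts.len();
        self.bank_starts[slot] + local % self.slot_size
    }
}

pub trait Mapper {
    /// Set up the bankings and any other mapper state.
    fn new(prg_len: usize) -> Self
    where
        Self: Sized;

    /// Access to the PRG-ROM banking configuration.
    fn prg_banking(&self) -> &Banking;

    /// Register write in the PRG-ROM range: update the banking configuration.
    fn prg_write(&mut self, addr: usize, val: u8);

    /// Default read: most mappers just translate the address and index the ROM.
    fn prg_read(&self, prg: &[u8], addr: usize) -> u8 {
        prg[self.prg_banking().translate(addr)]
    }
}

/// Made-up mapper with a single switchable 32 KiB PRG slot.
pub struct DemoMapper {
    pub prg: Banking,
    pub banks_count: usize,
}

impl Mapper for DemoMapper {
    fn new(prg_len: usize) -> Self {
        let prg = Banking { addr_offset: 0x8000, slot_size: 0x8000, bank_starts: vec![0] };
        Self { prg, banks_count: (prg_len / 0x8000).max(1) }
    }

    fn prg_banking(&self) -> &Banking {
        &self.prg
    }

    fn prg_write(&mut self, _addr: usize, val: u8) {
        // Wrap the requested bank around the number of banks in the ROM.
        let bank = (val as usize) % self.banks_count;
        self.prg.bank_starts[0] = bank * self.prg.slot_size;
    }
}
```

Only `new()` and `prg_write()` differ per mapper here; the read path comes for free from the default method.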
Whenever we need more complex mapping logic (something like the infamous MMC5, for example), we just override the default implementations. The new() and prg_write() methods always have to be implemented, of course.
Implementing Banking
We now need a generic banking mechanism. On original hardware, bankswitching was INSTANTANEOUS. There was no loading or delay when banks were switched; everything was handled by the hardware. We can’t do that in software, so we need to compute the correct address into a big array of data.
A mapper has a number of ‘slots’ or ‘pages’, which are ranges of the system’s memory, mapped to ‘banks’, which are ranges of the cartridge’s memory.
The number of slots is mapper dependent: you will always have to refer to the mapper’s wiki page to know how many there are.
A slot is mapped to a bank depending on what was written to the slot select register; this is mapper dependent too.
We will use a slots array, where each value is the configured bank for the specific slot.
Let’s take a hypothetical mapper which uses 4 slots in PRG-ROM:
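As a rough sketch, assuming the four slots split the 32 KiB PRG-ROM window ($8000 to $FFFF) evenly, the configuration could look like this. The constant names and the initial slot contents are made up for the example.

```rust
/// Start and size of the PRG-ROM window in the CPU address space.
pub const PRG_WINDOW_START: usize = 0x8000;
pub const PRG_WINDOW_SIZE: usize = 0x8000;

/// Our hypothetical mapper has 4 slots, so each one covers 8 KiB.
pub const SLOTS_COUNT: usize = 4;
pub const SLOT_SIZE: usize = PRG_WINDOW_SIZE / SLOTS_COUNT;

/// `slots[i]` is the bank number currently mapped into slot `i`.
/// Assumed initial state for this example: slot i shows bank i.
pub fn initial_slots() -> [usize; SLOTS_COUNT] {
    [0, 1, 2, 3]
}
```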
When we get an address, we first have to figure out which slot we are in:
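A minimal sketch of the slot lookup, assuming the same 8 KiB slots starting at $8000. The function name and parameters are placeholders.

```rust
/// Index of the slot that contains `addr`.
/// `window_start` is where the banked range begins (0x8000 for PRG-ROM).
pub fn slot_index(addr: usize, window_start: usize, slot_size: usize) -> usize {
    (addr - window_start) / slot_size
}
```

With 8 KiB slots, $8000 to $9FFF lands in slot 0, $A000 to $BFFF in slot 1, and so on up to slot 3.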
We then get the bank starting address:
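One way to sketch this step, assuming the slot holds a bank number and that banks are the same size as slots. The wrap by the ROM length is my own safety net for games that select a bank past the end of the ROM; the names are placeholders.

```rust
/// Offset in the ROM data where bank number `bank` begins.
/// The result wraps around the ROM length, so an out-of-range bank
/// number still lands inside the data.
pub fn bank_start(bank: usize, slot_size: usize, rom_len: usize) -> usize {
    (bank * slot_size) % rom_len
}
```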
Finally, we compute the final mapped address.
Notice we wrap around the address by the slot size, so that we always get an address inside the bank. We then add this wrapped address to the bank starting address we got from the slot selection. The mapped address can now be used to address the full ROM range.
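Putting the three steps together, a sketch of the whole translation could look like this. It assumes `slots` holds bank numbers, as described above; the function signature is illustrative only.

```rust
/// Translate a system address into an offset into the full ROM data.
pub fn translate(
    addr: usize,
    window_start: usize,
    slot_size: usize,
    slots: &[usize],
    rom_len: usize,
) -> usize {
    let local = addr - window_start;
    // 1. Which slot are we in?
    let slot = local / slot_size;
    // 2. Where does the bank mapped to that slot start?
    let bank_start = (slots[slot] * slot_size) % rom_len;
    // 3. Wrap the address inside the slot and add it to the bank start.
    bank_start + local % slot_size
}
```

For example, with 8 KiB slots and bank 7 mapped into slot 1, reading $A123 ends up at ROM offset $E123.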
An example: MMC1’s CHR banking
Let’s take a mapper for our examples: the MMC1 CHR banking, as I think it covers most of our use cases. The MMC1 has two modes for CHR banking. Let’s look at mode 1 first.
In this mode, there are two CHR slots. As the CHR address range is 8 KB ($0000 to $1FFF), each slot is 4 KB.
Whenever we write to the CHR bank 0 register ($A000 to $BFFF) or the CHR bank 1 register ($C000 to $DFFF), we set the first or the second slot, respectively, to that bank number.
Whenever we access CHR, we will get a mapped address to access the full CHR range.
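Here is a small sketch of mode 1, covering both the register write and the CHR access. It assumes the 5-bit value has already been assembled from the MMC1’s serial shift register, and that the CHR data is not empty; `ChrReg` and `Mmc1Chr` are names invented for this example.

```rust
const CHR_SLOT_SIZE: usize = 4 * 1024;

/// The two MMC1 CHR bank registers (hypothetical enum for this sketch).
pub enum ChrReg {
    Bank0,
    Bank1,
}

pub struct Mmc1Chr {
    pub slots: [usize; 2], // bank number mapped into each 4 KiB slot
    pub chr_len: usize,    // size of the CHR data, assumed non-zero
}

impl Mmc1Chr {
    pub fn new(chr_len: usize) -> Self {
        Self { slots: [0, 0], chr_len }
    }

    /// Mode 1: each register selects the bank of its own slot.
    pub fn write(&mut self, reg: ChrReg, value: u8) {
        match reg {
            ChrReg::Bank0 => self.slots[0] = value as usize,
            ChrReg::Bank1 => self.slots[1] = value as usize,
        }
    }

    /// Map a PPU address in the 8 KiB CHR range to an offset in the CHR data.
    pub fn translate(&self, addr: usize) -> usize {
        let addr = addr % (2 * CHR_SLOT_SIZE);
        let slot = addr / CHR_SLOT_SIZE;
        let bank_start = (self.slots[slot] * CHR_SLOT_SIZE) % self.chr_len;
        bank_start + addr % CHR_SLOT_SIZE
    }
}
```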
We basically don’t have to think about mapping anymore: our abstraction is doing all the hard calculations for us!
Now, what about mode 0? Mode 0 uses a SINGLE CHR slot, which means it is a big 8 KB one. But our configuration uses 2 slots. To deal with this change, we simply treat the 8 KB slot as two 4 KB slots. This means that whenever we write to the CHR bank register in this mode, we have to set both slots. We can do something like this:
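A possible sketch, assuming the slots hold 4 KB bank numbers. In 8 KB mode the MMC1 ignores the low bit of the bank number, so the two slots get an even bank and the odd bank right after it. The function name is made up.

```rust
/// Mode 0: one 8 KB bank, expressed as two consecutive 4 KB banks.
pub fn set_chr_8k(slots: &mut [usize; 2], value: u8) {
    // Drop the low bit so the first 4 KB bank is always even.
    let even = (value as usize) & !1;
    slots[0] = even;
    slots[1] = even + 1;
}
```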
In conclusion, when setting up the banks, we always set the slot size to the smallest slot size the mapper can have. When dealing with bigger slots, we update the corresponding slots as if they were still mapping to the smallest banks.
We also have to be sure to update ALL slots whenever we change modes. This is incredibly important, because the mode switch instantly changes how the bank selects behave. Mappers should provide an update_bankings() method, which updates all slots whenever a mode is changed.
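One way to sketch this is to keep the raw register values around and recompute every slot from them. Only `update_bankings()` is a name from this article; the struct, its fields, and `set_mode()` are assumptions for the example.

```rust
pub struct Mmc1Chr {
    pub mode_4k: bool,     // true: two 4 KB slots, false: one 8 KB slot
    pub chr_bank0: u8,     // last value written to the CHR bank 0 register
    pub chr_bank1: u8,     // last value written to the CHR bank 1 register
    pub slots: [usize; 2], // 4 KB bank number mapped into each slot
}

impl Mmc1Chr {
    /// Recompute ALL slots from the stored registers and the current mode.
    pub fn update_bankings(&mut self) {
        if self.mode_4k {
            self.slots = [self.chr_bank0 as usize, self.chr_bank1 as usize];
        } else {
            let even = (self.chr_bank0 as usize) & !1;
            self.slots = [even, even + 1];
        }
    }

    /// A mode switch takes effect immediately on every slot.
    pub fn set_mode(&mut self, mode_4k: bool) {
        self.mode_4k = mode_4k;
        self.update_bankings();
    }
}
```

Because the registers are stored separately from the slots, switching back and forth between modes never loses the values the game wrote earlier.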
Now, let’s code the actual functions.
The code: requirements
What we need:
- how big the ROM data is;
- how many slots the mapper provides;
- how big a slot is;
- the offset of the system memory range ($8000 for PRG-ROM, $2000 for Nametables, $6000 for PRG-RAM).
The code: initialization
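As a sketch of what the requirements above translate to, here is one possible struct and constructor. The argument order follows the `Self::new(...)` calls used in the constructors of this section, and `banks_count` is a field they touch; the struct name and the other field names are guesses for illustration.

```rust
pub struct Banking {
    pub data_size: usize,   // how big the ROM (or RAM) data is
    pub addr_offset: usize, // start of the system memory range
    pub pages_size: usize,  // how big a slot (page) is
    pub banks_count: usize, // how many banks of that size fit in the data
    pub pages: Vec<usize>,  // per slot: start address of the mapped bank
}

impl Banking {
    /// `pages_size` is assumed to be non-zero.
    pub fn new(data_size: usize, addr_offset: usize, pages_size: usize, pages_count: usize) -> Self {
        Self {
            data_size,
            addr_offset,
            pages_size,
            // Keep at least one bank so later wrapping never divides by zero.
            banks_count: (data_size / pages_size).max(1),
            // Every slot starts out pointing at the beginning of the data.
            pages: vec![0; pages_count],
        }
    }
}
```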
```rust
// CHR: the 8 KiB pattern table range, split evenly into `pages_count` pages.
pub fn new_chr(header: &CartHeader, pages_count: usize) -> Self {
    let pages_size = 8*1024 / pages_count;
    Self::new(header.chr_real_size(), 0x0000, pages_size, pages_count)
}

// PRG-RAM: a single 8 KiB page at $6000.
pub fn new_sram(header: &CartHeader) -> Self {
    Self::new(header.sram_real_size(), 0x6000, 8*1024, 1)
}

// Nametable VRAM: four 1 KiB pages at $2000.
pub fn new_ciram(header: &CartHeader) -> Self {
    let mut res = Self::new(4*1024, 0x2000, 1024, 4);
    // Only two banks are available unless the cartridge is four-screen.
    if header.mirroring != Mirroring::FourScreen {
        res.banks_count = 2;
    }
    // this method is on the third exercise!
    res.update_mirroring(header.mirroring);
    res
}
```
The code: operations
We then provide these basic operations. Notice how set_slot saves the bank starting address instead of the bank number. Also be sure to always wrap the slot and bank numbers around the maximum available count.
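A sketch of these operations, under the same assumptions as before. The setter is called `set_page` here to match the calls in the mirroring code of this section; it plays the role of `set_slot`. The struct layout is a guess for illustration.

```rust
pub struct Banking {
    pub addr_offset: usize,
    pub pages_size: usize,
    pub banks_count: usize,
    pub pages: Vec<usize>, // per slot: start address of the mapped bank
}

impl Banking {
    /// Map `bank` into `page`, storing the bank's start address.
    pub fn set_page(&mut self, page: usize, bank: usize) {
        // Wrap both indices so out-of-range values never escape the data.
        let page = page % self.pages.len();
        let bank = bank % self.banks_count;
        self.pages[page] = bank * self.pages_size;
    }

    /// Translate a system address into an offset into the data.
    pub fn translate(&self, addr: usize) -> usize {
        let local = addr - self.addr_offset;
        let page = (local / self.pages_size) % self.pages.len();
        self.pages[page] + local % self.pages_size
    }
}
```

Storing the start address means the multiplication happens once per bank switch instead of once per memory access.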
Nametable mirroring is the perfect use case for our banking system, as nametable VRAM can be treated as slots and banks!
Learn about nametable mirroring here: https://www.nesdev.org/wiki/Mirroring#Nametable_Mirroring
```rust
pub fn update_mirroring(&mut self, mirroring: Mirroring) {
    match mirroring {
        Mirroring::Horizontal => {
            self.set_page(0, 0);
            self.set_page(1, 0);
            self.set_page(2, 1);
            self.set_page(3, 1);
        }
        Mirroring::Vertical => {
            self.set_page(0, 0);
            self.set_page(1, 1);
            self.set_page(2, 0);
            self.set_page(3, 1);
        }
        Mirroring::SingleScreenA => for i in 0..4 {
            self.set_page(i, 0);
        }
        Mirroring::SingleScreenB => for i in 0..4 {
            self.set_page(i, 1);
        }
        Mirroring::FourScreen => for i in 0..4 {
            self.set_page(i, i);
        }
    }
}
```
Optimizing Banking
The first exercise asked how you can optimize the banking system, as there is some super nerdy trickery we can employ here. The explanation follows; stop here if you’d like to think about it before continuing.
ROM sizes are ALWAYS a power of two! Mapper banks are always going to be a power of two as well, as slot sizes and slot counts are powers of two too. This means we can turn remainders into bitwise ANDs, and multiplications and divisions into bitshifts!
a % b == a & (b-1)
a * b == a << log2(b)
a / b == a >> log2(b)
if and only if b is a power of 2.
We have to explicitly do that in code, because the compiler won’t catch these optimizations if the values aren’t constants. We might also want to cache the log2() values when we initialize the mapper, as they are constant. This is the final optimized code.
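One possible shape of that optimization, as a sketch rather than the exact code from my emulator. The shift and the masks are computed once in the constructor; the struct layout and field names are assumptions.

```rust
pub struct Banking {
    pub addr_offset: usize,
    pub page_shift: u32,   // log2(page size), cached
    pub page_mask: usize,  // page size - 1
    pub pages_mask: usize, // pages count - 1
    pub banks_mask: usize, // banks count - 1
    pub pages: Vec<usize>, // per slot: start address of the mapped bank
}

impl Banking {
    pub fn new(data_size: usize, addr_offset: usize, page_size: usize, pages_count: usize) -> Self {
        // The bit tricks are only valid for powers of two, so check up front.
        assert!(page_size.is_power_of_two());
        assert!(pages_count.is_power_of_two());
        let banks_count = (data_size / page_size).max(1);
        assert!(banks_count.is_power_of_two());
        Self {
            addr_offset,
            page_shift: page_size.trailing_zeros(),
            page_mask: page_size - 1,
            pages_mask: pages_count - 1,
            banks_mask: banks_count - 1,
            pages: vec![0; pages_count],
        }
    }

    pub fn set_page(&mut self, page: usize, bank: usize) {
        // `& mask` replaces `%`, `<< shift` replaces `*`.
        self.pages[page & self.pages_mask] = (bank & self.banks_mask) << self.page_shift;
    }

    pub fn translate(&self, addr: usize) -> usize {
        let local = addr - self.addr_offset;
        // `>> shift` replaces `/`.
        let page = (local >> self.page_shift) & self.pages_mask;
        self.pages[page] + (local & self.page_mask)
    }
}
```

The asserts make the power-of-two assumption explicit, so a ROM with an odd size fails loudly at load time instead of reading from the wrong place.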
Banking system in action: UxROM
We now have a very handy and convenient interface for developing mappers. Look at how simple it is to fully implement UxROM:
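To give an idea, here is a self-contained sketch of the PRG side of UxROM: a switchable 16 KiB slot at $8000 and a second 16 KiB slot at $C000 fixed to the last bank. The `Banking` type is a compact stand-in written for this example, and the method names are assumptions; CHR and mirroring are left out.

```rust
/// Compact stand-in for the banking object, just enough for this example.
pub struct Banking {
    pub addr_offset: usize,
    pub pages_size: usize,
    pub banks_count: usize,
    pub pages: Vec<usize>,
}

impl Banking {
    pub fn new(data_size: usize, addr_offset: usize, pages_size: usize, pages_count: usize) -> Self {
        let banks_count = (data_size / pages_size).max(1);
        Self { addr_offset, pages_size, banks_count, pages: vec![0; pages_count] }
    }

    pub fn set_page(&mut self, page: usize, bank: usize) {
        let page = page % self.pages.len();
        self.pages[page] = (bank % self.banks_count) * self.pages_size;
    }

    pub fn translate(&self, addr: usize) -> usize {
        let local = addr - self.addr_offset;
        let page = (local / self.pages_size) % self.pages.len();
        self.pages[page] + local % self.pages_size
    }
}

pub struct UxRom {
    pub prg: Banking,
}

impl UxRom {
    pub fn new(prg_size: usize) -> Self {
        // Two 16 KiB slots in the PRG-ROM range.
        let mut prg = Banking::new(prg_size, 0x8000, 16 * 1024, 2);
        // The second slot is fixed to the last bank of the ROM.
        let last_bank = prg.banks_count - 1;
        prg.set_page(1, last_bank);
        Self { prg }
    }

    /// Any write in $8000-$FFFF selects the bank of the first slot.
    pub fn prg_write(&mut self, _addr: usize, val: u8) {
        self.prg.set_page(0, val as usize);
    }

    /// Offset into the PRG-ROM data for a CPU address.
    pub fn prg_addr(&self, addr: usize) -> usize {
        self.prg.translate(addr)
    }
}
```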
And that’s it! It is now incredibly addicting to implement mappers. Have fun!
- NROM, UxROM, CNROM, and AxROM are easy mappers which you should definitely implement in your emulator.
- MMC1 is the most used mapper of the NES. It is considerably more complex, but definitely worth implementing.
- MMC3 is more complex, but is used in almost 300 games.
- MMC2 is only used by Punch-Out!!, but can be pretty satisfying to implement.
- VRC6 is only used by the Japanese version of Castlevania III and a few other games, but it has an incredible expansion audio chip which is very fun to implement and finally hear.
Banking system in action: my NES emulator
I have developed a NES emulator, and roughly 30 mappers are working flawlessly with this system. Have a look at it here: my NES emulator.