Nintendo 64 Part 3: Building a Sample Program
More ramblings about Nintendo 64 homebrew. I’m trying to describe my process in detail and edit it into a coherent narrative.
Game Jam Updates
The jam has started! These blog posts are written and posted on a bit of a delay, so the jam actually started almost a week ago.
The original announcement: YouTube: Nintendo 64’s First Game Jam.
Theme reveal: N64brew Game Jam #1 - Theme Reveal Trailer. The theme is size.
According to one of the folks on Discord, there are something like 32 solo participants and 12 teams in the jam. 44 entries total, nice! We’ll see how many people make it. Come December, I’m sure you’ll be able to count the number of finished games on your fingers. (I’m not even sure you’ll need both hands. We’ll see.)
The prize pot is also over $1,000. I donated to the prize pot because I want people to see how exciting this game jam is. There are a couple people in chat who are a bit unhappy with the size of the prize pot…
The current way that things are handled with the Prize pot makes my blood boil so hard. […] I expect there to be lots of bickering when the jam’s over […]
That’s the issue. When there’s money involved, it becomes progressively less and less your choice what the real goals are, and my issue is that this prize pool is going to directly affect the enjoyment and fun of the Jam
My thoughts on the prize pool:
- Too late to change how prize money works.
- In the future, the prizes could go to charity. The money isn’t enough to pay for development time anyway (not by a long shot), and having the prizes go to charity would likely make people feel better about the jam.
- The prize money won’t make much of a difference in my life, but that’s not true for everyone.
Enough about the jam. Time to develop!
Getting an SDK
There appear to be a few different SDKs and libraries.
- LibUltra, Nintendo’s official (proprietary) SDK. People call this LibUltra even though it was historically just “the SDK”. Available for Windows and SGI IRIX.
- Libdragon, an open-source library for Nintendo 64.
- Pseultra, a collection of Nintendo 64 development tools, including a library called libpseultra.
- Libn64, an open-source library for Nintendo 64.
- LibreUltra, a matching decompilation of Nintendo’s LibUltra.
From comments in chat, it sounds like the open-source versions are not completely ready for making 3D games, although they’ll work well enough for 2D. I’ll start with the LibUltra SDK and explore other options later.
I’m using Linux, so I’ll start by trying the SGI IRIX toolchain. (This was the wrong choice! But I’ll learn about that later.) At the very least, it’s more likely that the tools will be distributed in a more familiar format (ideally, a tar file). I found the following items as CD-ROM images:
- Developer Documents
- Developer OS/Library (IRIX or Windows 95)
- Developer Tool Kit (cross-platform)
Not only was the “Developer Tool Kit” CD-ROM image corrupted, but after going through the work of trying to extract files from it, I learned that it does not contain LibUltra or the Nintendo 64 toolchain! Instead, it contains additional libraries that run on top of the N64 toolchain, such as NuSystem. From the docs:
NuSystem reduces the amount of effort needed in the initial stage of program development, making N64 development easier to understand. In NuSystem, each N64 function is a component which can be controlled using callback and front-end functions - facilitating the progress of N64 programs. The flexible design takes processing speed, memory efficiency, and expandability into consideration. With NuSystem you can create a program without delving into the complicated aspects of N64 development.
It turns out that what I want is the disc called “OS/Library”, which I had not entirely understood. The reason it’s called “OS” is because the library, LibUltra, contains a basic operating system for the Nintendo 64 that is linked into N64 games. The latest version is 2.0L, and for maximum confusion, it’s written with a lower-case “L” so you can more easily confuse it with version 2.0I.
The software for SGI is distributed as packages for inst, the program IRIX uses to install software. Since I don’t have an SGI IRIX workstation handy (yet; it’s been a dream of mine to have an Indigo2, Indy, O2, or Octane), I wrote a tool called SGI Extractor to extract files from these packages.
The OS/Library 2.0L CD contains only one package. After extracting it, I poked around looking for interesting files:
- /usr/lib: Contains various versions of libultra.a, object files that look like they contain microcode for the RSP, the bootloader, some sound files.
- /usr/include: Contains <ultra64.h>, the main header for LibUltra, and all the other LibUltra headers. Also contains make/PRdefs, which is intended to be included in makefiles.
- /usr/sbin: All the development tools: makerom, vadpcm_enc, etc.
- /usr/src/PR: Demo N64 programs, sample assets, and partial source code for LibUltra.
Some of the text files were in Japanese, with the Japanese EUC encoding. If you want to read them, you probably want to convert them to Unicode. For example:
$ cd usr/src/PR
$ ls
README.jp assets demos demos_old doc libsrc
$ iconv -f eucJP -t utf-8 <README.jp
このディレクトリについて
このディレクトリには、下記のものが含まれています。
doc/ リリースノートはこのディレクトリの下にあります。まずはリリース
ノートをご覧下さい。過去のリリースノートは doc/relnotes_old/
[...]
Compiler Flags
I see the flags other people are using in the chat:
CFLAGS = -fno-PIC -mabi=32 -mno-shared -mno-abicalls \
-march=vr4300 -mtune=vr4300 -mfix4300 -G 0
Let’s analyze these to see what we need.
Preprocessor Definitions
I also see people using flags like -D_MIPS_SZLONG=32 and -D_MIPS_SZINT=32. Are these necessary? We can check:
$ mips64-gcc -dM -E -xc /dev/null | grep _MIPS_SZ
#define _MIPS_SZPTR 32
#define _MIPS_SZINT 32
#define _MIPS_SZLONG 32
It turns out that GCC defines these by default.
Position Independent Code
We know that GCC is not generating PIC code by default, because trying to enable it fails:
$ mips64-gcc -fpic -c -xc -
cc1: error: position-independent code requires '-mabicalls'
From reading GCC: MIPS Options, we can figure out that we don’t need -mno-abicalls because it is the default, we don’t need -mno-shared because it has no effect without -mabicalls, and we likewise don’t need -fno-pic.
Multiplication Bug
The -mfix4300 flag looks like it is intended to enable a workaround for a bug in the VR4300 silicon, but does this flag work, and what exactly does it do? It’s not documented in the GCC manual. However, we can find this note in the GCC source code:
Early VR4300 silicon has a CPU bug where multiplies with certain operands may corrupt immediately following multiplies. This is a simple fix to insert NOPs.
The surrounding code seems to just insert a nop after a floating-point multiply. We can test whether this nop is inserted with and without the -mfix4300 flag. Create test.c:
float f(float x, float y) {
    return x * y;
}
Compile it with and without -mfix4300:
$ mips64-gcc -S -O2 test.c && cat test.s
f:
    jr $31
    mul.s $f0,$f12,$f13
$ mips64-gcc -mfix4300 -S -O2 test.c && cat test.s
f:
    mul.s $f0,$f12,$f13
    nop
    jr $31
    nop
The flag does insert the nop after the multiply, so it looks like this option is necessary.
MIPS ABI
The last thing we need to figure out is the correct ABI. There are five ABIs: o32, n32, o64, n64, and eabi. The GCC flag -mabi=32 selects the “o32” ABI and -mabi=64 selects the “n64” ABI. Just to be completely clear, the “n” in “n64” stands for “native”, not “Nintendo”.
If we choose the wrong ABI, we may still be able to build and link our code,
but we may experience anything from mysterious data corruption to crashes.
I thought this would be fairly simple to write up, but as I investigate, I become less sure what the correct ABI option is.
We know that the Nintendo 64 uses a NEC VR4300, which is a MIPS R4300i, and implements the MIPS III instruction set. This is a 64-bit processor, but you already knew that, because Nintendo decided that using a 64-bit architecture was so important that they named the console after it (even though their next console, the Nintendo GameCube, had a 32-bit architecture).
However, since it is a 64-bit processor, the “o32” ABI does not make logical sense. What is “o32”? It is the ABI used for MIPS I and MIPS II processors, which are 32-bit processors. The “n32” ABI is for 64-bit processors only, and was created to provide an efficient ILP32 (32-bit int, long, and pointer) ABI for 64-bit MIPS processors. The n32 ABI makes logical sense, because the Nintendo 64 has a 64-bit processor and a 32-bit address space. From Whats Wrong With O32, N32, N64:
o32 has been an orphan for a long time. Somewhere in the mid-1990s SGI dropped it completely, because all their systems had been using real 64-bit CPUs for some time.
So, how can we find out what ABI the Nintendo 64 LibUltra toolchain uses? We could try looking at the object files in LibUltra:
$ mkdir objs
$ cd objs
$ ar x /path/to/libultra.a
$ mips64-readelf -h bcopy.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          4736 (bytes into file)
  Flags:                             0x10000000, mips2
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         8
  Section header string table index: 2
This doesn’t give us the information we want.
(Edit: Yes it does! The mips2 ISA is 32-bit, and it implies the o32 ABI. I didn’t realize this at the time I wrote this.)
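To make the ISA and ABI information in that Flags field concrete, here is a minimal C sketch that pulls the e_flags word out of a big-endian ELF32 header and decodes the MIPS-specific bits. The function names are made up for illustration. The constants are the standard EF_MIPS_* values from the MIPS ELF ABI, and the ISA decoding only covers MIPS I through MIPS V.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the MIPS ELF ABI (same values as in <elf.h>). */
#define EF_MIPS_ABI2 0x00000020u /* set for n32 objects */
#define EF_MIPS_ABI  0x0000f000u /* mask for the o32/o64/eabi field */
#define EF_MIPS_ARCH 0xf0000000u /* mask for the ISA level */

/* Read the big-endian e_flags word from an ELF32 header.
   In ELF32, e_flags is the 4-byte field at offset 36. */
uint32_t elf32_be_flags(const unsigned char *hdr, size_t len) {
    assert(hdr != NULL && len >= 52); /* an ELF32 header is 52 bytes */
    return ((uint32_t)hdr[36] << 24) | ((uint32_t)hdr[37] << 16) |
           ((uint32_t)hdr[38] << 8) | (uint32_t)hdr[39];
}

/* ISA level: 1 for MIPS I, 2 for MIPS II, ... 5 for MIPS V.
   Returns -1 for the MIPS32/MIPS64 encodings, which this sketch skips. */
int mips_isa_level(uint32_t flags) {
    uint32_t arch = (flags & EF_MIPS_ARCH) >> 28;
    return arch <= 4 ? (int)arch + 1 : -1;
}

/* ABI recorded in the flags, or "unmarked" if no ABI bits are set. */
const char *mips_abi_name(uint32_t flags) {
    if (flags & EF_MIPS_ABI2) {
        return "n32";
    }
    switch (flags & EF_MIPS_ABI) {
    case 0x1000: return "o32";
    case 0x2000: return "o64";
    case 0x3000: return "eabi32";
    case 0x4000: return "eabi64";
    default:     return "unmarked";
    }
}
```

For the bcopy.o header above, the flags word 0x10000000 decodes as ISA level 2 with no ABI bits set, which matches the “mips2” that readelf prints and shows why there is no explicit ABI marker to read.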
When we compile code with our mips64-gcc toolchain, the resulting object files have flags which specify which ABI we are using, but that information isn’t present here for whatever reason. What other way can we be confident that we have the right ABI to link with LibUltra? Well, there are two main differences between the o32 and n32 ABIs. From N32 ABI Overview, we see that n32 has eight argument registers compared to o32’s four, but that would only make a difference for functions with more than four arguments. We can also look for a function with a 64-bit parameter and look at the disassembly. One such function is osSetTime(). Here is the declaration:
typedef u64 OSTime;
extern void osSetTime(OSTime);
And here is the disassembly:
$ mips64-objdump -d settime.o

settime.o:     file format elf32-bigmips

Disassembly of section .text:

00000000 <osSetTime>:
   0:   afa40000    sw   a0,0(sp)
   4:   8fae0000    lw   t6,0(sp)
   8:   afa50004    sw   a1,4(sp)
   c:   3c010000    lui  at,0x0
  10:   8faf0004    lw   t7,4(sp)
  14:   ac2e0000    sw   t6,0(at)
  18:   3c010000    lui  at,0x0
  1c:   03e00008    jr   ra
  20:   ac2f0004    sw   t7,4(at)
        ...
This is clearly the o32 ABI designed for MIPS II, because it splits a single 64-bit argument between the a0 and a1 registers. As a side note, I’m not used to reading MIPS and thought that objdump wasn’t showing me the entire function because the function didn’t end with a return. Of course the return is the second-to-last instruction (the jr), and the following instruction (the sw) is in the branch delay slot. MIPS, eh? I’m sure it made the silicon simpler.
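The register split is easy to model on a host machine. The sketch below is not LibUltra code, and the function names are hypothetical. It only mirrors what big-endian o32 does with a 64-bit argument: the most significant word travels in a0 and the least significant word in a1, which is why osSetTime() stores a0 at offset 0 and a1 at offset 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit value the way big-endian o32 passes it in registers:
   *a0 receives the high 32 bits, *a1 receives the low 32 bits. */
void o32_split_u64(uint64_t value, uint32_t *a0, uint32_t *a1) {
    assert(a0 != NULL && a1 != NULL);
    *a0 = (uint32_t)(value >> 32);
    *a1 = (uint32_t)(value & 0xffffffffu);
}

/* Reassemble the value from the register pair, as the callee would
   after spilling a0 and a1 to adjacent words in memory. */
uint64_t o32_join_u64(uint32_t a0, uint32_t a1) {
    return ((uint64_t)a0 << 32) | (uint64_t)a1;
}
```

Under n32 or n64 the same argument would occupy a single 64-bit register, so a caller built for one of those ABIs would hand osSetTime() something quite different from what it expects.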
Final Flags
The final set of compiler flags we use is:
CFLAGS = -mabi=32 -ffreestanding -mfix4300 -G 0
I added -ffreestanding because the Nintendo 64 certainly qualifies as a freestanding environment. See Language Standards Supported By GCC.
Building a Sample Program
The SDK includes sample programs in /usr/src/PR/demos. I’d like to pick a small one and build it. Which one has the fewest lines of code?
$ cd usr/src/PR/demos
$ for dir in * ; do
    if test -d "$dir" ; then
      echo -n "$dir,"
      cloc "$dir" --csv | tail -n 1 | cut -d, -f5
    fi
  done | sort -n -t, -k2
Texture,49
greset,52
print,84
sramtest,134
ginv,188
gl,334
fault,359
onetri,412
onetri-fpal,416
topgun,507
[...]
The onetri demo looks the most promising. I’ll try to build it.
This is the makefile I made, based on the demo’s makefile and using the flags I figured out above. I just want to compile codesegment.o from the sample program.
CFLAGS := -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include
CC := mips64-gcc
LD := mips64-ld
CFLAGS += -DF3DEX_GBI_2
ifdef FINAL
CFLAGS += -O2 -DNDEBUG -D_FINALROM
N64LIB = ultra_rom
else
CFLAGS += -g -DDEBUG
N64LIB = ultra_d
endif
codefiles = onetri.c dram_stack.c rdp_output.c
codeobjects = $(codefiles:.c=.o)
datafiles = static.c cfb.c rsp_cfb.c
dataobjects = $(datafiles:.c=.o)
codesegment.o: $(codeobjects)
	$(LD) -nostdlib -r -o $@ $^ ../ultra/lib/lib$(N64LIB).a
clean:
	rm -f $(codeobjects) $(dataobjects) codesegment.o
.PHONY: clean
$ make
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o onetri.o onetri.c
onetri.c:31:10: fatal error: assert.h: No such file or directory
   31 | #include <assert.h>
      |          ^~~~~~~~~~
compilation terminated.
make: *** [<builtin>: onetri.o] Error 1
[Exit: 2]
I’ll create a really simple <assert.h> header to get this compiling, and put it in a folder named system.
#define assert(x) ((void)0)
Update the makefile:
CFLAGS := -mabi=32 -ffreestanding -mfix4300 -G 0 \
-I../ultra/include -I../system
$ make
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -I../system -DF3DEX_GBI_2 -g -DDEBUG -c -o onetri.o onetri.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -I../system -DF3DEX_GBI_2 -g -DDEBUG -c -o dram_stack.o dram_stack.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -I../system -DF3DEX_GBI_2 -g -DDEBUG -c -o rdp_output.o rdp_output.c
mips64-ld -nostdlib -r -o codesegment.o onetri.o dram_stack.o rdp_output.o ../ultra/lib/libultra_d.a
mips64-ld: ../ultra/lib/libultra_d.a(exceptasm.o): linking 32-bit code with 64-bit code
mips64-ld: failed to merge target specific data of file ../ultra/lib/libultra_d.a(exceptasm.o)
mips64-ld: ../ultra/lib/libultra_d.a(ll.o): linking 32-bit code with 64-bit code
mips64-ld: failed to merge target specific data of file ../ultra/lib/libultra_d.a(ll.o)
mips64-ld: codesegment.o: illegal section name `.gptab.data'
mips64-ld: final link failed: nonrepresentable section on output
make: *** [Makefile:21: codesegment.o] Error 1
The first message is “linking 32-bit code with 64-bit code”, so I’ll tackle that… by asking in chat. It turns out that the SGI version of the SDK and the Windows version of the SDK are different! The SGI version was compiled with the SGI compiler, and the Windows version was compiled with GCC. The toolchains are not compatible.
Getting the Windows Toolchain
We find an ISO image of the Windows OS/PC disc and mount it.
$ cabextract -d ~/os20l os20l_eng.exe
$ cd ~/os20l
$ unshield x data1.cab
$ find Ultra_Dev_*/usr/include -type f -exec dos2unix '{}' +
Of note, the library in this SDK is named libgultra instead of libultra. The “g” stands for GNU or GCC.
Building A Sample Program, Mark Two
This toolchain has an <assert.h> header, so I can delete my version. Here are the changes to the makefile:
CFLAGS := -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include
codesegment.o: $(codeobjects)
	$(LD) -nostdlib -r -o $@ $^ ../ultra/lib/libg$(N64LIB).a
The code segment does build now.
$ make
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o onetri.o onetri.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o dram_stack.o dram_stack.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o rdp_output.o rdp_output.c
mips64-ld -nostdlib -r -o codesegment.o onetri.o dram_stack.o rdp_output.o ../ultra/lib/libgultra_d.a
We install Spicy and add rules to our makefile:
all: onetri.n64
onetri.n64: spec codesegment.o $(dataobjects)
	spicy -r $@ spec
$ make
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o onetri.o onetri.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o dram_stack.o dram_stack.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o rdp_output.o rdp_output.c
mips64-ld -nostdlib -r -o codesegment.o onetri.o dram_stack.o rdp_output.o ../ultra/lib/libgultra_d.a
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o static.o static.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o cfb.o cfb.c
mips64-gcc -mabi=32 -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o rsp_cfb.o rsp_cfb.c
spicy -r onetri.n64 spec
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x54ced9]
A panic usually means that there is something wrong with the program, so I made my own fork of Spicy, depp/spicy, with better error handling. With the forked version, I can make additional progress. Here is a change to the makefile:
onetri.n64: spec codesegment.o $(dataobjects)
	spicy -r $@ spec --toolchain-prefix=mips64-
Trying to compile it, I fail yet again.
$ make
spicy -r onetri.n64 spec --toolchain-prefix=mips64-
ERRO[0000] Error: spicy.LinkSpec: Error running 'mips64-ld': exit status 1: mips64-ld: codesegment.o: in function `__osInitialize_common':
(.text+0x4648): undefined reference to `__udivdi3'
mips64-ld: codesegment.o: in function `MonitorInitBreak':
../monutil.s:184: undefined reference to `__umoddi3'
mips64-ld: ../monutil.s:184: undefined reference to `__udivdi3'
mips64-ld: ../monutil.s:184: undefined reference to `__divdi3'
make: *** [Makefile:29: onetri.n64] Error 1
I know what this is. This means I haven’t linked in LibGCC! But there is only one copy of libgcc.a that I see, and on investigation, it uses the o64 ABI.
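For context on why those symbols are missing: on a 32-bit ABI, GCC usually does not expand 64-bit division and remainder inline and calls helper routines from libgcc instead. Here is a small sketch (the function names are mine, not from LibUltra) of C code that would typically produce each of the three undefined references when built for o32:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit division: on a 32-bit ABI this typically becomes
   a call to __udivdi3. */
uint64_t div_u64(uint64_t n, uint64_t d) {
    assert(d != 0); /* division by zero is undefined behavior in C */
    return n / d;
}

/* Unsigned 64-bit remainder: typically a call to __umoddi3. */
uint64_t mod_u64(uint64_t n, uint64_t d) {
    assert(d != 0);
    return n % d;
}

/* Signed 64-bit division: typically a call to __divdi3. */
int64_t div_s64(int64_t n, int64_t d) {
    assert(d != 0);
    return n / d;
}
```

Compiling the same code for a 64-bit host does not reference the helpers at all, which is one way to see that the dependency comes from the 32-bit ABI and not from the code itself.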
Here I go again, rebuilding the toolchain.
$ mkdir build-binutils; cd build-binutils
$ ../binutils-2.35.1/configure \
    --target=mips32-elf --prefix=/opt/n64 \
    --program-prefix=mips32-elf- --with-cpu=vr4300 \
    --with-sysroot --disable-nls --disable-werror
$ make
$ sudo make install
$ cd ..
$ mkdir build-gcc; cd build-gcc
$ ../gcc-10.2.0/configure \
    --target=mips32-elf --prefix=/opt/n64 \
    --program-prefix=mips32-elf- --with-arch=vr4300 \
    --enable-languages=c,c++ --disable-threads \
    --disable-nls --without-headers --with-newlib
$ make all-gcc
$ make all-target-libgcc
$ sudo make install-gcc
$ sudo make install-target-libgcc
I’ve renamed things, so I have to change the Makefile. Note that GCC is now compiled to use the o32 ABI by default.
CFLAGS := -ffreestanding -mfix4300 -G 0 -I../ultra/include
CC := mips32-elf-gcc
LD := mips32-elf-ld
codesegment.o: $(codeobjects)
	$(LD) -nostdlib -r -o $@ $^ ../ultra/lib/libg$(N64LIB).a \
		/opt/n64/lib/gcc/mips32-elf/10.2.0/libgcc.a
onetri.n64: spec codesegment.o $(dataobjects)
	spicy -r $@ spec --toolchain-prefix=mips32-elf-
Compiling it this time works:
$ make
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o onetri.o onetri.c
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o dram_stack.o dram_stack.c
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o rdp_output.o rdp_output.c
mips32-elf-ld -nostdlib -r -o codesegment.o onetri.o dram_stack.o rdp_output.o ../ultra/lib/libgultra_d.a /opt/n64/lib/gcc/mips32-elf/10.2.0/libgcc.a
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o static.o static.c
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o cfb.o cfb.c
mips32-elf-gcc -ffreestanding -mfix4300 -G 0 -I../ultra/include -DF3DEX_GBI_2 -g -DDEBUG -c -o rsp_cfb.o rsp_cfb.c
spicy -r onetri.n64 spec --toolchain-prefix=mips32-elf-
The emulator crashes, and it’s because we’re missing the boot code. This is created by MakeMask, a program which has another modern replica: MakeMask.
This computes a checksum on 1 MiB of data starting at offset 0x1000, but the program just panics because my ROM is not padded out to a large enough size. I made my own fork with some better error handling and made it pad out the ROM: depp/makemask.
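The padding requirement follows directly from the checksum range: if the checksum covers 1 MiB starting at offset 0x1000, the ROM has to be at least 0x101000 bytes long. Here is a minimal sketch of that padding step in C. It is not the actual code from makemask, and the function names and the choice of fill byte are assumptions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHECKSUM_START 0x1000u   /* checksummed region starts here */
#define CHECKSUM_LEN   0x100000u /* 1 MiB of data is checksummed */
#define MIN_ROM_SIZE   ((size_t)CHECKSUM_START + CHECKSUM_LEN)

/* Smallest size the ROM must have so the whole checksum region exists. */
size_t padded_rom_size(size_t size) {
    return size < MIN_ROM_SIZE ? MIN_ROM_SIZE : size;
}

/* Return a newly allocated copy of rom, padded with fill bytes up to
   padded_rom_size(size). The caller frees the result. */
unsigned char *pad_rom(const unsigned char *rom, size_t size,
                       unsigned char fill, size_t *out_size) {
    assert(out_size != NULL);
    size_t padded = padded_rom_size(size);
    unsigned char *buf = (unsigned char *)malloc(padded);
    if (buf == NULL) {
        return NULL;
    }
    if (size > 0) {
        memcpy(buf, rom, size);
    }
    memset(buf + size, fill, padded - size);
    *out_size = padded;
    return buf;
}
```

A ROM that is already larger than 0x101000 bytes passes through unchanged, so the step is safe to run unconditionally before computing the checksum.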
Additionally, the debug build seems not to work, so I need FINAL=1.
Update: A previous version of this post wrote makerom instead of makemask below.
$ make clean
$ make FINAL=1
$ makemask onetri.n64
$ /usr/games/mupen64plus onetri.n64
I’m going to bed. At some point in the future I’ll figure out how to make a game in this Byzantine development environment.