Hacking Persona 3, Part 3

This is part 3 of my writeup on the time I changed Persona 3’s music. You should probably read part 1 and part 2 first.

I’m going to break the pretense of “discovery” and just tell you what the BGM tool does now.

Here are the tasks we need to complete:

  • Re-write the embedded BGM.CVM file table to reflect changes to BGM.CVM.
  • Add a way to map a given song to a list of tracks to choose from randomly.
  • Add a routine to select a song from the list randomly.

All of these are made harder by the fact that there is practically no free space in the source code. Because this is in assembly, everything relies on jumping to specific “lines” – which means we can’t freely insert code. Which means we either need to overwrite existing data, or extend the file.

(The original plan was to use some of the empty data in the file – there is actually a large contiguous block near the end! Except that it all actually gets overwritten at runtime – it seems to be used for some sort of storage. So that’s out of the question, but was a very frustrating week of my time wasted.)

So, how do we extend the file? Well, luckily, the PS2 uses the .ELF format (also used by Linux executables), which means we’ve got a fairly well-documented format to work with. Long story short, we can increase the data loaded beyond the ELF footer (near the bottom of the file) by incrementing the 4-byte value at an offset of 0x00000044 into the file. I also increment 0x0067F514 by the same amount, which contains the address of the start of the heap.

Generating the new BGM.CVM file list isn’t that hard, since we already have the code for a tool that creates a new .CVM. I used the cvm_tool code to read what’s in the new BGM.CVM file table and just follow the format described in Part 2.

The list of songs is where things start to get interesting again. If you’ll recall, the original code worked with two tables: the first converts an internal “song ID” to the address of an entry in the second table containing the song’s filename as a null-terminated string. That seems weird; after all, the original C code is probably something like this…

int songNum = 1;
const char** filename = {"01.ADX", "21.ADX", "101.ADX", ...};
return filename[songNum]; 

But that gets represented a bit differently in assembly! The following is very similar to how Persona 3 actually does it.

//example filename table start
//'\0' is the "null" character - it has a value of 0x00 and is used to indicate the end of a string in C.
//(notice that song names are always aligned on a word boundary - 4 bytes! this means a string always has to take up a multiple of 4 bytes.)
0x00000000 - "01.A"
0x00000004 - "DX\0\0"
0x00000008 - "21.A"
0x0000000C - "DX\0\0"
0x00000010 - "101."
0x00000014 - "ADX\0"

//example filename lookup table start
//(starts at 0x00001000 which I picked arbitrarily for this example)
//the value at (0x00001000 + songNum * 0x0000000C) is the address of the string for that song number
0x00001000 - 0x00000000 //song number 0
0x00001004 - 0x00000000 //unused
0x00001008 - 0x00000000 //unused
0x0000100C - 0x00000008 //song number 1
0x00001010 - 0x00000000 //unused
0x00001014 - 0x00000000 //unused
0x00001018 - 0x00000010 //song number 2
0x0000101C - 0x00000000 //unused
0x00001020 - 0x00000000 //unused

//to actually retrieve data from the array (note: psuedo-instructions used)
//in this example, we retrieve the address of the string for the path of song number 1
li v0, $00000001 //load a value of 1 into register v0
multi v0, v0, $000C //multiply v0 by 0x000C (the array entry size) and store the result in v0
lw v0, $1000(v0) //load the word (4 bytes) at the address stored in v0 + 0x00001000 into v0
//v0 now contains the address of the null-terminated string "21.ADX" 

A very convenient thing to notice here: the filename lookup table allocates 12 bytes per entry. Only 4 bytes are actually used. I’m not sure why this is – the compiler probably had a good reason – but that means we can stick extra data in the lookup table!

To randomly select music, more information is needed. Let’s define some vocabulary first.

  • A song is what’s understood by the original game, in that first array.
  • In our new world order, every song has a list of tracks associated with it, instead of just one piece of music – we can pick any one of them to play.

We’re changing the original C code around to be something closer to this:

//In this example, song number 0 maps to "01.ADX" and "01A.ADX", song number 1 maps to "21.ADX", and song number 2 maps to "101.ADX".

//the compiler would actually concatenate these strings together as if they were all on one line
const char* filenames = "01.ADX\0\0"

struct lookup
    const char* startAddress;
    unsigned int trackCount;
    unsigned int trackAddressLength;

lookup lookupArray[] = { 
    { &filenames[0],  0x00000002, 0x00000008},
    { &filenames[16], 0x00000001, 0x00000008},
    { &filenames[24], 0x00000001, 0x00000008}

Which is pretty goddamn crazy. But it lets us pick a random song like this:

int songNum = 1;
int trackNum = rand() % lookupArray[songNum].trackCount;
return lookupArray[songNum].startAddress + lookupArray[songNum].trackAddressLength * trackNum; 

There are some interesting limitations here – for example, the song address length for every track for a given song ID must be the same. Luckily that doesn’t make much of a difference for our use case since all track names are practically the same length (we rename them to follow “01A.ADX”, “01B.ADX”, etc.).

So, the patcher works by turning the original format of

struct lookupEntry
    unsigned int stringAddress;
    unsigned int unused;
    unsigned int unused2;


struct lookupEntry
    unsigned int firstStringAddress;
    unsigned int trackCount;
    unsigned int trackStringSize;

And creating the associated tables. Rad.

There’s one last piece of the puzzle – writing the assembly code to intercept music playback and pick a random song…