Hacking Persona 3, Part 3

This is part 3 of my writeup on the time I changed Persona 3’s music. You should probably read part 1 and part 2 first.

I’m going to break the pretense of “discovery” and just tell you what the BGM tool does now.

Here are the tasks we need to complete:

  • Re-write the embedded BGM.CVM file table to reflect changes to BGM.CVM.
  • Add a way to map a given song to a list of tracks to choose from randomly.
  • Add a routine to select a song from the list randomly.

All of these are made harder by the fact that there is practically no free space in the executable. Because this is assembly, everything relies on jumping to specific addresses, which means we can’t freely insert code. So we either need to overwrite existing data or extend the file.

(The original plan was to use some of the empty data in the file – there is actually a large contiguous block near the end! Except that it all gets overwritten at runtime; it seems to be used for some sort of storage. So that’s out of the question, and finding that out cost me a very frustrating week.)

So, how do we extend the file? Well, luckily, the PS2 uses the .ELF format (also used by Linux executables), which means we’ve got a fairly well-documented format to work with. Long story short, we can increase the amount of data loaded past the ELF footer (near the bottom of the file) by incrementing the 4-byte value at offset 0x00000044 in the file. (In a 32-bit ELF whose program header table directly follows the 52-byte ELF header, that offset is the first segment's file size field, p_filesz.) I also increment the value at 0x0067F514, which holds the address of the start of the heap, by the same amount.
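
Here's a minimal sketch of that size bump, assuming the executable has already been read into a byte buffer. The helper names (readU32LE, writeU32LE, growLoadedData) are made up for illustration and aren't taken from the actual tool; the only real number is the 0x44 offset quoted above. Values are read and written little-endian, which is the PS2's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Read a little-endian 32-bit value from a byte buffer.
std::uint32_t readU32LE(const std::vector<std::uint8_t>& buf, std::size_t off)
{
    if (off + 4 > buf.size()) throw std::out_of_range("read past end of file");
    return  static_cast<std::uint32_t>(buf[off])
         | (static_cast<std::uint32_t>(buf[off + 1]) << 8)
         | (static_cast<std::uint32_t>(buf[off + 2]) << 16)
         | (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

// Write a little-endian 32-bit value into a byte buffer.
void writeU32LE(std::vector<std::uint8_t>& buf, std::size_t off, std::uint32_t v)
{
    if (off + 4 > buf.size()) throw std::out_of_range("write past end of file");
    for (int i = 0; i < 4; ++i)
        buf[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Hypothetical helper: add `extra` to the size field at `sizeOffset`
// (0x44 in this article) and append `extra` zero bytes to hold new data.
void growLoadedData(std::vector<std::uint8_t>& elf, std::size_t sizeOffset, std::uint32_t extra)
{
    writeU32LE(elf, sizeOffset, readU32LE(elf, sizeOffset) + extra);
    elf.resize(elf.size() + extra);  // new bytes are zero-initialised
}
```

Calling growLoadedData(elf, 0x44, n) covers the first half of the change. The heap-start value would need the same amount added with a second read and write; this sketch leaves that out.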

Generating the new BGM.CVM file list isn’t that hard, since we already have the code for a tool that creates a new .CVM. I used the cvm_tool code to read the new BGM.CVM’s file table, then wrote it back out in the format described in Part 2.

The list of songs is where things start to get interesting again. If you’ll recall, the original code worked with two tables: the first converts an internal “song ID” to the address of an entry in the second table containing the song’s filename as a null-terminated string. That seems weird; after all, the original C code is probably something like this…

int songNum = 1;
const char* filename[] = {"01.ADX", "21.ADX", "101.ADX", ...};
return filename[songNum];

But that gets represented a bit differently in assembly! The following is very similar to how Persona 3 actually does it.

//example filename table start
//'\0' is the "null" character - it has a value of 0x00 and is used to indicate the end of a string in C.
//(notice that song names are always aligned on a word boundary - 4 bytes! this means a string always has to take up a multiple of 4 bytes.)
0x00000000 - "01.A"
0x00000004 - "DX\0\0"
0x00000008 - "21.A"
0x0000000C - "DX\0\0"
0x00000010 - "101."
0x00000014 - "ADX\0"
...

//example filename lookup table start
//(starts at 0x00001000 which I picked arbitrarily for this example)
//the value at (0x00001000 + songNum * 0x0000000C) is the address of the string for that song number
0x00001000 - 0x00000000 //song number 0
0x00001004 - 0x00000000 //unused
0x00001008 - 0x00000000 //unused
0x0000100C - 0x00000008 //song number 1
0x00001010 - 0x00000000 //unused
0x00001014 - 0x00000000 //unused
0x00001018 - 0x00000010 //song number 2
0x0000101C - 0x00000000 //unused
0x00001020 - 0x00000000 //unused
...

//to actually retrieve data from the array (note: pseudo-instructions used)
//in this example, we retrieve the address of the string for the path of song number 1
li v0, $00000001 //load a value of 1 into register v0
multi v0, v0, $000C //multiply v0 by 0x000C (the array entry size) and store the result in v0
lw v0, $1000(v0) //load the word (4 bytes) at the address stored in v0 + 0x00001000 into v0
//v0 now contains the address of the null-terminated string "21.ADX" 

A very convenient thing to notice here: the filename lookup table allocates 12 bytes per entry. Only 4 bytes are actually used. I’m not sure why this is – the compiler probably had a good reason – but that means we can stick extra data in the lookup table!
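
To make the 12-byte stride concrete, here's a small C++ model of the same lookup, run over the example table above rather than the game's real addresses. kEntrySize, filenameAddressFor and kExampleTable are names invented for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kEntrySize = 0x0C;  // 12 bytes per entry, only the first 4 are used

// Model of the multiply followed by "lw v0, $1000(v0)": read the word that
// sits songNum * 12 bytes into the table. `table` holds the lookup table as
// consecutive 32-bit words, so the byte offset is divided by the word size.
std::uint32_t filenameAddressFor(const std::vector<std::uint32_t>& table, std::size_t songNum)
{
    const std::size_t wordIndex = songNum * kEntrySize / sizeof(std::uint32_t);
    return table.at(wordIndex);  // throws std::out_of_range for a bad song number
}

// The example lookup table from above, three words per entry.
const std::vector<std::uint32_t> kExampleTable = {
    0x00000000, 0, 0,  // song number 0 -> "01.ADX"
    0x00000008, 0, 0,  // song number 1 -> "21.ADX"
    0x00000010, 0, 0,  // song number 2 -> "101.ADX"
};
```

The two zero words after each address are the unused space that the rest of this post puts to work.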

To randomly select music, more information is needed. Let’s define some vocabulary first.

  • A song is what’s understood by the original game, in that first array.
  • In our new world order, every song has a list of tracks associated with it, instead of just one piece of music – we can pick any one of them to play.

We’re changing the original C code around to be something closer to this:

//In this example, song number 0 maps to "01.ADX" and "01A.ADX", song number 1 maps to "21.ADX", and song number 2 maps to "101.ADX".

//the compiler would actually concatenate these strings together as if they were all on one line
const char* filenames = "01.ADX\0\0"
    "01A.ADX\0"
    "21.ADX\0\0"
    "101.ADX\0";

struct lookup
{
    const char* startAddress;
    unsigned int trackCount;
    unsigned int trackAddressLength;
};

lookup lookupArray[] = { 
    { &filenames[0],  0x00000002, 0x00000008},
    { &filenames[16], 0x00000001, 0x00000008},
    { &filenames[24], 0x00000001, 0x00000008}
}; 

Which is pretty goddamn crazy. But it lets us pick a random track for a song like this:

int songNum = 1;
int trackNum = rand() % lookupArray[songNum].trackCount;
return lookupArray[songNum].startAddress + lookupArray[songNum].trackAddressLength * trackNum; 

There are some interesting limitations here – for example, every track filename for a given song ID must occupy the same number of bytes, because trackAddressLength is stored once per song. Luckily that doesn’t make much of a difference for our use case, since all track names are practically the same length (we rename them to follow “01A.ADX”, “01B.ADX”, etc.).
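
For anyone who wants to poke at the idea, here's a compilable sketch of the scheme using the same example data. It isn't what runs in the game (that part is assembly). It swaps rand() for the standard library's random facilities, and it stores offsets into the string blob instead of raw pointers. The names kFilenames, kLookup, trackName and pickTrack are invented for the example.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Four 8-byte slots, NUL-padded. The last literal's implicit terminator
// supplies the final padding byte, so the array is exactly 32 bytes.
static const char kFilenames[] =
    "01.ADX\0\0"
    "01A.ADX\0"
    "21.ADX\0\0"
    "101.ADX";

struct Lookup
{
    std::size_t startOffset;      // offset of the song's first filename in kFilenames
    unsigned    trackCount;       // how many tracks this song can pick from
    unsigned    trackStringSize;  // bytes per filename slot, same for every track of the song
};

static const Lookup kLookup[] = {
    {0,  2, 8},  // song 0: "01.ADX", "01A.ADX"
    {16, 1, 8},  // song 1: "21.ADX"
    {24, 1, 8},  // song 2: "101.ADX"
};

// Deterministic part: the filename of track `trackNum` of song `songNum`.
const char* trackName(std::size_t songNum, unsigned trackNum)
{
    const Lookup& e = kLookup[songNum];
    return kFilenames + e.startOffset + e.trackStringSize * trackNum;
}

// Random part: pick uniformly among the song's tracks.
template <class Rng>
const char* pickTrack(std::size_t songNum, Rng& rng)
{
    std::uniform_int_distribution<unsigned> dist(0, kLookup[songNum].trackCount - 1);
    return trackName(songNum, dist(rng));
}
```

With a std::mt19937 as the generator, pickTrack(0, rng) returns either "01.ADX" or "01A.ADX", while songs 1 and 2 always return their single track.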

So, the patcher works by turning the original format of

struct lookupEntry
{
    unsigned int stringAddress;
    unsigned int unused;
    unsigned int unused2;
}; 

into

struct lookupEntry
{
    unsigned int firstStringAddress;
    unsigned int trackCount;
    unsigned int trackStringSize;
}; 

And creating the associated tables. Rad.
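
As a rough illustration of “creating the associated tables,” the sketch below packs each song's track names into fixed-size, word-aligned slots and fills in one lookup entry per song. It's an assumption about how such a builder could look, not the patcher's actual code. Tables, alignUp4, buildTables and the baseAddress parameter are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct LookupEntry
{
    std::uint32_t firstStringAddress;
    std::uint32_t trackCount;
    std::uint32_t trackStringSize;
};

struct Tables
{
    std::vector<char>        strings;  // packed, NUL-padded filenames
    std::vector<LookupEntry> entries;  // one entry per song ID
};

// Round n up to the next multiple of 4 (word alignment).
std::uint32_t alignUp4(std::uint32_t n) { return (n + 3u) & ~3u; }

// songs[i] lists the track filenames for song ID i; baseAddress is where
// the string blob is assumed to end up in memory.
Tables buildTables(const std::vector<std::vector<std::string>>& songs, std::uint32_t baseAddress)
{
    Tables out;
    for (const auto& tracks : songs) {
        // Slot size: the longest name plus its terminator, rounded up to a word.
        std::size_t longest = 0;
        for (const auto& t : tracks) longest = std::max(longest, t.size());
        const std::uint32_t slot = alignUp4(static_cast<std::uint32_t>(longest + 1));

        out.entries.push_back({baseAddress + static_cast<std::uint32_t>(out.strings.size()),
                               static_cast<std::uint32_t>(tracks.size()), slot});
        for (const auto& t : tracks) {
            const std::size_t start = out.strings.size();
            out.strings.insert(out.strings.end(), t.begin(), t.end());
            out.strings.resize(start + slot, '\0');  // NUL-pad up to the slot size
        }
    }
    return out;
}
```

Fed the example from earlier ({"01.ADX", "01A.ADX"}, {"21.ADX"}, {"101.ADX"}) with a base address of 0, this reproduces the offsets 0, 16 and 24 with 8-byte slots. A song with no tracks would get a trackCount of 0, which a real tool would have to reject before the random pick divides by it.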

There’s one last piece of the puzzle – writing the assembly code to intercept music playback and pick a random song…

Hacking Persona 3, Part 2

In part one, I started a journey to add more music to Persona 3, and to randomly select between new and old tracks.

I left off with a very specific problem when replacing a file: the game is reading the file’s location inside BGM.CVM from something other than the CVM file. The only place I can think of that would have the answer is the game’s executable. A little Googling reveals a program named PS2Dis, a PS2 disassembler. Let’s open the SLUS_216.21 file.

ps2dis_1

We’re clearly going to need a little background on assembly. The PS2 runs a 64-bit MIPS R5900 processor. I don’t really know what that means, but now I know what instruction set to Google. This overview got me mostly up to speed pretty fast.

Alright, now we have a small chance of understanding what we come across. After exploring the menus of PS2Dis, it seems we can open a label list by pressing Ctrl-G.

ps2dis_search

Neat! Well, we know the music files are all named “#.ADX”, so let’s try entering “01.ADX” (this is the one time you actually type the quotes).

Hmm, there’s a list of song names, but there’s nothing that looks like a file table with offsets and sizes. What if we Find (Ctrl-F) “01.ADX” instead?

ps2dis_filetable_1

Bingo. (Side note: you can change how PS2Dis displays the data at an address by pressing “C” for instruction, “B” for byte, or “W” for word. It greatly enhances readability, as it’ll automatically display ASCII values for bytes and preview the value of words containing addresses.)

Let’s see if we can find the start of this “file table.”

ps2dis_filetable_start

Aha. After some analysis (the output of cvmtool helps a lot), here’s the format:

uint32 file_count
uint32 file_count  //yes, twice
uint32 0x00000014  //magic number!
"#DirLst#"
uint32 0x00000000

for each file:
    uint32    file_size
    uint32    extAttributeLen  //still not 100% sure if this is right, but it's always zero, so...
    uint32    file_offset
    uint8      flags
    uint8      unknown  //I have no idea what this is.  It's nonzero and fluctuates, but just writing 0x00 here works fine.
    char[34] filename   //always 34 bytes, with the unused bytes set to 0x00
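
Here's a sketch of writing that layout back out, following the field order above. It assumes little-endian values (the PS2's byte order). It doesn't settle whether the embedded names carry the “;1” version suffix that cvmtool prints, so names are written exactly as given. FileEntry, putU32LE and serializeFileTable are names made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct FileEntry
{
    std::uint32_t size;
    std::uint32_t offset;  // "extent" in cvmtool's output
    std::uint8_t  flags;
    std::string   name;    // at most 34 bytes
};

// Append a 32-bit value in little-endian byte order.
void putU32LE(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

std::vector<std::uint8_t> serializeFileTable(const std::vector<FileEntry>& files)
{
    std::vector<std::uint8_t> out;
    const auto count = static_cast<std::uint32_t>(files.size());
    putU32LE(out, count);
    putU32LE(out, count);        // yes, twice
    putU32LE(out, 0x00000014);   // magic number
    const std::string tag = "#DirLst#";
    out.insert(out.end(), tag.begin(), tag.end());
    putU32LE(out, 0x00000000);

    for (const auto& f : files) {
        if (f.name.size() > 34) throw std::length_error("filename longer than 34 bytes");
        putU32LE(out, f.size);
        putU32LE(out, 0);        // extAttributeLen, always zero in practice
        putU32LE(out, f.offset);
        out.push_back(f.flags);
        out.push_back(0x00);     // unknown byte; writing 0x00 works fine
        const std::size_t start = out.size();
        out.insert(out.end(), f.name.begin(), f.name.end());
        out.resize(start + 34, 0x00);  // pad the name field to 34 bytes
    }
    return out;
}
```

That works out to a 24-byte header followed by 48 bytes per file.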

Simple enough. Let’s keep investigating – where does the game decide which music file to load? Let’s go back to that first list of song names.

ps2dis_load

Cool. If we press Space and then F3, we can use PS2Dis’s analysis tool to find code that refers to this address.

ps2dis_songname_referrer

Alright, so – wait a second, that “li” does not refer to the address we were just at. Let’s follow it (press the right arrow) and see what it’s loading…

ps2dis_tracks

Ah, it’s a list of the addresses of the filenames, evenly spaced by 12 bytes (0x0C). We can test our theory by changing 0x0010920c’s “load immediate” address to be 0x0C bytes further ahead. Try it and, indeed, just about every song in the game will change, because we’ve offset the array index by one.

We’ve now covered how to modify the BGM.CVM file to add or modify music files, and the changes that must be made to the executable to accommodate the new files. In Part 3, I’ll talk about the tool I wrote to actually change things, as well as the assembly routine for picking songs.

Hacking Persona 3, Part 1

Persona 3 is one of my favorite games of all time.  I originally watched a friend play through the game, then bought Persona 3 Portable to see it through myself.  Now, I’ve decided to play through Persona 3 FES, the fully-3D PlayStation 2 version.  I wanted to spice it up a little though, since I’d already seen a friend play through it once before.  So I created a tool that allows you to change and add music, randomly selecting between alternates for songs.  This series of articles is a (heavily edited) look at the path I took to create this tool.

If you just want to use the tool, you can download it on my Misc projects page.

So these are my qualifications:

  • Somewhat experienced C++ coder
  • Intermediate CS knowledge
  • Never touched assembly in my life (still don’t know what MIPS stands for)
  • Know nothing about the PS2 hardware

Sounds good! Time to dig in.

First, we need the game files. So we must acquire a Persona 3 FES disc image. Easy enough, since PS2 games don’t seem to have much in the way of copy protection. ImgBurn has a “Create Image from Disc” option – pop the disc in the tray, enter a path, and hit the button. Done!

Let’s see what we’ve got.

windows_iso

So, some “.CVM” files, and a… “.21” file? What? After some research, it turns out the “SLUS_*” file is the game’s executable, and the “.CVM” files are a format developed by CRI Middleware. The format’s also referred to as “CRI ROFS” (short for “Read Only File System,” I suspect). It’s basically an ISO file with a special header.

So, we’re going to need some software to inspect these files. Luckily, a handy tool by the name of DkZ Studio is like a Swiss Army knife for this type of thing.

dkz_1

With DkZ Studio, we can open up BGM.CVM…

dkz_2

It’s entirely ADX files, another file format developed by CRI Middleware! It looks like ADX files hold sound. You can open them up with DkZ Studio’s built-in ADX player or VLC to hear them.

All of the game’s music is stored inside this “BGM.CVM” file. There are also a lot of seemingly unused files in here, including little slices of some of the songs. I still have no idea where “THEME.ADX” plays. (Trivia: the Persona 4 BGM.CVM contains the entire Persona 3 soundtrack, though most of it goes unused.)

Let’s start small and try replacing one of the songs. Let’s convert a test MP3 – I chose to replace 26.ADX, the battle music, with the Persona 3 Reincarnation Mass Destruction remix – to WAV then to ADX, replace the original file, and…

dkz_warning

Err, okay, I’ll re-save BGM.CVM and reopen it then.

dkz_error

Drat. It seems DkZ Studio can’t handle creating a new CVM file from scratch, as evidenced by the completely broken BGM.CVM it generated (look at those file sizes!). Time to find a tool that can.

After many hours of research, I stumbled upon this tool created by a fellow named roxfan. It’s a command-line utility that allows you to convert a CVM to an ISO and back again! Excellent. Using that utility, we split the original CVM into an ISO and the CVM header. We pack a new ISO with a tool like ImgBurn, containing the old contents of BGM.CVM but instead of the original 26.ADX, we use our test song.

Finally, we have a new BGM.CVM! Let’s replace the original one in DkZ Studio, re-save the PS2 ISO, and…oh dear. It quits halfway through the save process! I’m not totally sure why this happens, but I have a hunch.

dkz_1

See the file size field for DATA.CVM? It’s not only incorrect, but it’s negative. The actual size of DATA.CVM is around 3 GB. I’m guessing it’s being stored as a signed 32-bit integer instead of unsigned; a signed 32-bit value tops out just above 2 GB, so the size overflows and causes a problem during saving.
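
To see why a size like that would show up negative, here's a tiny sketch of the suspected mix-up. It's only an illustration of the hunch: how DkZ Studio really stores the field is unknown, and asSigned32 is a made-up helper.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the bits of an unsigned 32-bit size as a signed 32-bit value,
// which is what effectively happens if the size is kept in a signed field.
std::int32_t asSigned32(std::uint32_t size)
{
    std::int32_t s;
    std::memcpy(&s, &size, sizeof s);  // copy the raw bits, no value conversion
    return s;
}
```

Taking 3,000,000,000 bytes as a stand-in for “around 3 GB,” asSigned32(3000000000u) comes out as -1,294,967,296. Anything up to 2,147,483,647 bytes passes through unchanged.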

After more research, let’s try a tool named UltraISO.

ultraISO

Exactly what we need! With this, we successfully export a new PS2 ISO. Let’s open it in an emulator (PCSX2), and see what happens. Or rather, hear.

We hear some music, but not all. Specifically, all the songs listed before the song we replaced in the ISO table of contents will work perfectly, the song we changed will play but be cut off, and (almost) all the songs after it won’t play anything. To top it all off, with my choice of test files, the dorm music is replaced by the school music for some reason…

Running roxfan’s cvmtool on both the new and old BGM.CVM, we find the following entries in the table of contents:

CVM_NEW

entry flags 0x00, extent 60333 (extattr 0), size 0x2A6000, name '51.ADX;1'

CVM_ORIG

entry flags 0x00, extent 60333 (extattr 0), size 0x1B0000, name '53.ADX;1'

By sheer luck, in the new BGM.CVM, file 51.ADX’s file offset (named “extent” by cvmtool) exactly matches the original BGM.CVM’s file offset for 53.ADX. 53.ADX is the dorm music, which is the music that plays when my save game loads. If I hadn’t gotten this lucky, the project probably would’ve been dead in the water, because I never would’ve noticed that the game still seems to be loading files from the original BGM.CVM’s ToC, despite the fact that it’s been completely replaced. Which means the game’s not loading files entirely from the CVM!
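
The mix-up is easier to see as a toy model: the game looks a filename up in a stale table of contents, then reads whatever the rebuilt image holds at that extent. The sketch below only encodes the two cvmtool lines above; the maps and the whatActuallyPlays function are invented for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>

// staleToc: what the game believes (original ToC), filename -> extent.
// newImage: what is really on disc now (rebuilt BGM.CVM), extent -> filename.
// Returns the file that ends up being read, or "" if nothing starts there.
std::string whatActuallyPlays(const std::map<std::string, std::uint32_t>& staleToc,
                              const std::map<std::uint32_t, std::string>& newImage,
                              const std::string& requested)
{
    const auto wanted = staleToc.find(requested);
    if (wanted == staleToc.end()) return "";           // the game doesn't know this file
    const auto found = newImage.find(wanted->second);  // read whatever sits at that extent now
    return found == newImage.end() ? "" : found->second;
}
```

With the stale table mapping '53.ADX;1' to extent 60333 and the new image holding '51.ADX;1' at extent 60333, asking for the dorm music returns '51.ADX;1'.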

In part 2, I’ll dig into the game’s executable.