In the previous post, I talked about how the distribution system in Anniversary Crystal came to be, and how it was developed. In the meantime, yet another Pokémon has been distributed. And so, the time has come to talk about…
Part 2: The data
For every Pokémon we intended to distribute, we had to determine the exact data it would contain. The distribution sets consist of sixteen Pokémon from our previous runs and four made-up ones; these were handled differently.
When it came to Pokémon from previous runs, our goal was clear: reconstruct the originals as closely as possible. This required some data diving, going through old saves, and reading the data off hexadecimal dumps. The distributed Pokémon aren’t exact copies of the originals, but they get as close as it was reasonable to achieve. Two instances where this imperfection is visible are experience points (any distributed Pokémon only has enough experience to reach the level they are at, while the originals could have made some progress towards the next level) and damage (Pokémon are distributed fully healed, with full HP, no status conditions, and full PP for all moves).
It is instructive at this point to describe the data that is stored in the distribution system. Every Pokémon is stored as a custom encrypted 42-byte data structure, from which the game rebuilds the full data for the Pokémon when the player enters a distribution code. The first 8 bytes are related to the passwords; the remaining 34 (before encryption) are as follows:
- Species: 1 byte
- Held item: 1 byte
- Moves: 1 byte each (4 total)
- OT ID: 2 bytes (big-endian)
- DVs: 2 bytes (in the usual format)
- Level: 1 byte
- OT name: 11 bytes (terminated with $50)
- Nickname: 11 bytes (terminated with $50)
- Flags: 1 byte (usually zero, more on this later)
This structure explains the differences mentioned a few paragraphs above. In order to keep the system simpler, unnecessary data such as stats or experience points isn’t stored; this data is recalculated when the Pokémon is obtained. Stat experience is also missing, as it would amount to a lot of data that wasn’t practical to obtain; every Pokémon is distributed with zero stat experience, and thus somewhat lower stats than their original counterparts. The missing values are recalculated as follows: experience points are set to the minimum needed for the indicated level, stat experience is set to zero, stats are recalculated on the spot (as it happens when you withdraw a Pokémon from a box), move PPs are calculated as well with PP Ups set to zero, encounter data is set to indicate that the Pokémon was received in a trade and caught at an unknown time and location (as a matter of fact, it is wholly zeroed out), Pokérus data is cleared (i.e., set to the default value of “has never been infected”), and happiness is set to zero. Most of the values set to zero aren’t explicitly set — the distribution code simply zeroes out the whole party slot before regenerating the distributed Pokémon.
We intentionally kept the number of made-up sets low. While making sets may be fun, and the code can support up to 255 distributed Pokémon, the main focus of the system was distributing Pokémon from old TPP runs. That being said, we made three sets in recognition of staff members (namely Koolboyman, PikalaxALT, and our original Streamer). We only chose the OT ID and OT name for those sets (and the DVs for the shiny Slowpoke); the sets themselves were made by the respective staff members.
There is an additional made-up set, and that’s the one for Phancero, which was distributed both as a reference within Prism and through a regular code post. Since this would be the only way to obtain a Phancero in the game (other than trading from Prism, but we already knew that would be a pipe dream; it will happen some day, but not soon), and all distribution Pokémon have fixed DVs, we chose perfect DVs for it; shiny DVs were another option, but obtaining a shiny Phancero is already easy in Prism (all you need is a Shiny Ball). We also wanted the player to actually be able to own the Pokémon, as if they had caught it — and thus the “flags” field was born. The flags field in the distribution data is set to 0 for all other Pokémon, but to 1 for Phancero; setting it to 1 causes the game to set the OT name and ID to the current player’s, and to ask the player to enter a nickname. The three corresponding fields in the data are thus empty, since their values aren’t meaningful.
When it came to sets from our previous runs, the hardest part was gathering the data. In most cases, the data was available directly from savefiles; however, this was not always the case. The parties for previous runs’ trainers contained within AC itself already had the correct DVs, so that made the search effort a lot simpler, since for those Pokémon we could take the DVs from the source code we already had and the rest of the data from twitchplayspokemon.org (which fortunately shows the ID for the player character, the only “obscure” part of the data). Not everything was so easy, though — for instance, reconstructing DUX‘s OT data proved problematic. We had successfully dumped the rest of the data, but the OT name seemed to be invalid! It turns out that, in generation 1, all in-game trades have their OT stored as a single byte, $5D, which is shown as
TRAINER in game; Bulbapedia thankfully explained this.
A particularly complicated case was M4. In my last post, I posted the original data for M4. I manually recovered that data from the final Emerald save by looking at a hexadecimal dump and a description of the data structures, and writing down the values; that’s why the text file looks like messy scribbled-down notes (which is what they are). But there’s a fundamental problem with that data: the IVs are in newer-gen format, in which all IVs are independent and range from 0 to 31. Those IVs had to be converted to generation 2 format, in which there are only four DVs ranging from 0 to 15: the special attack and special defense DVs are shared, and the HP DV is calculated from the other four. Converting DV values to IV values is as simple as doubling them; the reverse conversion isn’t as lossless. M4’s original IVs were HP:11, atk:23, def:19, spe:24, spA:19, spD:26. First step, halve the values: HP:5½, atk:11½, def:9½, spe:12, spA:9½, spD:13. Now, since there’s a single special DV, average them: 11¼. Finally, rounding. Ideally, HP would have to be 5 or 6; for this, we’d need the attack DV to be even, the defense DV to be odd, and the remaining two to have different parities (odd speed gives 6, odd special gives 5). Therefore, the attack and defense DVs were rounded to 12 and 9 respectively; since the speed DV was already even and the special DV was closer to an odd value, the special DV was rounded to 11. This way we get the DVs that were finally recorded for M4 in the distribution data: 12 attack, 9 defense, 12 speed, 11 special, giving 5 HP. This would represent generation 3 IVs of 10/24/18/24/22/22, which is as close as we could get to the original 11/23/19/24/19/26. (Note that speed is given after defense because this is the order in which stats appear internally in every single game.)
And that’s all for now. Of course, feel free to ask any questions. In the next post, I’ll finally reveal how the data is encrypted and encoded, and how everything works. I might even post some code, for those who can read it. For now, I’ll leave you with a screenshot of the data file that is parsed and built into the distribution data that goes into the ROM:
(and yes, when the source code is released, you should be able to edit this file to generate your own codes)
Original Reddit thread (some links may no longer work)