Fixing messed up VGM tracks from Shining Force

This is Elven Town, from Shining Force II for the Sega Genesis. Here's the beginning of the original track:

Excerpt from Elven Town

It sounds kind of janky.
Computer, isolate the bass line and slow it down to 65%.

Isolated bass notes, slowed down to 65% as requested

Yeah, there's definitely a problem with that high note. The Sega Genesis sound chip (YM2612) has 6 FM channels; these bass notes are played on Channels 0, 1, and 5. As we will see, this bass line has a software-controlled vibrato, meaning that the channel frequency is modulated rapidly by the game's music engine rather than by the YM2612's built-in LFO. And the music engine evidently has a small bug 🐞.

This music comes from a VGM file. The VGM format captures a stream of commands sent from the game to synthesizer chips. I ran vgm2txt from the vgmtools library to convert the VGM stream to something readable and filtered the output for Channel 1 (good) and Channel 5 (corrupt). The difference appears in the commands highlighted below:

Output of vgm2txt comparing Channel 1 vs. Channel 5

The VGM text annotation tells us that these are frequency commands. Channel 1 has slowly evolving values, and Channel 5 has some noisy values in what would otherwise be a smooth sequence. To verify, here are the "set LSB" byte values plotted:

Vibrato vs. janky vibrato

Now it's time to figure out how to repair those values. To do that, we have to understand the VGM command stream for the YM2612 a little better.

From the right-hand side above (channel 5), all the commands we are interested in have this format:

53 A6 __
53 A2 __
VGM commands

From various sources, we learn the following:

https://vgmrips.net/wiki/VGM_Specification

0x52 aa dd      YM2612 port 0, write value dd to register aa
0x53 aa dd      YM2612 port 1, write value dd to register aa
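
As a rough sketch of how these writes can be picked out of a command stream, the function below steps through commands one at a time instead of matching bytes at arbitrary offsets. The function name `collectYm2612Writes` is made up for this example, and it only knows the handful of VGM commands that typically show up in Genesis rips (an assumption); it stops at anything else.

```javascript
// Sketch: step through a VGM command stream and collect YM2612 register writes.
// ASSUMPTION: only commands commonly seen in Genesis rips are handled.
// Any other command stops the walk, since its length is unknown to this sketch.
function collectYm2612Writes(buf, start = 0) {
  const writes = [];
  let i = start;
  while (i < buf.length) {
    const cmd = buf[i];
    if (cmd === 0x52 || cmd === 0x53) {
      // 0x52/0x53 aa dd: YM2612 port 0/1, write value dd to register aa
      writes.push({ offset: i, port: cmd - 0x52, reg: buf[i + 1], value: buf[i + 2] });
      i += 3;
    } else if (cmd === 0x50) {
      i += 2; // PSG (SN76489) write, one data byte
    } else if (cmd === 0x61) {
      i += 3; // wait, 16-bit sample count
    } else if (cmd === 0x62 || cmd === 0x63) {
      i += 1; // wait one frame (60 Hz or 50 Hz)
    } else if (cmd === 0x66) {
      break; // end of sound data
    } else if (cmd === 0x67) {
      // data block: 0x67 0x66 tt ss ss ss ss, followed by ss bytes of data
      i += 7 + buf.readUInt32LE(i + 3);
    } else if (cmd >= 0x70 && cmd <= 0x8F) {
      i += 1; // 0x7n: short wait; 0x8n: DAC write from the data bank, then wait
    } else if (cmd === 0xE0) {
      i += 5; // seek in the PCM data bank (32-bit offset)
    } else {
      break; // unknown to this sketch
    }
  }
  return writes;
}

// Tiny synthetic stream: key-on, short wait, freq high, long wait, freq low, end
const demoStream = Buffer.from([
  0x52, 0x28, 0xF4, 0x7F, 0x53, 0xA6, 0x24, 0x61, 0x10, 0x00, 0x53, 0xA2, 0x3B, 0x66,
]);
console.log(collectYm2612Writes(demoStream).length); // 3
```

The `buf` argument is expected to be a Node.js `Buffer`, and `start` should point at the first command byte after the file header.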

Cool. So what are registers $A6 and $A2? How do we interpret the value bytes?

https://blog.thea.codes/genesynth-part-2-basic-communication/

Port 0 = Key on/off and Channels 0,1,2
Port 1 = Channels 3,4,5

YM2612 assignments for registers $A0 onward:

Channel  High byte   Low byte
0 / 3    $A4         $A0
1 / 4    $A5         $A1
2 / 5    $A6         $A2
YM2612 registers.
Part I = port 0, Part II = port 1
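
The port and register tables above boil down to a small lookup. Here is a minimal sketch (the helper name `describeFreqWrite` is invented for this example) that maps a VGM command byte and register to a 0-based channel number and tells whether the write carries the high or low frequency byte:

```javascript
// Sketch: map a VGM YM2612 write (command byte + register) to a channel.
// Channels are numbered 0-5, matching the table above.
function describeFreqWrite(cmd, reg) {
  if (cmd !== 0x52 && cmd !== 0x53) return null; // not a YM2612 write
  const port = cmd - 0x52; // 0x52 = port 0, 0x53 = port 1
  if (reg >= 0xA4 && reg <= 0xA6) {
    return { channel: port * 3 + (reg - 0xA4), part: 'high' };
  }
  if (reg >= 0xA0 && reg <= 0xA2) {
    return { channel: port * 3 + (reg - 0xA0), part: 'low' };
  }
  return null; // some other register
}

console.log(describeFreqWrite(0x53, 0xA6)); // { channel: 5, part: 'high' }
console.log(describeFreqWrite(0x53, 0xA2)); // { channel: 5, part: 'low' }
```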

https://plutiedev.com/ym2612-registers#reg-A0

The register format is:

YM2612 registers $A4, $A5, $A6: frequency (high)
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
0       0       BLK:2   BLK:1   BLK:0   FREQ:10 FREQ:9  FREQ:8

YM2612 registers $A0, $A1, $A2: frequency (low)
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
FREQ:7  FREQ:6  FREQ:5  FREQ:4  FREQ:3  FREQ:2  FREQ:1  FREQ:0

FREQ: frequency
BLK: block (octave)

…i.e. 2 unused bits, 3 octave bits, 11 frequency bits.

Now we are prepared to write some code that will interpret these pairs of "high" and "low" frequency commands, combine them together, and get the resulting frequency for channel 5.

const fs = require('fs');

const buf = fs.readFileSync('elven-town.vgm');
let i, j;
// Start at offset 256 to skip past the VGM header (assumed to fit in the first 256 bytes)
for (i = 256; i < buf.length; i++) {
  // High byte: 53 A6 hi
  const [b0, b1, hi] = buf.subarray(i);
  if (b0 === 0x53 && b1 === 0xA6) {
    i += 2; // i now points at the high value byte
    // High and low commands can be separated by some wait commands (0x7n)
    for (j = i; j < i + 6; j++) {
      // Low byte: 53 A2 lo
      const [b3, b4, lo] = buf.subarray(j);
      if (b3 === 0x53 && b4 === 0xA2) {
        // Combine the bytes with bitwise operations
        const baseFreq = ((hi & 7) << 8) | lo;    // 11 frequency bits
        const oct = (hi >> 3) - 5;                // block (octave), offset by 5
        const freq = baseFreq * Math.pow(2, oct); // relative units, not Hz
        console.log(freq);
        break; // stop at the first matching low-byte command
      }
    }
  }
}
High and low bytes combined to reveal frequency
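
The values logged by the script are in relative units, which is all the filtering needs. For reference, here is a hedged sketch of turning a register pair into an approximate pitch in Hz. It assumes the commonly documented YM2612 relationship between F-number, block, and master clock, and an NTSC Genesis clock of 7,670,453 Hz; the constant and function names are made up for this example.

```javascript
// Sketch: decode a high/low register pair into block, F-number, and approximate Hz.
// ASSUMPTION: NTSC Genesis YM2612 master clock; PAL machines run slightly slower.
const YM2612_CLOCK_HZ = 7670453;

function decodeFrequency(hi, lo) {
  const block = (hi >> 3) & 7;       // 3 octave bits
  const fnum = ((hi & 7) << 8) | lo; // 11 frequency bits
  // F-number = (144 * f * 2^20 / clock) / 2^(block - 1), solved for f
  const hz = (fnum * YM2612_CLOCK_HZ) / (144 * Math.pow(2, 21 - block));
  return { block, fnum, hz };
}

// Example pair: block 4, F-number 1083, which lands very close to A4 (440 Hz)
console.log(decodeFrequency(0x24, 0x3B));
```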

Now we can see that the spurious frequencies are clearly separated from the normal vibrato stream, so a simple condition is enough to filter them out. Whenever we encounter one, we just reuse the previous good value. Putting it all together:

const fs = require('fs');

const buf = fs.readFileSync('elven-town.vgm');
let lastHi = 0;
let lastLo = 0;
let lastFreq = 0;
let ctr = 0;

let i, j;
for (i = 256; i < buf.length; i++) {
  // High byte
  const [b0, b1, hi] = buf.subarray(i);
  if (b0 === 0x53 && b1 === 0xA6) {
    i += 2; // i now points at the high value byte
    // High and low commands can be separated by some wait commands (0x7n)
    for (j = i; j < i + 6; j++) {
      // Low byte
      const [b3, b4, lo] = buf.subarray(j);
      if (b3 === 0x53 && b4 === 0xA2) {
        const baseFreq = ((hi & 7) << 8) | lo;
        const oct = (hi >> 3) - 5;
        const freq = baseFreq * Math.pow(2, oct);
        // Filter out the bad frequencies: anything above 400, or above 300
        // outside pairs 2200-2800 (where values up to 400 are accepted)
        if (freq > 400 || (freq > 300 && (ctr < 2200 || ctr > 2800))) {
          buf[i] = lastHi;     // high value byte (i was advanced by 2 above)
          buf[j + 2] = lastLo; // low value byte
        } else {
          lastHi = hi;
          lastLo = lo;
          lastFreq = freq;
        }
        console.log(lastFreq);
        ctr++;
        break; // stop at the first matching low-byte command
      }
    }
  }
}

fs.writeFileSync('elven-town-fixed.vgm', buf);

The result looks like this:

Spurious commands removed

Computer, now resynthesize the bass line with the repaired vibrato command stream!!

We should be a little more curious about how these unintended frequency commands got there. There is some clear structure in the interference.

Let's take another look at the VGM command stream...

Unintentional F-Num (frequency) commands sent to Ch 5

Here is what's happening. Every time there is a Key On command for Channel 3 or Channel 4, an unintentional frequency command is sent to Channel 5 immediately before. The bytes (highlighted in yellow and green) match the frequency intended for the Key On event. The commands outlined in red are mistakes. Now we can rewrite the script to surgically remove these errors:

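const fs = require('fs');

// Path to an uncompressed VGM file (taking it from the command line is an assumption)
const filename = process.argv[2];
const buf = fs.readFileSync(filename);
let patchCount = 0;
const lookback = 30;
for (let i = 0; i < buf.length - 2; i++) {
  const [b0, b1, b2] = buf.subarray(i);
  // Key On for Channel 3 (52 28 F4) or Channel 4 (52 28 F5)
  if (b0 === 0x52 && b1 === 0x28 && (b2 === 0xF4 || b2 === 0xF5)) {
    // Walk backward looking for the stray Channel 5 frequency commands
    for (let j = 3; j < lookback; j++) {
      const idx = i - j;
      if (idx < 0) break; // don't read before the start of the buffer
      const [n0, n1] = buf.subarray(idx);
      if (n0 === 0x53 && (n1 === 0xA6 || n1 === 0xA2)) {
        patchCount++;
        // Overwrite the 3-byte command with 0x7F wait commands
        buf[idx    ] = 0x7F;
        buf[idx + 1] = 0x7F;
        buf[idx + 2] = 0x7F;
        if (n1 === 0xA6) break; // the high-byte command comes first, so we're done
      }
    }
  }
}
console.log(`Patched ${patchCount} commands`);
// Output name is an assumption; adjust as needed
fs.writeFileSync(filename + '.fixed', buf);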
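
As an aside on the `0xF4` / `0xF5` test in the script above: the value byte of a key-on write to register $28 packs a channel selector into its low bits and an operator mask into its high nibble. A minimal sketch of decoding it, assuming the usual YM2612 layout (the helper name `decodeKeyOn` is invented here):

```javascript
// Sketch: decode the value byte of a YM2612 key-on write (register $28).
// Low 3 bits select the channel (0-2 and 4-6 are valid selectors);
// the high 4 bits are the operator mask.
function decodeKeyOn(value) {
  const sel = value & 0x07;
  if (sel === 3 || sel === 7) return null; // not a valid channel selector
  const channel = sel < 4 ? sel : sel - 1; // 0,1,2,4,5,6 -> channels 0..5
  const operators = value >> 4;            // 0xF = all four operators on, 0 = key off
  return { channel, operators };
}

console.log(decodeKeyOn(0xF4)); // { channel: 3, operators: 15 }
console.log(decodeKeyOn(0xF5)); // { channel: 4, operators: 15 }
```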

I ran this script on the entire soundtrack since the error appears throughout. Listen to repaired Shining Force II on Chip Player or download the repaired VGM files here (originals included for comparison):

Bonus: this process repaired a song that was previously unplayable in Chip Player (29 - Zeon.vgz).

MEGAfm Teardown

Here is a look inside the Twisted Electrons MEGAfm. It is a nicely constructed boutique synth built around Sega Genesis sound chips.

An upper PCB containing LEDs and audio/MIDI jacks has been removed in these photos. The dual YM2612 chips are clearly visible. The microcontroller is visible in the bottom right. The other ICs on the board are TI multiplexers, an 8-segment LED controller, MIDI optocoupler, and an extra 512k EEPROM.

The microcontroller is an Atmel ATmega 1284P. It is a 20 MHz, 8-bit RISC CPU with 128k of program memory, similar to those used in the Arduino series and in Mutable Instruments Shruthi. Its role is mostly to convert MIDI controller data into YM2612 control signals. The internal EEPROM is only 4 KB, so I suspect an extra EEPROM was necessary for user presets. Just to give an idea, this is the kind of computing power we're talking about.

The YM2612 has a built-in DAC (known for its "ladder effect" at low volumes). I'm not sure whether the analog outputs of the two YM2612 chips are summed on the board or whether their signals make a round trip to be mixed in the ATmega. If I had to guess, I'd say they don't make the round trip. The volume knob is throttled to prevent zippering.

32 sliders and 14 potentiometers – FM synthesis doesn't get more hands-on than this.
Two original Yamaha YM2612 chips for a total of 12 FM voices.
The MEGAfm is controlled by an Atmel ATmega 1284P AVR chip.

Hotkeys in Chip Player JS

List of super-secret keyboard shortcuts for Chip Player JS:

Shortcut Action
Media Keys Playback control is fully supported
(Left Arrow) Seek backward 5 seconds
(Right Arrow) Seek forward 5 seconds
(Space bar) Toggle play/pause
- (Minus) Decrease speed 10%
= (Equals) Increase speed 10%
Shift+- (Underscore) Decrease speed 1%
Shift+= (Plus) Increase speed 1%

Chip Player JS

Chip Player JS began in August 2018.

Ten years ago, I stopped using Windows and switched to Mac. But I held on to a Windows XP virtual machine – for the sole purpose of playing ancient music formats in Winamp. Even while I was working at Spotify, Winamp was running in the background.

Chip Player JS was built to replace this inconvenient way of playing chiptunes and MIDI files. There are other modern alternatives, such as Audio Overload by Richard Bannister. But nothing had the charm and configurability of the old Winamp plugins.

Sequenced Music

Why go to all this trouble? What's the big deal with chiptunes and MIDI files?

Since these files consist of raw performance data – not a flattened audio signal – they allow you to explore and interact with music in ways that aren't possible with recorded audio.

These formats offer the ability to visualize note data on a piano roll, adjust playback speed, mute channels, reassign instruments, and transform performance data. I believe a music player for sequenced formats should take advantage of these capabilities and provide interactivity. This is a major design goal of Chip Player JS.

Psychology of Limitation

Chiptune formats were born of technical limitations: 4 primitive waveform channels of the Ricoh 2A03 (NES), 128 fixed instrument sounds of General MIDI, 6 FM channels of the YM2612 (Sega Genesis), etc.

Each sound chip is a musical instrument in its own right, like a guitar or a violin. Each has a unique sound character and a basis of interesting constraints that demand to be tested. Chiptune artists have developed a repertoire of performance techniques that work with the limitations of the chips, akin to stuff like artificial harmonics, whammy bar dives, and extended stretch tapping in the guitar world.

YouTuber explod2A03 has explained some of these effects for the NES in a series of cool videos. Check it out:

This post is a work in progress.

Flat-Shaded Polygon 3D Games

I've started maintaining a list of flat-shaded 3D games from the 1980s and 1990s over here: Flat-Shaded 3D Polygon Games.

NES APU Note Table

The NES RP2A03 CPU clock runs at 1.789773 MHz (NTSC). In producing audio, the wave period of the pulse ("square wave") channels is specified in timer units of 16 clock cycles. This means the period is quantized to units of 16/1,789,773 of a second. The triangle channel, usually used for bass lines, uses 32-cycle timer units, so for a given timer period, it's an octave lower than the pulse channels.

The pulse channels are incapable of producing a note lower than A1, with a timer period of 2033, because the timer period is represented by 11 bits, giving a maximum value of 2047.

This table shows the closest timer period for each note. It becomes less accurate in the higher register, especially for the triangle channel.

Empty cells are marked with a dash.

Note Name  MIDI Note  Frequency (Hz)  Piano Note  APU Index  Pulse Period  Pulse Tuning (Cents)  Triangle Period  Triangle Tuning (Cents)
A0         21         27.5            1           -          -             -                     2033             0
A#0        22         29.1352         2           -          -             -                     1919             0
B0         23         30.8677         3           -          -             -                     1811             0
C1         24         32.7031         4           0          -             -                     1709             0
C#1        25         34.6478         5           1          -             -                     1613             0
D1         26         36.708          6           2          -             -                     1523             0
D#1        27         38.8908         7           3          -             -                     1437             0
E1         28         41.2034         8           4          -             -                     1356             1
F1         29         43.6535         9           5          -             -                     1280             0
F#1        30         46.2493         10          6          -             -                     1208             0
G1         31         48.9994         11          7          -             -                     1140             1
G#1        32         51.913          12          8          -             -                     1076             1
A1         33         55              13          9          2033          0                     1016             0
A#1        34         58.2704         14          10         1919          0                     959              0
B1         35         61.7354         15          11         1811          0                     905              0
C2         36         65.4063         16          12         1709          0                     854              0
C#2        37         69.2956         17          13         1613          0                     806              0
D2         38         73.4161         18          14         1523          0                     761              0
D#2        39         77.7817         19          15         1437          0                     718              0
E2         40         82.4068         20          16         1356          1                     678              -1
F2         41         87.307          21          17         1280          0                     640              -1
F#2        42         92.4986         22          18         1208          0                     604              -1
G2         43         97.9988         23          19         1140          1                     570              -1
G#2        44         103.8261        24          20         1076          1                     538              -1
A2         45         110             25          21         1016          0                     507              2
A#2        46         116.5409        26          22         959           0                     479              0
B2         47         123.4708        27          23         905           0                     452              0
C3         48         130.8127        28          24         854           0                     427              -2
C#3        49         138.5913        29          25         806           0                     403              -2
D3         50         146.8323        30          26         761           0                     380              0
D#3        51         155.5634        31          27         718           0                     359              -2
E3         52         164.8137        32          28         678           -1                    338              2
F3         53         174.6141        33          29         640           -1                    319              2
F#3        54         184.9972        34          30         604           -1                    301              2
G3         55         195.9977        35          31         570           -1                    284              2
G#3        56         207.6523        36          32         538           -1                    268              2
A3         57         220             37          33         507           2                     253              2
A#3        58         233.0818        38          34         479           0                     239              0
B3         59         246.9416        39          35         452           0                     225              4
C4         60         261.6255        40          36         427           -2                    213              -2
C#4        61         277.1826        41          37         403           -2                    201              -2
D4         62         293.6647        42          38         380           0                     189              4
D#4        63         311.1269        43          39         359           -2                    179              -2
E4         64         329.6275        44          40         338           2                     169              -3
F4         65         349.2282        45          41         319           2                     159              2
F#4        66         369.9944        46          42         301           2                     150              2
G4         67         391.9954        47          43         284           2                     142              -4
G#4        68         415.3046        48          44         268           2                     134              -4
A4         69         440             49          45         253           2                     126              2
A#4        70         466.1637        50          46         239           0                     119              0
B4         71         493.8833        51          47         225           4                     112              4
C5         72         523.2511        52          48         213           -2                    106              -2
C#5        73         554.3652        53          49         201           -2                    100              -2
D5         74         587.3295        54          50         189           4                     94               4
D#5        75         622.2539        55          51         179           -2                    89               -2
E5         76         659.2551        56          52         169           -3                    84               -3
F5         77         698.4564        57          53         159           2                     79               2
F#5        78         739.9888        58          54         150           2                     75               -10
G5         79         783.9908        59          55         142           -4                    70               8
G#5        80         830.6093        60          56         134           -4                    66               9
A5         81         880             61          57         126           2                     63               -12
A#5        82         932.3275        62          58         119           0                     59               0
B5         83         987.7666        63          59         112           4                     56               -11
C6         84         1046.5022       64          60         106           -2                    52               14
C#6        85         1108.7305       65          61         100           -2                    49               15
D6         86         1174.659        66          62         94            4                     47               -14
D#6        87         1244.5079       67          63         89            -2                    44               -2
E6         88         1318.5102       68          64         84            -3                    41               17
F6         89         1396.9129       69          65         79            2                     39               2
F#6        90         1479.9776       70          66         75            -10                   37               -10
G6         91         1567.9817       71          67         70            8                     35               -16
G#6        92         1661.2187       72          68         66            9                     33               -17
A6         93         1760            73          69         63            -12                   31               -12
A#6        94         1864.655        74          70         59            0                     29               0
B6         95         1975.5332       75          71         56            -11                   27               19
C7         96         2093.0045       76          72         52            14                    26               -18
C#7        97         2217.461        77          73         49            15                    24               15
D7         98         2349.3181       78          74         47            -14                   23               -14
D#7        99         2489.0158       79          75         44            -2                    21               37
E7         100        2637.0204       80          76         41            17                    20               17
F7         101        2793.8258       81          77         39            2                     19               2
F#7        102        2959.9553       82          78         37            -10                   18               -10
G7         103        3135.9634       83          79         35            -16                   17               -16
G#7        104        3322.4375       84          80         33            -17                   16               -17
A7         105        3520            85          81         31            -12                   15               -12
A#7        106        3729.31         86          82         29            0                     14               0
B7         107        3951.0664       87          83         27            19                    13               19
C8         108        4186.009        88          84         26            -18                   12               47
C#8        109        4434.922        -           85         24            15                    -                -
D8         110        4698.6362       -           86         23            -14                   11               -14
D#8        111        4978.0317       -           87         21            37                    10               37
E8         112        5274.0409       -           88         20            17                    -                -
F8         113        5587.6517       -           89         19            2                     9                2
F#8        114        5919.9107       -           90         18            -10                   -                -
G8         115        6271.9269       -           91         17            -16                   8                -16

Source spreadsheet on Google Sheets: NES APU Note Table
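
For anyone who wants to spot-check the table, here is a rough sketch of the arithmetic. It assumes the usual APU relationship where a timer period of t produces one step every t + 1 timer units, which agrees with the rows checked below; the function names are made up for this example, and rounding near the top of the range may differ from the spreadsheet.

```javascript
// Sketch: closest APU timer period for a target pitch, and the resulting tuning error.
// ASSUMPTION: output frequency = CPU clock / (cyclesPerUnit * (period + 1)),
// with cyclesPerUnit = 16 for the pulse channels and 32 for the triangle channel.
const NES_CPU_HZ = 1789773; // NTSC

function closestPeriod(noteHz, cyclesPerUnit) {
  return Math.round(NES_CPU_HZ / (cyclesPerUnit * noteHz) - 1);
}

function periodToHz(period, cyclesPerUnit) {
  return NES_CPU_HZ / (cyclesPerUnit * (period + 1));
}

function centsOff(actualHz, targetHz) {
  return 1200 * Math.log2(actualHz / targetHz);
}

const pulseA4 = closestPeriod(440, 16);
console.log(pulseA4);                                              // 253
console.log(Math.round(centsOff(periodToHz(pulseA4, 16), 440)));   // 2
console.log(closestPeriod(440, 32));                               // 126 (triangle)
```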

An Update on UMG Watermarks

April 2020 Update: Most UMG tracks are now watermark-free on Spotify. 

Today, years after first complaining about Spotify sound quality issues, I'm happy to report this problem is fading out.

It appears that watermarks are gradually disappearing from the Universal catalog. The following tracks have undergone a marked improvement, first noticed in mid-June 2019:

They used to have a distinct watermark, and now they don't. Importantly, these tracks come from the back catalog. They're not merely new tracks that were delivered under the latest watermarking policy.

How to type emoji in one keystroke on a Mac

It's easy to create keyboard shortcuts on macOS without using 3rd party apps (like Karabiner) or Text Replacement (which doesn't work in Chrome).
Here's how to do it.

You'll need to edit this Key Binding dict:

$ vi ~/Library/KeyBindings/DefaultKeyBinding.dict

Here's my DefaultKeyBinding.dict as an example:

{
    /* Remap Home/End to Windows-like behavior */
    "\UF729" = "moveToBeginningOfLine:";                    /* Home */
    "\UF72B" = "moveToEndOfLine:";                          /* End */
    "$\UF729" = "moveToBeginningOfLineAndModifySelection:"; /* Shift + Home */
    "$\UF72B" = "moveToEndOfLineAndModifySelection:";       /* Shift + End */

    /* Shortcuts for some unicode symbols */
    "~y" = (insertText:, "👍");
}

The emoji shortcut is created in the last line. The tilde (~) means Option should be held down.
Now I can press Option+Y to type a thumbs up emoji. Cool!

Save the file, restart your application, and the new keybinding will take effect in that application.

Here's the full Apple developer documentation on Key Bindings.