A Phase-Aligned Oscilloscope for Web Audio

A music synthesizer should produce nice periodic waveforms when a note is played. We should be able to see that regularity when we visualize the sound pressure with an oscilloscope, here demonstrated with the Yamaha DX7 emulator:

Oscilloscope before phase alignment

The Problem

We can see the regularity, but hold on. The waveform is jumping around, flickering left and right. It doesn't appear fixed in one spot. That makes it awfully hard to see how the waveform evolves as we hold a note down.

The basic problem is that the visualization update is not synchronized to the wave period. The waveform is drawn by taking a snapshot of audio data – say, 1024 samples – at successive instants. The snapshots are taken in this case by a Web Audio API AnalyserNode, ideally 60 times per second. The position (phase) of the periodic wave will not appear aligned in successive snapshots (unless we happen to be playing E5, which at 659.25 Hz is a near multiple of 60 Hz). Hmm!

The Solution

We need two ingredients to do this right.

  1. When we get some audio data to draw, we need to know the exact moment in time the data corresponds to. The Web Audio API provides this in the form of AudioContext.currentTime.
  2. We need to know the frequency of the note we're interested in drawing. Let's say whatever note was pressed last.

Every time we want to draw a frame of audio data, we divide the sampleTime by the wave period and call the remainder sampleOffset. The units are in audio samples, running at 44100 samples per second.

Let's say we're drawing two successive frames of audio data. For these two frames, sampleTime might be 10000 and 10705. The note pressed down is A above middle C at 440 Hz, generating a waveform that repeats every 44100 / 440 ≈ 100.2 samples. So we get a sampleOffset of 10000 % 100.2 = 80.2 and 10705 % 100.2 = 83.8. We need to draw the first frame shifted 80.2 samples to the left, and the second frame shifted 83.8 samples to the left. And so on.

Oscilloscope after phase alignment

Ah, much better! The little wobble at the end of this animation shows a pitch vibrato.

Here are the important parts in code. When we get a new note down, we update the periodicity for the visualizer:

var noteFrequency = frequencyFromNoteNumber(synth.getLatestNoteDown());
visualizer.setPeriod(sampleRate / noteFrequency);

and then in our draw loop, subtract the sampleOffset from the x-position:

analyzerNode.getFloatTimeDomainData(data);
// Current audio time, converted from seconds to samples.
var sampleTime = sampleRate * analyzerNode.context.currentTime;
// Phase offset: how far into the current wave period this frame starts.
var sampleOffset = sampleTime % this.period;
...
for (var i = 0, l = data.length; i < l; i++) {
  // Shift every point left by the phase offset so the wave stays put.
  var x = (i - sampleOffset) * WAVE_PIXELS_PER_SAMPLE;
  var y = data[i];
  graphics.lineTo(x, y);
  ...
}
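
The helper frequencyFromNoteNumber isn't part of the excerpt above; a minimal sketch, assuming standard MIDI note numbers tuned so that note 69 is A4 = 440 Hz, could look like this:

// Hypothetical helper: convert a MIDI note number to a frequency in Hz.
// Each semitone away from A4 (MIDI note 69) scales the frequency by 2^(1/12).
function frequencyFromNoteNumber(noteNumber) {
  return 440 * Math.pow(2, (noteNumber - 69) / 12);
}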

This doesn't work so well for polyphonic synthesis, as multiple notes with different wave periods are running all at once. It works nicely if you hold a high note and then play an octave or a fifth lower. You can see the consonance (and dissonance) in the waveform as you play various intervals.

Yamaha DX7 and MIDI in JavaScript

Check the demo or find the source at https://github.com/mmontag/dx7-synth-js.

This Yamaha DX7 emulator is my attempt to do something cool with the Web Audio APIs. The synth responds to MIDI input (make sure your device is hooked up before you start your browser) including pitch bend, mod wheel, and aftertouch.  I've added the ability to pan the operators for stereo output (this applies to carrier operators only – the solid squares in the algorithm diagram).
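
For reference, here is a minimal sketch of how a browser synth can receive MIDI notes via the Web MIDI API. The synth.noteOn and synth.noteOff calls are illustrative stand-ins, not necessarily how this project wires things up:

// Request MIDI access and route note-on/note-off messages to the synth.
navigator.requestMIDIAccess().then(function (midiAccess) {
  midiAccess.inputs.forEach(function (input) {
    input.onmidimessage = function (event) {
      var status = event.data[0] & 0xf0;
      var note = event.data[1];
      var velocity = event.data[2];
      if (status === 0x90 && velocity > 0) {
        synth.noteOn(note, velocity);      // hypothetical synth interface
      } else if (status === 0x80 || (status === 0x90 && velocity === 0)) {
        synth.noteOff(note);               // note-on with zero velocity also means note off
      }
    };
  });
});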

I'll cover the basics here, and add some fun implementation details later.

  1. QWERTY keys melodic keyboard control
  2. Space bar panic (all notes off)
  3. Control hold down to increase QWERTY velocity
  4. Mouse wheel over knobs and sliders to increase/decrease value
  5. Click or touch and drag up/down on knobs and sliders to increase/decrease value
  6. Arrow up/down on knobs and sliders to increase/decrease value
  7. Tab moves between controls

There are some DEMO buttons up top to keep the noise going while you twiddle, plus a MIDI file selection at the bottom.

One tricky aspect of the DX7 is the 4-step envelope generator, so here are some tips. The higher the EG RATE slider, the shorter the envelope section lasts. The envelope starts at zero. The first rate slider (R1) tells you how fast the operator will approach the level of the first volume slider (L1), and so on for R2 L2, R3 L3, R4 L4. If you get stuck notes, hit space and increase your rates.
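
To make the rate/level behavior concrete, here is a rough sketch of a four-stage envelope in this style. It is illustrative only – not the emulator's actual envelope code – and the rate-to-increment scaling is arbitrary:

// Hypothetical 4-stage rate/level envelope. rates and levels are arrays
// [R1..R4] and [L1..L4], normalized to 0..1; a higher rate means the stage
// finishes sooner. Stages 1-3 run while the key is held (sustaining at L3),
// and stage 4 is the release toward L4.
function Envelope(rates, levels, sampleRate) {
  var level = 0;   // the envelope starts at zero
  var stage = 0;   // index into rates/levels
  this.nextSample = function (keyDown) {
    if (!keyDown) stage = 3;                       // key released: head for L4 at rate R4
    var target = levels[stage];
    var step = rates[stage] * 10 / sampleRate;     // higher rate -> faster ramp (arbitrary scaling)
    if (level < target) level = Math.min(level + step, target);
    else level = Math.max(level - step, target);
    if (level === target && stage < 2) stage++;    // advance through stages 1-3, then hold at L3
    return level;
  };
}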

Shout out to Alva Snædís for sparking my interest in FM synthesis.

Thoughts on Rdio UX

Rdio was known for its clean design and simple user interface. The attention to detail really made a difference in the product and it was one of my favorite parts of working there.

Some years ago, I collected ideas for further improving Rdio's web UX in the following document. Many of the points are obsolete and were addressed in later updates. I'm sharing it here because I believe the principles are still relevant for other web apps.


Application or Website?

Rdio would benefit from incorporating more desktop UI affordances. Rdio should treat tracks in lists as traditional list view items (single click to select, double click to play, arrow up and down to navigate), not like hyperlinks.  

What this might look like:

One counterargument is that people expect web-like behavior in a web browser. This point ignores Rdio desktop app users who expect the same Rdio app in a desktop app container to behave like a desktop app. I believe users embrace app-like behavior in the browser when it is carried out consistently.  

The split between Rdio’s marketing site and the Rdio product is intended to separate concerns and give the freedom to put more power into the app; this is a great thing, but the app still seems handicapped by adherence to web affordances. The link-based web paradigm is great for documents (Wikipedia, blogs, etc.), but it does not provide a sufficient set of controls for managing a music collection.

Fast list views that support multiple selection are important for managing playlists. It is challenging but not impossible to do native-style table views in the browser; a rough sketch of the virtual-scroll approach follows the examples below.

Examples:

  1. Dropbox web app. It is very responsive and does an instant client-side sort.
  2. The track list views within HTML Spotify apps are very fast virtual scroll tables (but without multiple-select or column sort).
  3. Grooveshark also does this well (but ugly), with multiple-select and column sort.
  4. iTunes store track lists are HTML based with sortable columns, but with hover behaviors and no selection. Probably the worst example to emulate.
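
As a minimal illustration of the virtual-scroll technique mentioned above, here is a sketch of a list that only keeps the visible rows in the DOM. The names (createVirtualList, tracks, renderRow) are illustrative and not taken from any of these apps:

// Render only the rows currently in view; a spacer div gives the scrollbar
// the full height of the list, so scrolling thousands of tracks stays fast.
var ROW_HEIGHT = 23; // pixels per track row

function createVirtualList(viewport, tracks, renderRow) {
  var spacer = document.createElement('div');
  spacer.style.position = 'relative';
  spacer.style.height = (tracks.length * ROW_HEIGHT) + 'px';
  viewport.style.overflowY = 'auto';
  viewport.appendChild(spacer);

  function update() {
    var first = Math.floor(viewport.scrollTop / ROW_HEIGHT);
    var count = Math.ceil(viewport.clientHeight / ROW_HEIGHT) + 1;
    spacer.innerHTML = ''; // naive: rebuild the visible rows on every scroll
    for (var i = first; i < Math.min(first + count, tracks.length); i++) {
      var row = renderRow(tracks[i]);
      row.style.position = 'absolute';
      row.style.top = (i * ROW_HEIGHT) + 'px';
      spacer.appendChild(row);
    }
  }

  viewport.addEventListener('scroll', update);
  update();
}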

Sortable Columns

Sortable columns are useful in any track list that includes tracks from multiple albums. Sortable columns allow the user to quickly scan a list by track title, by album title, or by artist name, in order to find what they are browsing for.  Rdio loses a couple important affordances by enabling sort through a dropdown menu instead of column headers:

  1. No ability to do a descending sort
  2. No distinction between which column is album and which is artist
  3. No persistent indication of what fields are sortable
  4. It requires two clicks to do the sort (one to open the menu, one to select an item)

Track Popularity

A popularity meter should be displayed next to every track. It is a very reliable indicator of how awesome a song is. It leads you directly to an artist's representative songs. It makes discovery through browsing/searching more powerful. It makes artist hopping 10 times more efficient. This becomes indispensable for usability. Today I overheard someone say he uses album comments to find tracks to listen to because the most popular song will be the one with the most track comments. This is a crude proxy for true popularity.

Browsing patterns must be clearly understood. Our users browse music with intent. Sometimes the intent is loosely defined; users click the album art that catches their eye. Sometimes they browse to get to a specific track they already have in mind, but can't quite remember the name of. But sometimes they browse with another very clear intent: to discover new music that suits their taste. Browsing to discover new music is the intent that should concern us most. The popularity meter, guiding the user to preview the best tracks an artist or album has to offer, is a linchpin of this behavior.

Rdio already has a “Top Tracks” feature and the ability to “sort by Most Played.” Isn't that good enough?

No, for three reasons:

  1. We still need to see popularity on unsorted track lists. Album pages, playlists, queue, etc. all lack the ability to sort by popularity. It is crucial for the user to see track popularity on album pages.
  2. Popularity is not uniformly distributed. Popularity bars let you see a “hit single” that might be 10 times more popular than the rest of the tracks. If we don't show popularity, the user sees no distinction between albums with tracks of roughly equal popularity, and an album where there is a single very popular track.  In the case of an album with equally popular tracks, the ranked list can be downright misleading – the popularity measure won't help the user find the best track (or there may not be a clear winner). The user needs to know that in order to inform her listening decisions. The solution is to display the popularity bar next to each track.
  3. Popularity-ranked lists only appear in certain places, and these lists don't have anything that visually sets them apart from unranked lists. In other words: if you sort a list by track title, you will see that the track names are ordered alphabetically. This is an important cue. If you order tracks by Most Played, there's a troubling absence of visual cues to reflect that fact.

The only potential issue with popularity is that recently released songs will be skewed toward the top. But for ranking search results – and in the general case – this is what users want. If users want to separate the trending popular songs from the historically popular songs, this is a separate problem that might be worth investigating.

Invisible Elements

The UI should not hide important elements until some hover event.  The user doesn't know where to go with the mouse until he hovers over the special place.

At the extreme, hover elements completely hide functionality. A perfect example of this is on the profile Reviews page, where the Edit and Delete links don't appear until you hover over the review, or on the Queue page, where you can't see the "Mix in related artists" checkbox until you're playing an artist station and hover over the station track list header. You also don’t see the “CLEAR” button until you hover over the track list.

Second item has mouseover, showing Edit and Delete links:

Without mouseover:

With mouseover:

‘Hidden until hover’ elements are a trick to make a UI look cleaner. The pattern is much less common in native desktop applications; its prevalence on the web probably owes a great deal to how easy the :hover CSS pseudo-class makes it.

Compared to play buttons (whether they are hidden until hover or not), the ubiquitous track selection pattern from iTunes/Spotify/Winamp is easier for users. From a raw UI mechanics standpoint, it is easier to double-click the area of an entire track row (typically 1000x24, a 24,000-pixel area) than it is to click on a much smaller play icon (22x22, a 484-pixel area).

Screen Real Estate

Track rows in Rdio are currently 38 pixels tall. In the search results layout, tracks occupy 50 pixels of vertical space. Albums occupy 100 pixels and artists occupy 70 pixels.  

I suggest that track rows should be a maximum of 23 pixels tall. The large track row style comes from thinking of Rdio as a website rather than an application. Track rows are 19 pixels tall in iTunes, 20 pixels in Spotify, and 23 pixels in Grooveshark. Dropdown menu items are 19 pixels tall in the OS X UI. Heights in this range maximize readability and information density.

Screen real estate is valuable. The 98x36 pixel space allocated for the Explicit/Preview/Unavailable badge in track rows is quite wasteful. Text should extend into that area when the badge is hidden. Music services in general must be mindful of long track titles, album titles, and artist names. This is bad:

The question is, what recourse does the user have when track titles are truncated? Can she resize the window, or does the column have a maximum width limit? Can the user resize columns? Can she hover to get a tooltip with the full title? All these affordances are baked into desktop table views that have evolved over decades, but get lost in the reduction to the web. This may be a very specific symptom of the pitfalls of designing with fake text, using idealized placeholders like "Nicki Minaj" and "Madonna."

And the broader question behind pixel allocation is the value of white space. It's a complicated question because the value of simplicity is large at the beginning of the user's relationship with an application, but asymptotically approaches a lower value as the user becomes comfortable with the application and needs to carry out more complex tasks.

Context Menus

Rdio action menus should be made available through a right-click context menu.

Like the small play button, the small Action Menu button makes for a difficult interaction. The menu only appears on mouseup, which makes it feel less responsive. Once users discover right-click, it’s much easier to perform than clicking this small icon. Most of these Action Menu actions are also available through drag-and-drop (with the exception of Download and Sync to Mobile), but providing multiple action vectors is a great thing.

Grooveshark actually makes action menus available through both right-click and a hover button:

Google Docs has a really outstanding right click menu. It supports up/down arrow keys to select items and displays keyboard shortcuts.

Reimplementing these desktop application affordances is one of the things that makes web applications difficult to build – but they are essential to a great user experience.

How to get your SFMTA Parking Citation dismissed

I got a San Francisco parking ticket a few weeks ago (street cleaning, $68) and decided to protest it because there were no signs visible from my parking location. The protest form is online at sfmta.com/protest and worth a try. My citation was dismissed. Here is what I submitted with my protest; hopefully this helps someone else out.

Montag-SFMTA-Protest-2015-09-17--1


Watermark Listening Test Results

Over the holidays, I looked at the data from the watermark listening test. My hypothesis was that the watermark added by Universal Music Group is at least as audible as 128 kbps MP3 compression artifacts. The test results support the hypothesis for certain types of music.

The most important takeaway is that the UMG watermark is an enormous confounding factor for evaluating audio quality of streaming services. The noise of the watermark in Universal content will overwhelm differences in compression quality.

The listening test asks subjects to identify the watermarked sample from each of 16 pairs. It's similar to this McGill MP3 discrimination test, which makes it useful for comparison. It should be noted that the McGill listening test was administered in a quiet, controlled listening environment with a high-end sound system, so one might expect that subjects taking my test in the wild might not perform as well.  Nonetheless, the results of the McGill test are included in the chart (white bars) for reference:

watermark-test-results

The difference among songs shows that the watermark is highly content-sensitive. At one extreme (Engulfed Cathedral), 163 out of 201 subjects identified the watermark, comparable to a 96 kbps MP3 on the McGill test.  At the opposite extreme, users were not able to detect the watermark in electronic music with strong transients (Jongebeer).

The watermark seems particularly problematic for classical and acoustic works. The four worst-performing samples are piano/orchestral music.

Enable Mac Volume Control for HDMI and DisplayPort Audio Devices

When you hook up an HDMI TV or a DisplayPort monitor with a built-in audio device, you might discover that you can't change the volume with the Mac software mixer any more. The volume buttons on your keyboard won't work; you're expected to use the TV remote or the volume control on the device.

Here's a trick to get around this limitation. Grab Soundflower and switch the output to your HDMI or DisplayPort device as shown here:

Soundflower menu

Then option-click the speaker icon in the menu bar. Switch your Mac audio output device to Soundflower:

Mac audio

Here, 34UM95 is the display whose audio device won't let me adjust the volume.

Generate Mac App Icons Photoshop Action

Creating App Icons in all the required sizes for a Mac app can get pretty tedious. Here is a Photoshop action to simplify the task:

Download Mac App Icon actions for Photoshop CS6+

The first thing it does is to paste the clipboard contents into a square document — so it's designed for workflows where you edit your artwork in Illustrator, and simply copy the shapes to the clipboard when you're ready to generate some PNGs. The Photoshop action will generate transparent PNGs at 1024, 512, 256, 128, 64, 32, and 16 pixels square.

Generating App Icons

I use bilinear downsampling from the original 1024 pixel base image to generate each icon. This results in nice crisp edges without ringing (sharpening halos).

Making Sausage: Fixing a Previous Git Commit

Let's say I'm six commits ahead of master on my work branch. I've sent out a review, and find out I need to fix something on the 3rd commit. This is my workflow for fixing up the previous git commit:

  1. Get to a clean state on the work branch. (git stash if needed)
  2. Make the necessary changes and commit. For example, git commit -a -m "Date added fix". This will be a temporary commit.
  3. git rebase -i origin/master to do an interactive rebase against the remote master branch (assuming your remote is named origin).
  4. Move the temporary commit after the commit that needs to be amended, and tag it f for fixup. (Or tag it s if you want to combine the new commit message with the old one.)

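    After reordering, the interactive rebase todo list might look something like this (the hashes and messages here are illustrative):

      pick 1a2b3c4 The commit that needs the fix
      f    9e8d7c6 Date added fix
      pick 5f6e7d8 Later commits on the branch, unchanged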

Done!

More philosophy: On Sausage Making.