Watermark Listening Test Results

Over the holidays, I looked at the data from the watermark listening test. My hypothesis was that the watermark added by Universal Music Group is at least as audible as 128 kbps MP3 compression artifacts. The test results support the hypothesis for certain types of music.

The most important takeaway is that the UMG watermark is an enormous confounding factor for evaluating audio quality of streaming services. The noise of the watermark in Universal content will overwhelm differences in compression quality.

The listening test asks subjects to identify the watermarked sample in each of 16 pairs. It is similar to the McGill MP3 discrimination test, which makes that test a useful point of comparison. The McGill test was administered in a quiet, controlled listening environment with a high-end sound system, so subjects taking my test in the wild might be expected to perform worse. Nonetheless, the results of the McGill test are included in the chart (white bars) for reference:


The variation among songs shows that the audibility of the watermark is highly content-sensitive. At one extreme (Engulfed Cathedral), 163 out of 201 subjects identified the watermark, comparable to a 96 kbps MP3 on the McGill test. At the opposite extreme, subjects were not able to detect the watermark in electronic music with strong transients (Jongebeer).
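To put the 163-out-of-201 figure in context, here is a minimal sketch of an exact one-sided binomial check against pure guessing. It assumes each subject's answer is independent and that a guessing subject picks the right sample half the time; the function name `chance_tail_probability` is hypothetical and only for illustration.

```python
from math import comb


def chance_tail_probability(correct: int, trials: int) -> float:
    """P(X >= correct) when each of `trials` answers is a fair coin flip."""
    # Count the outcomes with at least `correct` right answers, then divide
    # by the total number of equally likely outcomes (2 ** trials).
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


correct, subjects = 163, 201  # Engulfed Cathedral figures quoted above
rate = correct / subjects
p_value = chance_tail_probability(correct, subjects)

print(f"identified the watermark: {rate:.1%}")  # 81.1%
print(f"probability of doing at least this well by guessing: {p_value:.1e}")
```

The same function can be applied to a single listener's score out of 16 pairs, with the caveat that one person's answers across pairs are not necessarily independent.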

The watermark seems particularly problematic for classical and acoustic works. The four worst-performing samples are piano/orchestral music.

5 Responses

  1. Indeed piano/orchestral music is difficult to watermark if the perceptual model is not very precise. SDMI, 4C, MUSE, STEP2000 presented mostly classical pieces and "tones" for testing. It was observed that such music as Joshua Redman, Yellow, and classical pieces presented very difficult testing material. We were pleased with our results.

    Indeed, here is a communication from Jan 10, 1998; another, sent at the conclusion of the tests, follows:

    "We have recently completed independent listening tests and spectral analysis of the source material and watermarked material we submitted to the MUSE Embedded Signalling Tests. Our tests were conducted internally, and with Sony's Advanced Technology Labs (Tokyo), and MCA (Universal) Music Media (Hollywood). We provide this information because it concerns us in how it will affect the results of the testing.

    First, we have confirmed that the 4th track in our originally provided U-Matic 1630 tapes had serious artifacts. Unfortunately, this introduces an element of noise that is not related to our digital watermarking process.

    Second, depending on the double-blind listening tests, we believe that the methodology for listening to the embedded signal material may actually not remove a significant source of artifacts: quantization noise, inherent in transfers of the audio material to and from the U-matic 1630 tapes. As is well-known, especially with 16 bit audio, quantization noise is difficult to eliminate, even with dither, which many audiophiles "dislike", and may be especially audible in material with lots of "Strings". Because the watermarked material went through more transfers than the original source material to be compared, it is unlikely the audibility tests can successfully compare the tapes without the differences attributed to quantization artifacts. Having the original material pass through the same transfers as the watermarked material (source U-matic-hard disk-sent U-Matic) and recording to the same U-Matic 1630 might alleviate this concern.

    Third, an update on our own CODEC. We have successfully completed development of a still image application using the same approach as with our audio CODEC, which can survive the most stringent test available, StirMark (mentioned in the submissions we provided) available on the web from Fabien Petitcolas at Cambridge. This anti-watermarking application performs a slight crop and then a scale to original size as well as a slight rotation and is able to render all available still image watermarking products "ineffective". Additionally, we have dropped frames in the FFT analysis that is performed by the CODEC which have significant volume/dynamic shift changes (plainly evident with Strings and other similar material), which sacrifices a percentage of the encoding bits ( less than 25%) to reduce encoding rates to approximately 75 bits/per layer/per channel. The fact that still images are far smaller data carriers versus typical audio carriers has allowed us to better define "perceptually significant" regions in the target signals while being more exact in encoding in specific bands in a more perceptually-based model. The approach does not change any of the cryptographic security but increases the price paid to pirates attempting to erase the watermarks.

    Please understand our concerns.

    With warm regards for the New Year,

    Scott Moskowitz
    Blue Spike, Inc."

    July 7, 1999:
    The end of the MUSE Embedded Signalling Tests
    "Mr. Van Heijningen:

    While we are flattered that the MUSE Tests seem to represent significant
    data to various parties, mostly unknown to us, we are disappointed that no
    results have ever been reported to us or to a public forum. Our stated goal
    has always been to ensure that the best possible technology is adopted to
    the benefit of the entire music industry.

    We are also concerned that any replication of the results constitutes
    further confusion under both NDA and intellectual property rights. Should
    new information be desired, we are willing to consider submission for the
    benefit of the music industry.

    Again, we do not consider ourselves consultants, and plan to successfully
    market and sell our leading edge technologies to a large market audience.
    An audience that has not presented itself under the MUSE Tests.

    Please understand our concerns in light of the number of ongoing tests,
    political debates and the general lack of any hard results that were
    promised to Blue Spike, Inc., as consideration for participation in the
    MUSE Embedded Signalling Tests. We look forward to future work with you all
    and thank you for your apparent evenhanded approach to the very competitive
    debate surrounding "embedded signalling".

    We await the return of "all" of our submitted materials and any internally
    generated archives or copies that have been used, consistent with the
    original Non Disclosure Agreement and IFPI's stated "end" to the MUSE tests
    as reported to us on May 27, 1999.

    Scott Moskowitz
    For Blue Spike"

    AWP van Heijningen wrote:

    > Dear Mr. Moskowitz,
    > After some internal discussion related to the TNO quality system and the
    > reproducibility of the MUSE tests, we decided to refine the return
    > activity of the submitted MUSE materials.
    > In conformity with the Non Disclosure Agreement between TNO and the
    > respective candidates regarding the submitted materials in the framework
    > of the MUSE project, TNO is obliged to return any and all of the said
    > materials which are in possession of TNO.
    > TNO is about to return these materials.
    > However, the return of the MUSE materials would imply that TNO is not
    > able anymore to fully reproduce the MUSE Embedded Signalling tests for
    > the respective systems using the originally submitted materials.
    > For candidates who appreciate the reproducibility of the TNO MUSE tests
    > for their specific Embedded Signalling system we offer the possibility
    > of enabling TNO to archive (a copy of) the submitted MUSE materials.
    > Candidates can do this by choosing the Return Option (see below) of
    > their preference and to inform TNO accordingly.
    > If your company appreciates the said reproducibility, we ask you to
    > choose Option 1 or Option 2 of the Return Options, otherwise please
    > indicate that you choose Option 3.
    > The three Return Options are:
    > Option 1: Send TNO a written permission to make a copy of the MUSE
    > materials submitted by your company after which copying TNO will return
    > to the candidate any and all of the submitted MUSE materials that are in
    > possession of TNO;
    > Option 2: Send TNO a written permission to let keep TNO and/or IFPI the
    > submitted MUSE materials and store them in a safe place;
    > Option 3: Ask TNO to return the submitted MUSE materials and by doing so
    > accept the consequence that TNO will not be able anymore to fully
    > reproduce the MUSE tests for the returned system due to the absence of
    > originally submitted materials.
    > We kindly request you to inform TNO about your choice within 2 (two)
    > weeks. Please direct your reaction to the attention of Mr. Ad van
    > Heijningen, Division Electronics & EW.
    > In the case that TNO did not receive such reaction from you within this
    > period we will conclude that your preferred Return Option is Option 3
    > and we will consequently return any of your MUSE materials that are in
    > the possession of TNO.
    > We thank you for your kind co-operation in scope of the MUSE project and
    > we wish you success with your further activities in the music industry.
    > For your file, this message will also be sent to you via regular mail.
    > Kind regards,
    > Ad van Heijningen
    > TNO Physics and Electronics Laboratory
    > Electronics & EW division
    > Oude Waalsdorperweg 63
    > P.O. Box 96864
    > 2509 JG The Hague
    > The Netherlands
    > Phone: 070 - 374 0384
    > Fax: 070 - 374 0653
    > Email: vanHeijningen@fel.tno.nl
    > Internet: http://www.tno.nl/instit/fel/

  2. Pappabas

     May 25, 2015

    Hi, tried to take the listening test but the actual "A" and "B" song samples do not match the title or even each other! Tried it on both IE and Chrome.

  3. Pappabas: Hi, tried to take the listening test but the actual "A" and "B" song samples do not match the title or even each other! Tried it on both IE and Chrome.

    Thanks for alerting me to this! It was due to a server upgrade. I've fixed the problem.

  4. Wow, I'm glad I'm not crazy after all. I googled about this artifact and stumbled upon your analysis. The only context in which I have been able to hear it so far is classical solo piano, but it's annoying.

  5. man, thanks for putting the work into this test. tried it with my desktop's headphones, MDR-7506, without shutting up the fans in it or the server cabinet. 12/16! just saw the good news on your front page, I hope they take it out of the rest of the catalogue.
