How PUIDs Work



Music Analysis vs Fingerprinting

AmpliFIND makes two processes available through its MusicDNS service: Music Analysis and audio fingerprinting. PUIDs are a separate, third thing: they are just IDs, not fingerprints.

In summary:

Music Analysis is:

Audio Fingerprinting is:

Music Analysis in detail

Before a PUID is available for MusicBrainz or Picard to use, Music Analysis must have been performed on a track. Music Analysis uses up to 10 minutes of the track and examines a wide range of properties of the audio. This is the secret sauce that makes MusicIP tick, and it is what allows the MusicIP Mixer (aka MusicMagic Mixer or MMM) to generate playlists of similar music. It is never going to be open sourced. Music Analysis is also slow: it takes about 80% of the file's playing time.

To generate a new PUID, you must analyze a track fully. Currently the only ways to do this are the MusicIP Mixer and MusicIP's genpuid command-line utility. The result of this analysis is submitted to the MusicDNS service, and the MusicDNS server uses it to do fuzzy matching. This data is closed source, patented, and even secret (the closed source apps send the data to a closed source server, and it never sees the light of day). The only thing that becomes public is the Portable Unique IDentifier (PUID), which is a 128-bit ID for the corresponding analysis data on the MusicDNS server.

There is a 24-hour latency on the MusicIP side before new PUIDs become available. This is an artifact of the server architecture, which is optimized to do large numbers of lookups efficiently. (In practice, the latency is currently less than 12 hours.) The latency should disappear at some point in the future, but not until the MusicIP server architecture is updated.

Music Analysis cannot be integrated into Picard, because the analysis process is closed source and Picard is licensed under the GPL.

Audio Fingerprinting and PUIDs in detail

Fingerprinting is a much lighter process: it analyzes about 2 minutes of the track using the open source libofa library to calculate an audio fingerprint, and should take 2-3 seconds per regular-sized track.
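
To put the two costs side by side, here is a rough back-of-the-envelope calculation that uses only the figures quoted on this page (analysis at about 80% of playing time, fingerprinting at 2-3 seconds per track). The function names and the four-minute example track are illustrative assumptions, not part of any MusicIP tool.

```python
from typing import Tuple

# Figures quoted on this page; everything else here is illustrative.
ANALYSIS_FRACTION = 0.8            # analysis takes about 80% of playing time
FINGERPRINT_SECONDS = (2.0, 3.0)   # fingerprinting takes 2-3 seconds per track


def analysis_seconds(track_seconds: float) -> float:
    """Estimated time to run full Music Analysis on one track."""
    return ANALYSIS_FRACTION * track_seconds


def slowdown_range(track_seconds: float) -> Tuple[float, float]:
    """How many times slower analysis is than fingerprinting, as (low, high)."""
    cost = analysis_seconds(track_seconds)
    fastest, slowest = FINGERPRINT_SECONDS
    return cost / slowest, cost / fastest


track = 240.0  # a hypothetical four-minute track
print(f"analysis: about {analysis_seconds(track):.0f} s")
low, high = slowdown_range(track)
print(f"roughly {low:.0f}x to {high:.0f}x slower than fingerprinting")
```

Under these assumptions, analyzing a four-minute track takes a little over three minutes, which is why lookups rely on the fast fingerprint rather than on repeating the analysis.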

With this fingerprint data, you can only do a "lookup" on the MusicDNS web service, which returns a PUID if a sufficiently close match has already been submitted via the Music Analysis described above. If no one has performed Music Analysis on the track, no PUID will be found for it. MusicIP provides free fingerprint lookup services for official MusicBrainz projects and other open source projects.

Note that the PUID is just an arbitrary ID and has no relation to the fingerprint data (except for the association stored within the MusicDNS server). This means you cannot generate a new PUID to insert into the database from the fingerprinting process. A new PUID can only be allocated by MusicDNS as a result of the detailed Music Analysis process. For the technically minded, think of the fingerprint as a key that can be used to query for a value, the PUID.
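
The key/value relationship can be sketched in a few lines. This is a toy model only: the dictionary stands in for the closed MusicDNS server, the strings are made-up placeholders rather than real libofa fingerprints or 128-bit PUIDs, and the exact-match dictionary lookup simplifies away the fuzzy matching the real server performs.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the server's private table.
# Entries exist only for tracks that someone has already run Music Analysis on.
SERVER_TABLE: Dict[str, str] = {
    "fingerprint-of-track-a": "puid-0001",
}


def lookup_puid(fingerprint: str) -> Optional[str]:
    """Return the PUID stored for this fingerprint, or None if the
    track has not been analyzed yet. A client can only read the table;
    nothing here derives a PUID from the fingerprint itself."""
    return SERVER_TABLE.get(fingerprint)


print(lookup_puid("fingerprint-of-track-a"))  # an analyzed track
print(lookup_puid("fingerprint-of-track-b"))  # a track nobody has analyzed
```

The point of the sketch is the absence of any function that maps a fingerprint to a fresh PUID: a client can query existing entries, while new entries appear only on the server side after analysis.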

How Picard uses PUIDs

[Figure: picard-puids.png]

Assume to start with that the tracks Picard is processing do not have MB IDs previously saved in their tags (otherwise Picard would match them automatically without fingerprinting or scanning, and the following would not happen).

  1. The user selects an unmatched file and clicks the Scan button.
  2. Picard puts the unmatched file back into the 'new' folder and calculates the fingerprint of the file.
  3. Picard looks up this fingerprint on the MusicDNS server, trying to find a PUID corresponding to the file's fingerprint.
    1. If the fingerprint yields a match, Picard receives a PUID. This will only ever happen if Music Analysis (above) has already been performed on the file by someone else.
    2. If the fingerprint does not yield a match, Picard receives nothing and can do nothing more with this track until someone performs Music Analysis on the file (see above).
  4. If a PUID was received, Picard now looks up this PUID in MusicBrainz to find any matching tracks. This relies on a previous user having used Picard (or a similar MB script) to say "this MB track off this MB release matches this MusicIP PUID".
    • If it gets a match, Picard retrieves the release metadata from MB and moves the track to the right-hand pane. The process stops here.
  5. If it does not get a match, the PUID stays associated with the file in memory, but the file stays in the left-hand pane. The user must now match the track manually using another mechanism (e.g. Cluster/Lookup or a manual search on the website).
  6. Once the file is matched to a track, the user can choose to Submit the PUIDs back to MB, thus helping future users at step 4.
    • Note that PUID submission currently has some issues with Picard, and PUID submission is not viewed by the development community as one of Picard's primary functions.
  7. When the file is saved, the PUID is saved into the metadata tags of the file and can be used for future lookup or submission.
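
The decision flow in steps 3 to 6 can be sketched as two chained lookups. This is a minimal illustration, not Picard's actual code: the dictionaries are hypothetical in-memory stand-ins for the MusicDNS and MusicBrainz services, and all identifiers and function names are invented for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-ins; the real services are queried over the network.
MUSICDNS: Dict[str, str] = {        # fingerprint -> PUID (step 3)
    "fp-analyzed-and-linked": "puid-1",
    "fp-analyzed-not-linked": "puid-2",
}
MUSICBRAINZ: Dict[str, str] = {     # PUID -> MB track (step 4)
    "puid-1": "mb-track-42",
}


def scan(fingerprint: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (outcome, puid, mb_track) for one unmatched file."""
    puid = MUSICDNS.get(fingerprint)                  # step 3
    if puid is None:
        return ("no-puid", None, None)                # step 3.2: nothing more to do
    track = MUSICBRAINZ.get(puid)                     # step 4
    if track is None:
        return ("manual-match-needed", puid, None)    # step 5
    return ("matched", puid, track)                   # step 4: process stops here


def submit(puid: str, mb_track: str) -> None:
    """Step 6: record a user-confirmed link between a PUID and an MB track."""
    MUSICBRAINZ[puid] = mb_track


print(scan("fp-never-analyzed"))
print(scan("fp-analyzed-not-linked"))
print(scan("fp-analyzed-and-linked"))
```

Calling `submit("puid-2", "mb-track-7")` and then scanning the same fingerprint again would return a match, which is how a submission in step 6 helps the next user at step 4.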

PUID vs. TRM and the switch to Picard

Although this topic is highly subjective, there are common questions about why the MB development teams switched to Picard and PUIDs. Some of the relevant issues are: