
String Encryption - Generate Unique Pattern Like Spotify Codes

Yesterday I read the question Algorithm to create costum Template/Code from String. Because the question was not well formulated, it was downvoted almost instantly. However, t

Solution 1:

Update: I asked a similar question and it was answered by someone linking the patent for this barcode. To summarize, they use an intermediate look-up-table to link the barcode to the unique Spotify ID.

I have been digging into Spotify Codes some to try and understand them.

Spotify has URIs for each song, album, artist, user, playlist, etc. They look something like this:

spotify:playlist:37i9dQZF1DXcBWIGoYBM5M

If you visit Spotify Codes you can generate a code from the URI. The code for the above URI looks like this:

Image of a spotify barcode

As you noted, they encode the information in the heights of each of the bars, in the same way that the United States Postal Service does in their barcodes (see Intelligent Mail barcode).

Each bar in a Spotify Code takes one of 8 heights. The logo is the max height, and the first and last bars are always the lowest height. In the image above, the max height is 96 pixels, and the bars fall into 8 height bins: [96, 84, 74, 62, 52, 40, 28, 18].

Using this (sort of messy Python) code I can grab the octal sequence from the barcode image:

from bisect import bisect_right

from skimage import io
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

# Upper bounds on (bar height / logo height) for octal digits 0 through 7.
UPPER_BOUNDS = [0.25, 0.33, 0.46, 0.5625, 0.677, 0.8, 0.9, 1.1]

def get_sequence(filename):
    # Load the image and binarize it with Otsu's threshold.
    # This assumes the bars and logo are lighter than the background.
    image = io.imread(filename)
    image = rgb2gray(image)
    b_and_w = image > threshold_otsu(image)
    # Label connected regions; the logo and each bar become one region each.
    labeled = label(b_and_w)
    # Each bbox is (min_row, min_col, max_row, max_col).
    bar_dims = [r.bbox for r in regionprops(labeled)]
    # Sort left to right by min_col; the leftmost region is the logo.
    bar_dims.sort(key=lambda x: x[1])
    spotify_logo = bar_dims[0]
    max_height = spotify_logo[2] - spotify_logo[0]
    sequence = []
    for bar in bar_dims[1:]:
        height = bar[2] - bar[0]
        ratio = height / max_height
        # Index of the first upper bound that is greater than the ratio.
        digit = bisect_right(UPPER_BOUNDS, ratio)
        if digit > 7:
            raise ValueError('ratio is too high')
        sequence.append(digit)
    return sequence

The sequence maps like this: 37i9dQZF1DXcBWIGoYBM5M -> [0, 6, 0, 2, 4, 5, 1, 4, 5, 2, 3, 7, 3, 7, 1, 5, 6, 2, 5, 7, 4, 3, 0]
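To compare a sequence like this against other identifiers, it can help to pack it into a single number. The following is only a sketch: `sequence_to_int` is a hypothetical helper, and reading the bars left to right as base-8 digits (most significant first) is an assumption, not Spotify's documented decoding.

```python
def sequence_to_int(sequence):
    """Pack the 21 inner bars of a 23-bar sequence into one base-8 integer."""
    if len(sequence) != 23:
        raise ValueError("expected 23 bars")
    if sequence[0] != 0 or sequence[-1] != 0:
        raise ValueError("first and last bars should be 0")
    value = 0
    for digit in sequence[1:-1]:
        if not 0 <= digit <= 7:
            raise ValueError("bar values must be octal digits")
        # Assumption: the leftmost inner bar is the most significant digit.
        value = value * 8 + digit
    return value

seq = [0, 6, 0, 2, 4, 5, 1, 4, 5, 2, 3, 7, 3, 7, 1, 5, 6, 2, 5, 7, 4, 3, 0]
value = sequence_to_int(seq)
print(value, value.bit_length())
```

Whatever digit order is chosen, the result always fits in 63 bits, because 21 octal digits carry 3 bits each.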

The weird thing is that the amount of information in the URI and in the Spotify code do not match up. The ID in the URI is 22 characters long and draws from 0-9, a-z, A-Z. This means 62^22 potential URIs, or about 2.7e39. There are 23 bars in the Spotify code, but the first and last are always 0, so only 21 bars are usable. This means 8^21, or about 9.22e18, potential codes. Since there are far fewer possible codes than possible URIs, the codes cannot be a direct one-to-one encoding of the URI.
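The size mismatch is easy to check with Python's arbitrary-precision integers. This sketch only reproduces the arithmetic above; the variable names are illustrative.

```python
import math

uri_space = 62 ** 22   # 22 characters drawn from 0-9, a-z, A-Z
code_space = 8 ** 21   # 21 usable bars, 8 heights each

print(f"{uri_space:.2e}")    # roughly 2.7e39
print(f"{code_space:.2e}")   # roughly 9.22e18

uri_bits = math.log2(uri_space)    # about 131 bits
code_bits = math.log2(code_space)  # 63 bits, since 8**21 == 2**63
print(uri_bits, code_bits)
```

In bits, the URI space needs about 131 while the code carries only 63, so less than half of the information can be stored in the bars themselves.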

I do not know how they map the URIs to the codes. My guess would be that they have a separate database/lookup table that they use to map the codes to URIs. When creating a code, they hash the URI to a code and store that to look up later. When someone looks up a code, they check that database and map it to the URI. Since there are so many more potential URIs, they just don't ever get used and they don't have to worry about them.
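A lookup-table scheme of this kind could be sketched as follows. Everything here is hypothetical: `CodeRegistry`, the truncated SHA-256 hash, the salt-based collision handling, and the in-memory dictionaries standing in for a database are illustrative assumptions, not Spotify's actual implementation.

```python
import hashlib

class CodeRegistry:
    """Toy in-memory stand-in for a code <-> URI lookup table."""

    def __init__(self):
        self._code_to_uri = {}
        self._uri_to_code = {}

    @staticmethod
    def _hash_to_code(uri, salt):
        digest = hashlib.sha256(f"{salt}:{uri}".encode("utf-8")).digest()
        # Take 64 bits of the digest and drop one, leaving 63 bits (21 octal digits).
        return int.from_bytes(digest[:8], "big") >> 1

    def register(self, uri):
        """Return the code for a URI, assigning a new one if needed."""
        if uri in self._uri_to_code:
            return self._uri_to_code[uri]
        salt = 0
        code = self._hash_to_code(uri, salt)
        # Assumed collision policy: re-hash with a new salt until the code is free.
        while code in self._code_to_uri:
            salt += 1
            code = self._hash_to_code(uri, salt)
        self._code_to_uri[code] = uri
        self._uri_to_code[uri] = code
        return code

    def lookup(self, code):
        """Return the URI registered for a code, or None if unknown."""
        return self._code_to_uri.get(code)

def to_bars(code):
    """Render a 63-bit code as 23 bar values, with a 0 bar at each end."""
    digits = [(code >> (3 * i)) & 7 for i in reversed(range(21))]
    return [0] + digits + [0]

registry = CodeRegistry()
uri = "spotify:playlist:37i9dQZF1DXcBWIGoYBM5M"
code = registry.register(uri)
bars = to_bars(code)
print(bars)
print(registry.lookup(code))
```

The point of the design is that the bars never need to contain the URI: only URIs that have actually been registered consume a code, so the much smaller code space is enough in practice.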


Solution 2:

It appears that you're making the assumption that there is a direct mapping from the string "Coffee" to the graphic that's shown. That assumption is almost certainly incorrect.

First, consider what would happen if there are two different songs called "Coffee." Your proposed algorithm would assign them both the same code. That seems unreasonable. You want the code to uniquely identify the song.

Second, song names can be arbitrarily long. For example, there's a song by Pink Floyd called "Several Species of Small Furry Animals Gathered Together in a Cave and Grooving with a Pict." Your encoding algorithm probably won't be able to fit that into 24 bars. Even if it can, I can always find a longer song title.

Given the letters a-z, there are 26^5 = 11,881,376 possible 5-character strings. If you just want to uniquely encode all of them, you can do that with just 24 bits (2^23 = 8,388,608 is too small, while 2^24 = 16,777,216 is enough). Just treat the string as a base-26 number and do the conversion.
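The base-26 conversion can be sketched like this. The function names `encode_base26` and `decode_base26` are illustrative, and the sketch assumes lowercase a-z input only.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a' maps to 0, 'z' maps to 25

def encode_base26(text):
    """Read a lowercase string as a base-26 number."""
    value = 0
    for ch in text:
        # ALPHABET.index raises ValueError for anything outside a-z.
        value = value * 26 + ALPHABET.index(ch)
    return value

def decode_base26(value, length):
    """Recover a string of the given length from its base-26 value."""
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, 26)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))

largest = encode_base26("zzzzz")       # 26**5 - 1
print(largest, largest.bit_length())   # the bit length is 24
print(decode_base26(encode_base26("hello"), 5))
```

The length has to be supplied when decoding because leading 'a' characters encode as leading zeros and would otherwise be lost.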

Most likely, Spotify is assigning a unique number to each song, and then encoding that number. There is no direct mapping between the string "Coffee" and the graphical code you see on your screen.

