Spotify — Decision Trees with Music Taste

Jinkim
7 min read · Nov 26, 2020

THE SITUATION

I’m driving somewhere with my friend Jin, and we leave Spotify playing in the car. We have no clue what song is queued up next, but as each one begins to play, we comment things like “Natasha, I think you would like this song,” or “Jin, this would go well with your ___ playlist!” Understanding each other’s taste in music requires listening to a few of the other person’s playlists to get a grasp of what they like. But we go by feel, and have no systematic way of determining what song someone would like.

In this article, I’m going to show you how guesses like these can be made with decision trees, using your playlist data as input. Since a person usually saves music that they listen to a lot, we can start getting a sense of their music taste by looking through every single song they’ve saved.

TOOLS

Before we dive right in, I’ll take you through some of the tools we’ll be needing. First, we’ll be using Python and Jupyter Notebook. They’re among the most popular tools in data science today and what we use in our classes, so they were a no-brainer to go with. Additionally, Jupyter Notebook gives us the ability to quickly visualize data within the notebook, which comes in handy as we’re trying to see what’s going on.

Next, we’re using the Spotipy library (that’s Spotify and Python, Spoti-py, if you didn’t catch on!). It’s a great library that gives us functions that automatically tap into the Spotify API. In other words, it serves as a middleman between our Jupyter Notebook and Spotify itself.

And lastly, the Spotify API itself provides a startling amount of interesting data to work with. For example, not only can we pull (with authentication, which is another word for permission) every song from our saved-songs library, but Spotify also keeps track of various features of each song:

How danceable are your songs?

Putting this all together, we want to save all the songs in our saved-songs library (along with their features) into a .csv file. As we examine this list, we’ll take note of various metrics of a song, like acousticness, danceability, energy, instrumentalness, key, liveness, loudness, speechiness, tempo, and valence. In fact, these are all metrics that Spotify provides with its API calls.

To get our data, we’ll first make sure the Spotipy library is installed by running:

pip3 install spotipy

Then, log in to Spotify Developer through this link. From the Dashboard, click “Create an App.” Make the project, and in a Jupyter notebook, run the following code:
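(The original notebook isn’t embedded here, so what follows is a minimal sketch of the data pull. The redirect URI and output file name are assumptions, not values from the original project.)

import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyOAuth

cid = "YOUR_CLIENT_ID"          # replace with your app's Client ID
csecret = "YOUR_CLIENT_SECRET"  # replace with your app's Client Secret

# "user-library-read" is the permission scope for reading saved songs
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id=cid,
    client_secret=csecret,
    redirect_uri="http://localhost:8888/callback",  # assumed; register the same URI in your app settings
    scope="user-library-read"))

# Page through the entire saved-songs library, 50 tracks at a time
tracks = []
results = sp.current_user_saved_tracks(limit=50)
while results:
    tracks.extend(item["track"] for item in results["items"])
    results = sp.next(results) if results["next"] else None

# Fetch the audio features (danceability, energy, ...) in batches of 100
rows = []
for i in range(0, len(tracks), 100):
    batch = tracks[i:i + 100]
    feats = sp.audio_features([t["id"] for t in batch])
    for track, feat in zip(batch, feats):
        if feat:  # audio_features returns None for tracks it can't analyze
            rows.append({"name": track["name"],
                         "artist": track["artists"][0]["name"], **feat})

pd.DataFrame(rows).to_csv("saved_songs.csv", index=False)  # file name assumed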

Everything is ready for you to run; all you have to replace are cid and csecret in the first cell. Both values can be found by going into your project on the Dashboard, on the right under the title of your project.

After running this, you should be able to find a .csv file listing all the songs you’ve saved, along with various metrics about each one.

Now that we have a way to get our data, in the form of our .csv files, we can do something with it. Specifically, we’re going to train a model, using decision trees, to find out whether a song is more likely to belong to Jin’s Spotify library or mine (Natasha’s).

Once we’ve trained our model, we can take a random sample of our songs and see how well our model performs through a confusion matrix (explained later!).

DECISION TREES

First, let’s understand what a decision tree is with a simpler question: How do we decide whether an unknown Pokémon is a Charmander, a Squirtle, or neither?

My personal accuracy rate is 50/50 (let’s hope our actual Spotify project does better than that)

We could start by asking a couple of questions and filtering out Pokémon from there:

  1. Are you a water type? (point for Squirtle)
  2. Are you a fire type? (point for Charmander)
  3. Are you turtle-like? (point for Squirtle)
  4. Are you dinosaur-like? (point for Charmander)
  5. Etcetera.

Let’s take a look at what this might look like as a diagram.

My Pokémon knowledge is a little rusty… You might be able to spot some exceptions

By asking a series of questions, we’re able to narrow down and classify some input into one of our possible buckets. In our case, we’re going to ask a series of questions to figure out whether a song is more likely to be Jin’s or mine. Recall that we have a bunch of those features that Spotify’s internal workings keep track of for each song: “danceability”, “loudness”, “liveness”, and more. Each of those has a specific numeric value, too. So we might ask questions such as: “Is this song’s danceability above 0.85?” Given that Jin’s songs are usually more “danceable” than my own, a song from his library would go down the “yes” path and a song from mine would go down the “no” path (usually). Of course, the decision tree we make will ask a lot more questions to make sure we’re going down the right path.
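To make that concrete, here’s a toy, hand-written version of the idea. The features are real Spotify metrics, but the thresholds and structure are invented for illustration; the whole point of the project is that the real tree gets learned from data instead:

def guess_owner(song):
    # Each question is one branch in the diagram above
    if song["danceability"] > 0.85:       # threshold invented for illustration
        if song["speechiness"] > 0.33:
            return "Jin"
        return "Natasha"
    if song["energy"] > 0.60:
        return "Jin"
    return "Natasha"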

Implementation-wise, we’re saved a lot of trouble by Python’s scikit-learn (sklearn) library. We can simply import it and use it on our data, as sketched after the steps below. Under the hood, the basic idea behind forming a decision tree for a set of data involves three steps.

First, select the best attribute to split the records on, using an attribute selection method. (We won’t be diving too deeply into this, but I’d like to highlight a few methods and their core mechanisms.)

  1. Gini impurity, a measure of how often a randomly chosen element would be labelled incorrectly if it were labelled at random (computed as 1 minus the sum of the squared label proportions in the subset).
    example: if we labelled a Charmander a water type, we would be wrong 100% of the time; that makes “water type” a very useful attribute for distinguishing Charmander from Squirtle
  2. Variance, measuring the split to make sure each attribute rules out as much of the data as possible.
    example: if we check whether a Pokémon is a fire type, a “no” answer rules out all fire-type possibilities (like Charmander) from then on

Second, make that attribute a decision node (the question on which the tree diverges into yes or no) and break the data set into continually smaller subsets.

Third, repeat, building out the tree recursively downwards until there are no more remaining attributes.
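With scikit-learn, all three steps happen inside a single fit call. Here’s a minimal sketch of the training, assuming one .csv file per person from the data-pull step (the file names and the 70/30 train/test split are assumptions):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["acousticness", "danceability", "energy", "instrumentalness",
            "key", "liveness", "loudness", "speechiness", "tempo", "valence"]

# Hypothetical file names: one CSV per person's saved-songs library
natasha = pd.read_csv("natasha_songs.csv")
jin = pd.read_csv("jin_songs.csv")
natasha["owner"] = 0  # label 0 = Natasha's library
jin["owner"] = 1      # label 1 = Jin's library

data = pd.concat([natasha, jin], ignore_index=True)
X_train, X_test, y_train, y_test = train_test_split(
    data[FEATURES], data["owner"], test_size=0.3, random_state=42)

# criterion="gini" selects split attributes by Gini impurity, as described above
model = DecisionTreeClassifier(criterion="gini", random_state=42)
model.fit(X_train, y_train)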

RESULTS

Jumping back to the code, we can take a look at some of the results of training on our data. First, we can see how important each feature was in classifying a song as mine or Jin’s through feature importance:

One of us has songs that are a lot “speechier” than the other (that’s probably Jin…)
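(A ranking like the one above comes from the trained tree’s feature_importances_ attribute; continuing the training sketch:)

import pandas as pd

# How much the trained tree relied on each audio feature, highest first
importances = pd.Series(model.feature_importances_, index=FEATURES)
print(importances.sort_values(ascending=False))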

Visually, we can compare some of these features with a bar chart:

Looks like Jin’s songs are a lot more “speechy” than mine
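One way to draw that comparison, sticking to the features that live on a 0–1 scale so the bars are comparable (a sketch, continuing from the training code):

import matplotlib.pyplot as plt

# Compare the average value of each 0-1 feature across the two libraries
scaled = [f for f in FEATURES if f not in ("key", "loudness", "tempo")]
data.groupby("owner")[scaled].mean().T.plot.bar()
plt.legend(["Natasha", "Jin"])
plt.ylabel("average value")
plt.tight_layout()
plt.show()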

Now that our model is trained, we can check its accuracy by selecting some songs at random from both of our playlists, running them through the model, and seeing how well it performs.

*drum roll please*

Getting the accuracy of our model
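(In code, this check is just a prediction on the held-out test songs; continuing the sketch:)

from sklearn.metrics import accuracy_score

# Predict an owner for every held-out song and compare to the true labels
preds = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, preds))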

We hit an accuracy of 0.79! That’s not bad.

CONFUSION MATRIX

Using a confusion matrix comparing our test labels to our predictions, we can understand where the model went wrong: how many songs we predicted positive that were actually negative, predicted negative that were actually positive, predicted positive that really were positive, and predicted negative that really were negative. That was a mouthful, but here’s a diagram to understand this.

I hope you end up “truly positive” about our project!

Our confusion matrix looks like this:

I’m liking the high numbers in the bottom right!
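(scikit-learn can print this same grid directly; continuing the sketch, rows are the actual labels and columns are the predicted ones:)

from sklearn.metrics import confusion_matrix

# Rows: actual owner; columns: predicted owner
print(confusion_matrix(y_test, preds))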

As we can see, our model does pretty well at getting true positives, but it does have a fair number of false positives. This means our model commonly predicts a song to be in a playlist (positive) when it isn’t (false). That makes sense: Jin’s music taste and mine have a pretty substantial overlap, and our model finds it hard to distinguish between us.

CONCLUSION

Decision trees are only one of many ways we can use data mining to assign a song to a person’s playlist. In our heads, we build subconscious decision trees all the time, even for other tasks. Trying to predict whether someone likes something is just a matter of weighing different parameters: “What genre is it?” “Is it from before the 2000s?” “Is DiCaprio in it?” Except we use our “gut instinct” to assign each of these decisions a weight and come to a conclusion. Using machines helps us do this more objectively, and deal with way more parameters than just the three I mentioned above.

ACKNOWLEDGEMENTS

This post was written by Natasha Wong, Jinhyuk Kim, and Reymond Pedroza. We couldn’t have done it without help from our data mining professor, Mike Izbicki (https://izbicki.me/). A good chunk of our project relies on the Spotify workshop at the 2018 5C Hackathon; we’d like to thank Henry and Zihau for hosting that workshop and for the GitHub code.
