We began with a simple idea: what if we could discover music that matches our emotions, moods, and tastes, rather than following trending charts? Given the vast number of streaming services available and the number of new songs released every day, finding the perfect new song is difficult. We currently rely on complex, opaque algorithms to suggest what we might like, but what if discovery could be more transparent and intuitive? Imagine using your feelings, not just keywords or categories, to discover new music.
How It Works
Simply go to our website, use the sliders to tell us which emotions you are currently feeling, and we will give you a list of songs that match them.
The Dataset
Like any good ML project, we began by collecting our dataset. We couldn't find an existing dataset that worked for us, so we set out to build one ourselves. To do this, we built a custom data collection system on our website, where users could search for songs they like and score them along the emotion schema we selected (a modified version of Plutchik's emotional model).
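As a rough illustration of how several users' scores for the same song could be combined into one label, here is a minimal sketch. It assumes Plutchik's eight primary emotions and scores in the range 0 to 1; our actual schema was a modified version, and the function and field names here are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Assumption: Plutchik's eight primary emotions. The modified schema
# used on the site may differ from this list.
EMOTIONS = ("joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation")


def aggregate_ratings(ratings):
    """Average per-emotion scores across all raters for each song.

    `ratings` is an iterable of (song_id, {emotion: score}) pairs.
    Emotions missing from a rating are treated as 0.
    """
    by_song = defaultdict(list)
    for song_id, scores in ratings:
        by_song[song_id].append([float(scores.get(e, 0.0)) for e in EMOTIONS])

    # zip(*rows) turns the list of ratings into one column per emotion.
    return {
        song_id: dict(zip(EMOTIONS, (mean(col) for col in zip(*rows))))
        for song_id, rows in by_song.items()
    }
```

Averaging is only one option; it smooths out disagreement between raters, which may or may not be what you want for songs that people feel very differently about.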
Data Collection Platform
Early on we realized that getting enough data points and varied opinions on songs would be hard with just the four of us scoring songs. So we offered our classmates help with their projects in exchange for them scoring songs for us ;). In the end we managed to get around 800 unique songs. While 800 songs is an impressive number, it would still not be enough to train a model for emotion classification. To make up for our lack of data, we first trained our model on the GTZAN dataset, which maps songs to genres, and then performed transfer learning with the trained model on our own dataset.
The Model
We used a simple, small CNN model, trained from scratch. We slid a 10-second window over each song, converted each window into its mel-frequency cepstrum, and passed it through the CNN. To get the final value for the song, we took the average over all windows.
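The windowing and averaging steps can be sketched roughly as follows. This is a simplified illustration, not our actual code: `predict_window` is a hypothetical stand-in for the feature extraction plus CNN forward pass, and the 5-second hop between windows is an assumption.

```python
def sliding_windows(num_samples, sample_rate, window_seconds=10.0, hop_seconds=5.0):
    """Yield (start, end) sample indices of fixed-length windows over a track.

    The hop size is an assumed value; a trailing partial window is dropped.
    """
    window = int(window_seconds * sample_rate)
    hop = int(hop_seconds * sample_rate)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be at least one sample long")
    for start in range(0, num_samples - window + 1, hop):
        yield start, start + window


def predict_song(samples, sample_rate, predict_window, **window_args):
    """Average the per-window predictions into one vector for the song.

    `predict_window` is a hypothetical callable that takes the samples of
    one window and returns a list of scores (one per emotion).
    """
    predictions = [
        predict_window(samples[start:end])
        for start, end in sliding_windows(len(samples), sample_rate, **window_args)
    ]
    if not predictions:
        raise ValueError("track is shorter than one window")
    count = len(predictions)
    # zip(*predictions) groups the scores for each emotion across windows.
    return [sum(column) / count for column in zip(*predictions)]
```

Averaging over windows gives every part of the song equal weight, so a quiet intro counts as much as the chorus.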
Deployment
We didn't need an ML model running alongside the frontend. To save on costs, we decided to run the model over the songs separately and store the results in a MongoDB database. The website could then just query the database to get the songs with the emotion distribution closest to the user input.
This freed us up to use the Netlify free tier to deploy the Node.js server, which serves just the frontend.
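To make the closest-match lookup concrete, here is a small in-memory sketch. It assumes Euclidean distance as the measure of "closest" and uses a plain list in place of the database; the function name and the (title, vector) layout are hypothetical.

```python
import heapq
import math


def closest_songs(user_vector, songs, k=10):
    """Return the k songs whose emotion vectors are nearest to user_vector.

    `songs` is an iterable of (title, vector) pairs, where every vector has
    the same length as `user_vector`. Euclidean distance is an assumption;
    any other distance between distributions could be swapped in.
    """
    return heapq.nsmallest(
        k, songs, key=lambda song: math.dist(user_vector, song[1])
    )
```

For example, `closest_songs([1.0, 0.0], [("a", [1.0, 0.0]), ("b", [0.0, 1.0])], k=1)` picks song "a", since its vector matches the slider values exactly. Because the predictions are precomputed, a lookup like this involves no model inference at request time.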
Continuous Data Addition
We added a page to the dataset creation system to scrape songs from various Spotify playlists, add them to the dataset, and mark them to be classified. This way we could collect songs from Billboard top songs playlists, as well as playlists that curate various genres of music, and add them to our database.
Then, around once a month, we run a script to fetch the songs, classify them, and add them to our website.
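The monthly job boils down to a loop like the one below. This is only a sketch: the `status`, `audio_path`, and `emotions` fields are hypothetical names, `classify` stands in for the model, and a list of dicts stands in for the database.

```python
def classify_pending(songs, classify):
    """Classify every song still marked as pending, in place.

    `songs` is a list of dicts with hypothetical "status" and "audio_path"
    fields; `classify` is a callable that maps an audio path to emotion
    scores. Returns the number of songs processed in this run.
    """
    processed = 0
    for song in songs:
        if song.get("status") != "pending":
            continue  # already classified in an earlier run
        song["emotions"] = classify(song["audio_path"])
        song["status"] = "classified"
        processed += 1
    return processed
```

Marking each song once it is classified means the script can be rerun safely without repeating work.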
Moving Forward
All in all, we were blown away by how good the website came out, and especially by how well the model performs. It's still up on Netlify here. The initial load might take a little while, since Netlify takes it down after a certain period of inactivity.
And please do let us know what you think about the website!