1. Eigensounds

Eigensounds Demonstration

The section below demonstrates eigensounds, so it uses higher audio quality and a larger sample size.
We use two principal components: the first PC is rendered as the left audio channel and the second as the right.
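The stereo-rendering idea can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the input matrix of audio windows is synthetic, and the normalization step is an assumption about how the components would be made audible.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in: rows of X are fixed-length audio windows.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1024))  # 200 windows, 1024 samples each

# Fit a 2-component PCA; each principal component is itself a
# waveform-length vector, i.e. an "eigensound".
pca = PCA(n_components=2).fit(X)
left, right = pca.components_  # PC1 -> left channel, PC2 -> right channel

# Normalize each channel to [-1, 1] and interleave into a stereo signal.
stereo = np.stack([left / np.abs(left).max(),
                   right / np.abs(right).max()], axis=1)
print(stereo.shape)  # (1024, 2)
```

The resulting `(samples, 2)` array is the shape most audio writers (e.g. `scipy.io.wavfile.write`) expect for stereo output.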

Making sure the loading process worked properly:

Eigensound of Jazz.

Eigensound of Billboard Top 30 Single Songs Dec'21.

Eigensound of Adele.

Eigensound of Liszt.

Eigensound of Mozart.

Eigensound of Chopin.

2. Models

What is the lowest quality (sample rate) we can use to still be able to classify songs by composer/genre?

Below is the process of loading the audio files for each group (see the `audiocrunch` comments in `utils.py` for more detail about the downrating, scaling, and sampling process).
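Since `audiocrunch` lives in `utils.py`, here is a hedged sketch of what such a helper might do, under the assumptions stated in its docstring; the function body, parameter names, and defaults are guesses for illustration, not the actual implementation.

```python
import numpy as np

def audiocrunch(signal, factor=8, window=1024, n_windows=50, seed=0):
    """Hypothetical sketch of the loading helper: downrate by keeping every
    `factor`-th sample, scale to [0, 1], then draw random fixed-length
    windows to use as rows of the feature matrix."""
    down = signal[::factor]                    # crude downsampling
    lo, hi = down.min(), down.max()
    scaled = (down - lo) / (hi - lo) if hi > lo else np.zeros_like(down)
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, len(scaled) - window, size=n_windows)
    return np.stack([scaled[s:s + window] for s in starts])

sig = np.sin(np.linspace(0, 400 * np.pi, 100_000))  # stand-in waveform
windows = audiocrunch(sig)
print(windows.shape)  # (50, 1024)
```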

Note: the audio files were processed separately with the Audacity software. After collecting the individual files, we trimmed their beginnings and endings and then joined them into a single `.mp3` per group.

Here we aggregate the audio sets into the feature matrix X and produce the corresponding labels array y.
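A minimal sketch of that aggregation step, assuming each group has already been loaded as a `(windows, samples)` array (the group names and shapes here are placeholders):

```python
import numpy as np

# Hypothetical per-group window arrays of shape (windows, samples)
rng = np.random.default_rng(0)
audiosets = {"jazz": rng.random((50, 1024)),
             "mozart": rng.random((50, 1024)),
             "chopin": rng.random((50, 1024))}

# Stack the windows into one feature matrix and repeat each group's
# name once per row to build the matching labels array.
X = np.concatenate(list(audiosets.values()))
y = np.concatenate([[name] * len(a) for name, a in audiosets.items()])
print(X.shape, y.shape)  # (150, 1024) (150,)
```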

The data is shuffled and split 70/30 with stratification into train and test sets:
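With scikit-learn this is a one-liner; the sketch below uses dummy data, but the split parameters match the text:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((150, 16))
y = np.repeat(["jazz", "mozart", "chopin"], 50)

# shuffle=True is the default; stratify=y keeps the class proportions
# the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
print(len(X_train), len(X_test))  # 105 45
```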

Model Selection

We perform repeated k-fold cross-validation with several classification models on our data. This step is done without PCA for comparison purposes.
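A sketch of that comparison loop, assuming a small representative set of models (the exact model list and fold settings in the notebook may differ):

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((150, 16))           # dummy features
y = np.repeat([0, 1, 2], 50)        # dummy labels

# Repeated stratified k-fold: each repeat reshuffles before splitting.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)
models = {"SVM": SVC(), "NB": GaussianNB(), "kNN": KNeighborsClassifier()}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```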

Without any preprocessing (except the [0, 1] scaling applied during loading), the SVM classifier had the lowest average training error, 12.85%.

Now we apply the same classifiers to the data after a 2-component PCA:
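One way to sketch this is with a pipeline, so scaling and PCA are refit on each cross-validation fold rather than on the full data (avoiding leakage). The data here is synthetic with deliberately separated classes; the notebook's actual accuracies will differ.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

# Synthetic data: three classes with shifted means, so PCA finds
# a discriminative low-dimensional projection.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=i, size=(50, 64)) for i in range(3)])
y = np.repeat([0, 1, 2], 50)

pipe = make_pipeline(MinMaxScaler(), PCA(n_components=2), GaussianNB())
scores = cross_val_score(pipe, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.3f}")
```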

After PCA, Naive Bayes and k-NN classified with ~97% accuracy in just under 6 ms (including the scaling and PCA processing time).

Now we prepare the train and test data sets for the NB and k-NN classifications.
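That preparation might look like the sketch below (synthetic data again): fit the scaler and PCA on the training split only, transform both splits, then fit and score the two classifiers.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=i, size=(60, 64)) for i in range(3)])
y = np.repeat([0, 1, 2], 60)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

# Fit preprocessing on the training set only, then transform both splits.
scaler = MinMaxScaler().fit(X_tr)
pca = PCA(n_components=2).fit(scaler.transform(X_tr))
Z_tr = pca.transform(scaler.transform(X_tr))
Z_te = pca.transform(scaler.transform(X_te))

results = {}
for name, clf in [("NB", GaussianNB()), ("kNN", KNeighborsClassifier())]:
    results[name] = clf.fit(Z_tr, y_tr).score(Z_te, y_te)
    print(name, round(results[name], 3))
```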


The following code investigates how the number of principal components affects the accuracy of our classification.
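A compact way to sketch that sweep is to cross-validate the same pipeline for a range of component counts (the counts chosen here, and the synthetic data, are placeholders):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=i, size=(50, 32)) for i in range(3)])
y = np.repeat([0, 1, 2], 50)

# Cross-validated accuracy as a function of the number of PCs retained.
accs = []
for n in (1, 2, 4, 8, 16):
    pipe = make_pipeline(MinMaxScaler(), PCA(n_components=n), GaussianNB())
    accs.append(cross_val_score(pipe, X, y, cv=5).mean())
    print(f"{n:2d} PCs: {accs[-1]:.3f}")
```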