One Million Songs data — One_Million_Songs

This data set contains the music play counts of 1,019,318 unique users in the Echo Nest and is available at "http://millionsongdataset.com/tasteprofile/". As a basic step, it is interesting to predict the play counts from the song information collected in the Million Song Dataset (Bertin-Mahieux et al., 2011). After cleaning and feature engineering, the data contain 205,032 observations. The covariates duration, loudness, tempo, artist hotness, song hotness, and album hotness are used to model the response, the play count of each song.
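The modelling task described above can be sketched with a Poisson regression, which is a common choice for count responses. The choice of model here is an assumption, not something this page states. So that the sketch runs without the package, it uses a small simulated stand-in data frame named songs (a hypothetical name) that only borrows the documented column names; the simulated values and coefficients are made up and say nothing about the real data.

```r
# Simulated stand-in with the documented column names.
# These values are invented for illustration; this is NOT the real data.
set.seed(1)
n <- 500
songs <- data.frame(
  Duration       = runif(n, 60, 600),
  Loudness       = runif(n, -30, 0),
  Tempo          = runif(n, 60, 200),
  Artist_Hotness = runif(n),
  Song_Hotness   = runif(n),
  Album_Hotness  = runif(n)
)
# Invented data-generating process: counts depend on Song_Hotness only.
songs$Counts <- rpois(n, lambda = exp(0.5 + 1.2 * songs$Song_Hotness))

# Poisson regression (log link) of play counts on the six covariates.
fit <- glm(
  Counts ~ Duration + Loudness + Tempo +
    Artist_Hotness + Song_Hotness + Album_Hotness,
  family = poisson(link = "log"),
  data = songs
)
coef(fit)
```

With the package loaded, replacing songs by One_Million_Songs in the glm() call should fit the same model to the actual data.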

Usage

One_Million_Songs

Format

A data frame with 7 columns and 205,032 rows.

Counts: Number of times the song was played
Duration: Duration of the song
Loudness: Loudness of the song
Tempo: Tempo of the song
Artist_Hotness: Hotness of the artist, a value between 0 and 1
Song_Hotness: Hotness of the song, a value between 0 and 1
Album_Hotness: Hotness of the album, a value between 0 and 1
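
The column layout above can be checked programmatically. The following is a minimal sketch: check_songs() is a hypothetical helper, not a function of any package, and the two-row toy data frame holds made-up values used only to exercise it.

```r
# Hypothetical helper: TRUE if all documented columns are present
# and the three hotness columns lie in [0, 1].
check_songs <- function(df) {
  expected <- c("Counts", "Duration", "Loudness", "Tempo",
                "Artist_Hotness", "Song_Hotness", "Album_Hotness")
  hot <- c("Artist_Hotness", "Song_Hotness", "Album_Hotness")
  all(expected %in% names(df)) &&
    all(vapply(df[hot], function(x) all(x >= 0 & x <= 1), logical(1)))
}

# Made-up rows for illustration only.
toy <- data.frame(
  Counts         = c(3, 10),
  Duration       = c(210.5, 187.2),
  Loudness       = c(-7.1, -12.4),
  Tempo          = c(120, 98.6),
  Artist_Hotness = c(0.41, 0.77),
  Song_Hotness   = c(0.30, 0.65),
  Album_Hotness  = c(0.52, 0.12)
)
check_songs(toy)
```

The same call, check_songs(One_Million_Songs), could be run on the real data once the package is loaded.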

References

McFee B, Bertin-Mahieux T, Ellis DP, Lanckriet GR (2012). “The million song dataset challenge.” In Proceedings of the 21st International Conference on World Wide Web, 909–916.

Ai M, Yu J, Zhang H, Wang H (2021). “Optimal subsampling algorithms for big data regressions.” Statistica Sinica, 31(2), 749–772.

Examples

nrow(One_Million_Songs)
#> [1] 205032