What kinds of songs are popular?

visualization spotify

Danceability, tempo, speechiness, and duration: what makes a song popular?

Wanjia Guo https://wanjiag.github.io/
03-05-2021

Data Cleaning

library(tidyverse)

# `data` holds the raw Spotify track data (loaded before this step).
# Keep the columns of interest and min-max rescale each one to the 0-1 range.
data1 = data %>% 
  select(id, popularity, year, danceability, duration_ms, tempo, speechiness) %>% 
  mutate(popularity = (popularity-min(popularity))/(max(popularity)-min(popularity)),
         danceability = (danceability-min(danceability))/(max(danceability)-min(danceability)),
         duration = 
           (duration_ms-min(duration_ms))/(max(duration_ms)-min(duration_ms)),
         tempo = (tempo-min(tempo))/(max(tempo)-min(tempo)),
         speechiness = (speechiness-min(speechiness))/(max(speechiness)-min(speechiness))) %>% 
  # drop the raw duration; the rescaled `duration` column replaces it
  select(-duration_ms)
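
Each `mutate()` line above applies the same min-max formula, (x - min) / (max - min), which maps the smallest value of a column to 0 and the largest to 1. As a minimal sketch on toy numbers (the helper name `rescale01` and the example durations are made up for illustration and are not part of the original analysis):

```r
# Hypothetical helper: min-max rescale a numeric vector to the 0-1 range
rescale01 = function(x) (x - min(x)) / (max(x) - min(x))

# Toy durations in milliseconds (made-up values)
duration_ms = c(120000, 180000, 240000, 360000)

duration = rescale01(duration_ms)
duration
# [1] 0.00 0.25 0.50 1.00
```

A column that contains a single repeated value would give `NaN` here, because the denominator becomes zero.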
  

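# Bin each normalized property into 100 equal-width bins (bin index 1-100),
# then take the median popularity of the songs in each bin.
fig1_data = data1 %>% 
  select(-year) %>% 
  mutate(danceability = cut(danceability, 100, labels=FALSE),
         duration = cut(duration, 100, labels=FALSE),
         tempo = cut(tempo, 100, labels=FALSE),
         speechiness = cut(speechiness, 100, labels=FALSE)) %>% 
  # long format: one row per song and property, with the bin index in `value`
  pivot_longer(cols = danceability:duration, 
               names_to = "property") %>% 
  group_by(property, value) %>% 
  summarise(popularity_median = median(popularity), .groups = "drop") %>% 
  # fix the panel order and capitalize the panel labels
  mutate(property = factor(property,
                           levels = c("danceability", "tempo",
                                      "speechiness", "duration"),
                           labels = c("Danceability", "Tempo",
                                      "Speechiness", "Duration")))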

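To make the binning step concrete: `cut(x, 100, labels = FALSE)` splits the range of `x` into 100 equal-width intervals and returns the interval number for each song, and the median popularity is then computed within each interval. A small base-R sketch of the same idea, using 4 bins instead of 100 and made-up numbers (the vectors `feature` and `popularity` are hypothetical):

```r
# Toy data: a normalized song property and a normalized popularity score
feature    = c(0.00, 0.10, 0.20, 0.60, 0.70, 1.00)
popularity = c(0.10, 0.30, 0.20, 0.50, 0.70, 0.40)

# Split the range of `feature` into 4 equal-width bins;
# labels = FALSE returns the bin index instead of an interval label
bin = cut(feature, 4, labels = FALSE)

# Median popularity of the songs that fall into each bin
popularity_median = tapply(popularity, bin, median)
```

Bins that contain no songs (bin 2 in this toy example) simply do not appear in the result, which is one reason the binned curve can look jagged.
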
Final Visualization

# scale_fill_OkabeIto() and scale_color_OkabeIto() come from the colorblindr package
library(colorblindr)

ggplot(fig1_data, aes(x=value, y=popularity_median, group=property)) + 
  # shaded area: the binned medians themselves
  geom_ribbon(aes(ymin = 0, ymax = popularity_median, fill=property), alpha=0.3) + 
  # smoothed trend with its confidence band
  geom_smooth(aes(color=property, fill=property)) +
  labs(title = "Relationship between different song properties and popularity",
       x = "Normalized Value (0-100)", y = "Median Popularity (0-1)") + 
  scale_y_continuous(limits = c(0, 0.5)) +
  scale_fill_OkabeIto() +
  scale_color_OkabeIto(darken=-0.7) +
  # one panel per property, side by side
  facet_wrap(~property, nrow=1) +
  ggdark::dark_theme_gray() +
  theme(legend.position = "none")

I like this version best because it shows both the raw data and the trend. Separating the properties into their own panels makes it much easier to see that danceability has an almost linear relationship with popularity, whereas each of the other properties has a “sweet spot” where songs tend to be most popular.
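
One rough way to put a number on a “sweet spot” is to look up, for each property, the bin with the highest median popularity. The sketch below does this in base R on a small stand-in for `fig1_data`; the data frame `toy` and all of its values are made up, so it only illustrates the idea and says nothing about the real Spotify data:

```r
# Hypothetical stand-in for fig1_data: one row per property and bin
toy = data.frame(
  property = rep(c("Danceability", "Tempo"), each = 5),
  value = rep(1:5, times = 2),
  popularity_median = c(0.10, 0.20, 0.30, 0.40, 0.50,  # rises steadily
                        0.20, 0.35, 0.45, 0.30, 0.15)  # peaks in the middle
)

# For each property, keep the row of the bin with the highest median popularity
peaks = do.call(rbind, lapply(split(toy, toy$property), function(d) {
  d[which.max(d$popularity_median), ]
}))
peaks
```

For a steadily rising property the peak sits in the last bin, while a property with a sweet spot peaks somewhere in the middle of its range.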

Attempt 1

# All four properties drawn as overlapping filled areas in a single panel
ggplot(fig1_data, aes(x=value, y=popularity_median, group=property)) + 
  geom_ribbon(aes(ymin = 0, ymax = popularity_median, fill=property), alpha=0.6) + 
  labs(x = "Normalized Value (0-100)", y = "Median Popularity (0-1)") + 
  scale_fill_OkabeIto()

Overall, the overlapping areas in this figure make it really hard to tell the properties apart. The trend is also hard to see because the data are so noisy.

Attempt 2

library(gghighlight)

# Smoothed trends only, one panel per property
ggplot(fig1_data, aes(x=value, y=popularity_median, group=property)) + 
  geom_smooth(aes(color=property, fill=property)) +
  labs(x = "Normalized Value (0-100)", y = "Median Popularity (0-1)") + 
  scale_fill_OkabeIto() +
  scale_color_OkabeIto() +
  facet_wrap(~property) +
  # highlight each panel's own curve; the other properties stay as grey background
  gghighlight() +
  theme(legend.position = "none")

I also thought that leaving out the original data might make the figure look cleaner. This second figure focuses on the trend, the best-fit curve for each property, without showing the actual data. However, I still feel that showing how noisy the original data are is meaningful.
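
For context on what the best-fit curve is: when no `method` is given, `geom_smooth()` uses loess (local regression) for groups with fewer than 1,000 observations, which applies here because each property has at most 100 bins. The base-R sketch below fits a comparable curve with `loess()` on simulated data. The simulated curve, the noise level, and the 1.96 multiplier for the band are assumptions made for illustration, so the result only approximates what ggplot2 draws:

```r
set.seed(1)  # reproducible toy data

# Hypothetical noisy "sweet spot" curve over 100 bins
value = 1:100
popularity_median = 0.3 - 0.00008 * (value - 50)^2 + rnorm(100, sd = 0.03)

# Local regression with the default settings of loess()
fit = loess(popularity_median ~ value)
pred = predict(fit, newdata = data.frame(value = value), se = TRUE)

# Fitted trend and an approximate 95% band around it
trend = pred$fit
lower = trend - 1.96 * pred$se.fit
upper = trend + 1.96 * pred$se.fit
```

Plotting `trend` alone gives the clean look of this second attempt, while drawing the noisy `popularity_median` values underneath it is what the final version does.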

Citation

For attribution, please cite this work as

Guo (2021, March 5). Visualizing Spotify: What kinds of songs are popular? Retrieved from https://wanjiag.github.io/EDLD652_project_blog/posts/2021-03-05-what-kinds-of-songs-are-popular/

BibTeX citation

@misc{guo2021what,
  author = {Guo, Wanjia},
  title = {Visualizing Spotify: What kinds of songs are popular?},
  url = {https://wanjiag.github.io/EDLD652_project_blog/posts/2021-03-05-what-kinds-of-songs-are-popular/},
  year = {2021}
}