This post looks at the most popular musical artists of all time, ranking each artist by their most popular song.
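library(dplyr)      # select(), mutate(), group_by(), ...
library(tidyr)      # separate_rows()
library(ggplot2)
library(gganimate)  # magick_renderer() also needs the magick package installed

# `data` is assumed to be loaded earlier. `artists` holds a bracketed, quoted
# list in one string, so strip the brackets and quotes: one row per artist.
artist_data = data %>%
  select(popularity, artists, year, name) %>%
  mutate(artists = gsub("\\[|\\]", "", artists)) %>%
  separate_rows(artists, sep = ", ") %>%
  mutate(artists = gsub("'", "", artists)) %>%
  mutate(artists = gsub('"', '', artists))
# Round each year down to the start of its decade (e.g. 1987 becomes 1980).
artist_data_decades = artist_data %>%
  mutate(year = year - year %% 10)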
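# Draft ranking: top 10 artists per decade by mean song popularity.
fig3_data = artist_data_decades %>%
  group_by(year, artists) %>%
  summarise(mean_popularity = mean(popularity), .groups = "drop_last") %>%
  arrange(year, desc(mean_popularity)) %>%
  mutate(rank = 1:n()) %>%   # still grouped by year, so rank restarts each decade
  filter(rank <= 10)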
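# Start clean so that re-running this chunk does not append to an old result.
if (exists('top10_df')){
  remove("top10_df")
}
# Running top 10: for each decade, rank artists by their best song so far.
for (curr_year in sort(unique(artist_data_decades$year))){
  curr_df = artist_data_decades %>%
    filter(year <= curr_year) %>%
    group_by(artists) %>%
    # one row per artist, even if several songs tie for that artist's maximum
    slice_max(popularity, n = 1, with_ties = FALSE) %>%
    ungroup() %>%
    arrange(desc(popularity))
  curr_df$rank = seq_len(nrow(curr_df))
  curr_df = curr_df %>%
    filter(rank <= 10)
  # `old` flags artists who were already in the previous decade's top 10
  if (curr_year == 1920){
    curr_df$old = TRUE
  }else {
    curr_df$old = curr_df$artists %in% pre_df$artists
  }
  curr_df$year = curr_year
  if (exists('top10_df')){
    top10_df = rbind(top10_df, curr_df)
  }else {
    top10_df = curr_df
  }
  pre_df = curr_df
}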
# Drop parenthesised text from song titles; [^)]* stops at the first ")".
top10_df = top10_df %>%
  mutate(name = gsub("\\([^)]*\\)", "", name, perl=TRUE))
p3 = ggplot(top10_df) +
  aes(xmin = -10,
      xmax = popularity,
      ymin = rank - .45,
      ymax = rank + .45,
      y = rank,
      group = artists,
      fill = old) +
  # large decade label drawn behind the bars
  geom_text(aes(label = as.character(year)),
            x = 100 , y = -2, alpha=0.8,
            hjust = "right",
            size = 40, col = "grey40") +
  geom_rect(alpha = .9) +
  scale_x_continuous(
    limits = c(-50, 100),
    breaks = c(0, 20, 40, 60, 80, 100)) +
  # artist name to the left of each bar, song title inside it
  geom_text(hjust = "right",
            aes(label = artists,
                color = old),
            x = -12) +
  geom_text(aes(y = rank, label = name),
            color = "white",
            x = -8,
            hjust = "left") +
  # first colour applies to old = FALSE (newcomers), second to old = TRUE
  scale_fill_manual(values=c("#CB7BA7","#1273B0"))+
  scale_color_manual(values=c("#CB7BA7","#1273B0"))+
  scale_y_reverse() +
  labs(x = 'Popularity (0-100)',
       y = '',
       title = "The most popular artists through the decades.",
       subtitle = "<span style = 'color: #CB7BA7'>Newcomers</span> and <span style = 'color: #1273B0'>Defending champions</span>",
       caption = "Popularity is based on each artist's most popular song.") +
  theme(legend.position = "none",
        plot.subtitle = ggtext::element_markdown(size=18),
        plot.title = element_text(size=20),
        plot.caption = element_text(size=12))
animate(p3 +
          transition_states(year,
                            transition_length = 2,
                            state_length = 4) +
          enter_fade() +
          exit_fade(),
        width = 675,
        height = 400,
        nframes = 500,
        renderer = magick_renderer())
Overall, I like this version the best because 1) I finally figured out how to slow things down a bit; 2) the added colors make it easier to spot changes from one decade to the next; 3) with the new way of calculating running tops, the transitions are more meaningful and more interesting; and 4) I added the names of the popular songs for people who are interested. It's easy to see that the songs from the 1920s-1990s that people still listen to are almost exclusively Christmas songs.
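To put a rough number on the pacing, here is a back-of-the-envelope sketch. It assumes that animate() runs at gganimate's default of 10 frames per second, since none of the calls in this post pass an fps argument. The labels draft, revision, and final are informal names for the three animate() calls (250, 350, and 500 frames), not objects from the code.

```r
# Assumption: animate() is left at gganimate's default of 10 frames per second.
fps <- 10

# nframes used in the three animate() calls in this post (labels are informal).
nframes <- c(draft = 250, revision = 350, final = 500)

# Total length of each animation in seconds.
duration_sec <- nframes / fps
print(duration_sec)

# transition_states() treats state_length and transition_length as relative
# weights, so this is the approximate share of time spent paused on a decade
# in the final version (state_length = 4, transition_length = 2).
pause_share <- 4 / (4 + 2)
print(round(pause_share, 2))
```

Under that assumption the final version runs for about 50 seconds, twice as long as the 25-second draft, so most of the slowdown comes from the larger nframes rather than from the pause-to-transition ratio.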
p1 = ggplot(fig3_data) +
  aes(xmin = 18 ,
      xmax = mean_popularity) +
  aes(ymin = rank - .45,
      ymax = rank + .45,
      y = rank) +
  facet_wrap(~ year) +
  geom_rect(alpha = .7) +
  scale_x_continuous(
    limits = c(-50, 100),
    breaks = c(0, 20, 40, 60, 80, 100)) +
  geom_text(col = "gray13",
            hjust = "right",
            aes(label = artists),
            x = 10) +
  scale_y_reverse() +
  labs(x = 'Popularity (0-100)', y = '')
p2 = p1 +
  facet_null() +
  geom_text(x = 50 , y = -5,
            family = "Times",
            aes(label = as.character(year)),
            size = 25, col = "grey18", alpha=0.5) +
  aes(group = artists)
animate(p2 + transition_states(year,
                               transition_length = 2,
                               state_length = 5) +
          enter_fade() +
          exit_fade(),
        nframes = 250,
        renderer = magick_renderer())
This is an edited version of the draft. One major problem I ran into after planning this figure is that very few artists stay within the top 10 from one decade to the next. As a result, the transitions do not look as nice as in a continuous racing bar chart; they look far more sudden than I had hoped.
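One way to quantify this is to count how many artists carry over from one decade's top list to the next. The sketch below does that on a made-up example: the decade labels and artist names are placeholders, not values from the Spotify data, and `top_by_decade` is a hypothetical structure rather than an object from the code in this post.

```r
# Hypothetical top-3 lists per decade (artist names are invented placeholders).
top_by_decade <- list(
  `1920` = c("A", "B", "C"),
  `1930` = c("D", "E", "A"),
  `1940` = c("F", "G", "H")
)
decades <- names(top_by_decade)

# For every decade after the first, count the artists that were also in the
# previous decade's list.
carried_over <- vapply(
  seq_along(decades)[-1],
  function(i) length(intersect(top_by_decade[[i]], top_by_decade[[i - 1]])),
  integer(1)
)
names(carried_over) <- decades[-1]
print(carried_over)
```

With the real data, a list of this shape could probably be built with something like split(fig3_data$artists, fig3_data$year). Counts close to zero would confirm that most bars have nothing to transition from.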
I also had some trouble with green screens flashing in the animations rendered by gganimate. Luckily, I found a solution online: render with the magick_renderer() function.
p3 = ggplot(top10_df) +
  aes(xmin = 0,
      xmax = popularity,
      ymin = rank - .45,
      ymax = rank + .45,
      y = rank,
      group = artists) +
  geom_rect(alpha = .7) +
  scale_x_continuous(
    limits = c(-50, 100),
    breaks = c(0, 20, 40, 60, 80, 100)) +
  geom_text(col = "gray13",
            hjust = "right",
            aes(label = artists),
            x = -10) +
  scale_y_reverse() +
  labs(x = 'Popularity (0-100)', y = '') +
  geom_text(x = 50 , y = -5,
            aes(label = as.character(year)),
            size = 35, col = "grey18", alpha=0.5)
animate(p3 +
          transition_states(year, transition_length = 3, state_length = 4) +
          #ease_aes("sine-in-out") +
          enter_fade() +
          exit_fade(),
        nframes = 350,
        renderer = magick_renderer())
#anim_save("popular_artist.gif", animation = last_animation())
This figure differs from the previous one in a few respects:
I realized that it makes more sense to base the ranking on each artist's maximum popularity rather than the mean: if an artist has only one mega-hit, that artist deserves to be on the chart, even if their other songs are horrible.
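A tiny illustration of how the two ranking rules can disagree. The artists and popularity scores below are invented for the example, and the sketch uses base R's tapply() instead of the dplyr pipeline above so that it runs without any packages.

```r
# Hypothetical toy data: artist names and popularity values are invented.
songs <- data.frame(
  artists    = c("One-Hit Wonder", "One-Hit Wonder", "One-Hit Wonder",
                 "Steady Band", "Steady Band", "Steady Band"),
  popularity = c(95, 10, 5, 60, 62, 58)
)

# Mean popularity per artist (the draft's ranking rule).
mean_pop <- tapply(songs$popularity, songs$artists, mean)

# Maximum popularity per artist (the rule used for the running top 10).
max_pop <- tapply(songs$popularity, songs$artists, max)

print(names(which.max(mean_pop)))  # "Steady Band"
print(names(which.max(max_pop)))   # "One-Hit Wonder"
```

Ranking by the mean rewards consistency, while ranking by the maximum rewards a single standout song, which matches the chart's caption that popularity is based on each artist's most popular song.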
For attribution, please cite this work as
Guo (2021, March 6). Visualizing Spotify: The most popular artists of all time. Retrieved from https://wanjiag.github.io/EDLD652_project_blog/posts/2021-03-06-the-most-popular-artists-of-all-time/
BibTeX citation
@misc{guo2021the, author = {Guo, Wanjia}, title = {Visualizing Spotify: The most popular artists of all time.}, url = {https://wanjiag.github.io/EDLD652_project_blog/posts/2021-03-06-the-most-popular-artists-of-all-time/}, year = {2021} }