Vivienne Prince


Thanks for dropping by my page!

I'm currently a Master of Data Science student graduating in 2022.

I really enjoy learning new tools, exploring data to help solve problems, and helping others understand ideas and data.

GitHub  •  LinkedIn

Video Game Sales Data Analysis
December 2020  –  with Amanda Norton and Ben Weisman

Overview:
As video game fans, we decided to look at the relationship between video game sales and variables such as critic and user ratings, game genre, and target audience.
We were also interested in how these variables are distributed in our dataset, which we pulled from VGChartz (a video game data tracking website).

Tools:
Tidyverse: data cleaning and reshaping
ggplot2/plotly: visualization

Language used: R/Markdown

To check out our full findings, take a look at our presentation (pdf).
Data cleaning and visualization code (GitHub).


These are some visualizations I created for our presentation:


Here’s a test plot I made during EDA, along with my code:

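library(dplyr)    # group_by(), summarise(), %>%
library(ggplot2)  # plotting
library(plotly)   # ggplotly() for interactivity

# Total and mean sales per year and region, computed in one pass
vs_sales.byregion.byyear <- vs_byregion %>%
  group_by(Year, Region) %>%
  summarise(SSales = sum(Sales),
            MSales = mean(Sales),
            .groups = "drop")

# Yearly totals, mean sales, and mean critic score across all regions
summary.data <- videogames.clean %>%
  group_by(Year) %>%
  summarise(SSales = sum(Global_Sales),
            MSales = mean(Global_Sales),
            Critic = mean(Critic_Score))

ggplotly(
  ggplot(vs_sales.byregion.byyear, aes(x = Year)) +
    theme_minimal() +
    ggtitle("Sales per region superimposed with average critic scores over time") +
    ylab("Sales (mil USD)") +
    # Bars: yearly mean critic score from summary.data (assumed from the
    # title; the 0-100 score shares the sales axis)
    geom_col(data = summary.data, aes(y = Critic), alpha = 0.3) +
    # Solid lines: total sales per region
    geom_line(aes(y = SSales, color = Region)) +
    # Dotted lines: mean sales per region, scaled by 100 to share the axis
    geom_line(aes(y = MSales * 100, color = Region), linetype = "dotted")
)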
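
The code above relies on a long-format table, vs_byregion, with one row per game and region. Here is a minimal sketch of how a table like that could be built with tidyr. The toy data and the column names NA_Sales, EU_Sales and JP_Sales are assumptions for illustration, not necessarily what our cleaned dataset looks like.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the cleaned data; the *_Sales column names are assumed
videogames.toy <- data.frame(
  Name     = c("Game A", "Game B", "Game C"),
  Year     = c(2008, 2008, 2009),
  NA_Sales = c(1.0, 0.5, 2.0),
  EU_Sales = c(0.5, 0.2, 1.0),
  JP_Sales = c(0.1, 0.0, 0.3)
)

# Wide -> long: one row per game and region
vs_byregion.toy <- videogames.toy %>%
  pivot_longer(
    cols      = ends_with("_Sales"),
    names_to  = "Region",
    values_to = "Sales"
  ) %>%
  # Strip the "_Sales" suffix so Region reads "NA", "EU", "JP"
  mutate(Region = sub("_Sales$", "", Region))

# Same per-year, per-region summary as in the plot code
sales.summary.toy <- vs_byregion.toy %>%
  group_by(Year, Region) %>%
  summarise(SSales = sum(Sales),
            MSales = mean(Sales),
            .groups = "drop")

print(sales.summary.toy)
```

Keeping the data long means a single color = Region mapping in ggplot draws one line per region without any extra reshaping.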