Thanks for dropping by my page!
I'm currently a Masters of Data Science student graduating in 2022.
I really enjoy learning new tools, exploring data to help solve problems, and helping others understand ideas and data.
US College Data EDA
August 2020 – with Amanda Norton and Steven Spielman
Overview:
An exploration of US College data found at collegescorecard.ed.gov/data/.
Language used: R
Click here to check out our mardown report. :-)
Here’s a preview of our report:
Code I wrote to obtain data using the scorecard API:
#Package import
# install.packages("googlesheets4")
library(googlesheets4)
library(rscorecard)
#Import field variables
gs4_deauth()
starfish.fields <- read_sheet('https://docs.google.com/spreadsheets/d/1PL5zn6QLU9GSSD8rRreL8r7xaoClrPIrf5qnioZ3eE4/edit?usp=sharing')
head(starfish.fields)
starfish.colnames <- tolower(starfish.fields[['VARIABLE NAME']])
# Accessing Data
sc_key('lt36uO4r7wWfcijac20x6e6FforftHUitahjuh1A')
starfish.df <- sc_init() %>%
sc_filter(stabbr == 'FL') %>%
sc_select_(starfish.colnames) %>%
sc_get()
for (year in 2015:2018) {
starfishdf.temp <- sc_init() %>%
sc_filter(stabbr == 'FL') %>%
sc_select_(starfish.colnames) %>%
sc_year(year) %>%
sc_get()
starfish.df <- rbind(starfish.df, starfishdf.temp)
}
head(starfishdf)
write.csv(starfishdf,'starfishdf.csv')
For more details see our git repo.